Introduction
Many biological processes are carried out by large assemblies of proteins, such as the nuclear pore complex and the proteasome. Predicting the three-dimensional (3D) structures of these multi-component protein complexes can contribute towards a better understanding of their function, and can therefore provide a better picture of how biological systems work (Russell et al., 2004; Aloy & Russell, 2006). Although experimental methods such as X-ray crystallography can provide high-resolution atomic details of individual proteins, for a number of practical reasons it is much more difficult to determine the atomic structures of multi-component protein assemblies experimentally (Alber et al., 2007). Therefore, efforts to model such structures computationally are becoming increasingly valuable.
To this end, several protein-protein docking algorithms have been described which aim to predict the 3D structures of complexes formed from pairs of proteins (Ritchie, 2008; Vakser & Kundrotas, 2008; Moreira et al., 2010). Other studies (Inbar et al., 2003, 2005; Ben-Zeev et al., 2005; Lise et al., 2006; Wollacott et al., 2007; Cheng et al., 2008) have considered reconstructing small protein domains from their secondary structure elements (i.e., structurally stable protein fragments such as α-helices and β-sheets). For example, Lise et al. (2006) model interactions between protein domains that are part of the same protein chain. Their simulated annealing algorithm uses a contact map representation to predict the residues that interact across the domain-domain interface. Cheng et al. (2008) approach the same problem by applying pair-wise rigid-body docking of domains. Multi-body multi-stage docking protocols, in which assembly is viewed as a series of binary interactions, have also been used successfully (Ben-Zeev et al., 2005) to model conformational changes in some of the protein structures that were made available in the CAPRI blind docking experiment (Janin, 2005). Pair-wise docking approaches have also been used to model multi-domain protein assemblies (Inbar et al., 2003, 2005) by considering the assembly problem in a graph-theoretic context.
Some docking algorithms have been proposed specifically to model multi-component assemblies (Inbar et al., 2005; Pierce et al., 2005; Karaca et al., 2010; Mashiach-Farkash et al., 2011). However, most of these approaches are limited to assembling symmetric structures of two or more protein components. In these special cases, the assembly task may be guided using point-group symmetry constraints. To our knowledge, only the COMBDOCK algorithm (Inbar et al., 2005) can predict the structures of non-symmetrical multi-component protein complexes. The COMBDOCK approach treats the assembly task as a combinatorial graph-theoretic search problem, and it uses greedy selection strategies to prune the search space.
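To make the idea of a greedily pruned graph search concrete, the following toy sketch may be helpful. It is not the COMBDOCK implementation: it assumes each component is a graph vertex, each candidate pairwise docking solution is a weighted edge, and an assembly is a spanning tree over the components. The function name `greedy_assemble`, the `pair_scores` dictionary (higher is better), and the `beam_width` parameter are all hypothetical, and the steric clash checks and multiple candidate transformations per pair that a real method would need are omitted.

```python
def greedy_assemble(n_components, pair_scores, beam_width=2):
    """Toy greedy (beam) search for a high-scoring spanning tree.

    n_components: number of protein components, labelled 0..n-1.
    pair_scores:  hypothetical dict {(i, j): score} with i < j, one
                  score per candidate pairwise docking; higher is better.
    beam_width:   number of partial assemblies kept after each step.
    Returns (total_score, placed_components, edges) for the best tree.
    """
    # A partial assembly is (total score, placed components, chosen edges).
    beam = [(0.0, frozenset([0]), ())]

    for _ in range(n_components - 1):
        candidates = {}
        for total, placed, edges in beam:
            for (i, j), score in pair_scores.items():
                # An edge extends the tree only if exactly one endpoint
                # has already been placed in the partial assembly.
                if (i in placed) != (j in placed):
                    new_edges = tuple(sorted(edges + ((i, j),)))
                    # Keying on the edge set removes duplicate trees
                    # reached from different parents.
                    candidates[new_edges] = (
                        total + score,
                        placed | {i, j},
                        new_edges,
                    )
        if not candidates:
            raise ValueError("components cannot be joined into one assembly")
        # Greedy pruning: keep only the best few partial assemblies.
        ranked = sorted(candidates.values(), key=lambda c: c[0], reverse=True)
        beam = ranked[:beam_width]

    return beam[0]
```

Called with four components and one score per pair, the function returns the score, the set of placed components, and the edges of the best tree found. Because only `beam_width` partial assemblies survive each step, the search cost grows modestly with the number of components, at the price of possibly missing the globally best assembly.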