Non-repetitive Structures In Proteins : Effects Of Side-chain And Solvent Interactions With The Backbone

by Narayanan, Eswar

Abstract (Summary)
The work presented in this thesis deals with the analysis of protein crystal structures with an emphasis on the stereochemical aspects of the folded conformation of proteins. The various analyses described have been performed on a data-set of 250 high resolution and non-homologous protein structures derived from the Protein Data Bank. The overall objective of the work has been to analyse conformational features of the non-secondary structural regions in proteins and identify structural motifs present therein. The results can be useful in the three-dimensional modelling of proteins, altering the stability of proteins, design of peptide mimics and in understanding the structural rules that guide protein folding.

The contents of this thesis can be broadly classified into three parts: (a) conformational preferences of amino acid residues to occur in the partially allowed regions of the Ramachandran map, (b) conformational features of structural motifs formed by side-chain/main-chain hydrogen bonds by polar residues and (c) analysis and characteristic features of isolated β-strands.

Chapter 1 of the thesis gives an introduction, briefly discussing the conformation of polypeptide chains, structural features of globular proteins and applications of protein structural analysis.

Chapter 2 describes the occurrence of left-handed α-helical conformation in protein structures. A data-set of 250 high resolution (< 2.0 Å) non-homologous protein crystal structures derived from the Protein Data Bank (PDB) has been analysed for occurrences of left-handed α-helical (αL) conformations. A total of 2,573 αL residues were identified from the data-set. About 59% of the observed examples of αL conformations were found to be glycyl residues and about 41% non-glycyl. Continuous long stretches of αL residues are seldom found in protein structures. They are most commonly found as singlets, which represent 78% of the observed αL examples. The doublets, triplets and quadruplets account for a very minor fraction of the observed examples. There is only a single example of a stretch of four contiguous αL residues, from the protein thermolysin, which forms a single turn of a left-handed α-helix.
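The singlet/doublet/triplet counts described above amount to a run-length tally over residues flagged as αL. The following is a minimal sketch of such a tally, not the procedure used in the thesis. The rectangular αL bounds (`PHI_RANGE`, `PSI_RANGE`) and the function names are assumptions chosen for illustration, since the abstract does not state the region limits.

```python
from collections import Counter
from itertools import groupby

# Hypothetical bounds (degrees) for the left-handed alpha-helical region.
# These limits are an illustrative assumption, not the thesis's definition.
PHI_RANGE = (20.0, 125.0)
PSI_RANGE = (-45.0, 90.0)


def is_alpha_l(phi, psi):
    """Return True if (phi, psi) lies inside the assumed alpha-L box."""
    return (PHI_RANGE[0] <= phi <= PHI_RANGE[1]
            and PSI_RANGE[0] <= psi <= PSI_RANGE[1])


def alpha_l_run_lengths(angles):
    """Tally contiguous alpha-L stretches by length for one chain.

    angles: list of (phi, psi) tuples in sequence order.
    Returns a Counter mapping run length -> number of runs of that length.
    """
    flags = [is_alpha_l(phi, psi) for phi, psi in angles]
    runs = Counter()
    for flag, group in groupby(flags):
        if flag:
            runs[len(list(group))] += 1
    return runs


# Made-up torsion angles for a six-residue chain (not real data).
chain = [(-60, -45), (60, 40), (-120, 130), (55, 45), (65, 35), (-65, -40)]
runs = alpha_l_run_lengths(chain)
print(dict(runs))
```

In this toy chain the second residue is a singlet and the fourth and fifth form a doublet. In practice the torsion angles would be computed from PDB coordinates, chain by chain, so that runs do not span chain breaks.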

A majority of the αL residues are nevertheless part of well-defined substructures in proteins. They play singular roles as part of β-turns and helix termination sites in maintaining the characteristic main-chain hydrogen bonds needed for the stability of these structures. They are also found to be effective in the termination of β-strands. The stereo-chemistry and sequence environment around such structures are discussed. The analysis of the side-chain torsion angles of αL residues indicates that the g+ rotamer is highly unfavourable due to stereo-chemical clashes between the atoms of the side-chain and those of the backbone. The αL residues are highly conserved by residue type as well as conformation among related proteins, indicating their vital importance in protein structures.

Chapter 3 provides an explanation for the unusual preference of glycyl residues to occur in the bridge regions of the Ramachandran map. The Ramachandran steric map and energy diagrams for the glycyl residue are fully symmetric. Though a plot of the (φ,ψ) angles of glycyl residues derived from a data-set of 250 non-homologous and high-resolution protein structures is also largely symmetric, there is a clear aberration in the symmetry. While there is a cluster of points corresponding to the right-handed α-helical region, the "equivalent" cluster is shifted to centre around the (φ,ψ) values of (90°, 0°) instead of being centred at the left-handed α-helical region of (60°, 40°).

An analysis of glycyl conformations in small peptide structures and in "coil" proteins, which are largely devoid of helical and sheet regions, shows that glycyl residues prefer to adopt conformations around (±90°, 0°) instead of the right- and left-handed α-helical regions. Using theoretical calculations, such conformations are shown to have the highest solvent accessibility in a system of two linked peptide units with a glycyl residue at the central Cα atom. This is found to be consistent with the observations from 250 non-homologous protein structures, where glycyl residues with conformations close to (±90°, 0°) are seen to have high solvent accessibility. Analysis of a sub-set of non-homologous structures with very high resolution (1.5 Å or better) shows that water molecules are indeed present at distances suitable for hydrogen bond interaction with glycyl residues possessing conformations close to (±90°, 0°). It is concluded that water molecules play a key role in determining and stabilising these conformations of glycyl residues, which explains the aberration in the symmetry of glycyl conformations in proteins.
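One way to examine the shift described above is to assign each glycyl (φ,ψ) point to its nearest reference centre, using a distance that respects the 360° periodicity of torsion angles. The sketch below is illustrative only. The centres (±90°, 0°) and (60°, 40°) are taken from the text, while the right-handed centre (−60°, −40°) is an assumption obtained by inverting the left-handed one; the labels and function names are hypothetical.

```python
def angle_diff(a, b):
    """Smallest signed difference a - b in degrees, wrapped to [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def torsion_distance(p, q):
    """Euclidean distance between two (phi, psi) points on the periodic map."""
    d_phi = angle_diff(p[0], q[0])
    d_psi = angle_diff(p[1], q[1])
    return (d_phi ** 2 + d_psi ** 2) ** 0.5


# Reference centres in degrees. alpha_R is assumed by inversion symmetry
# of the (60, 40) alpha_L centre quoted in the text.
CENTRES = {
    "bridge_left": (90.0, 0.0),
    "bridge_right": (-90.0, 0.0),
    "alpha_L": (60.0, 40.0),
    "alpha_R": (-60.0, -40.0),
}


def nearest_centre(phi, psi):
    """Return the label of the reference centre closest to (phi, psi)."""
    return min(CENTRES,
               key=lambda name: torsion_distance((phi, psi), CENTRES[name]))


def invert(phi, psi):
    """Map a glycyl conformation to its symmetry-related point (-phi, -psi)."""
    return (-phi, -psi)


print(nearest_centre(85.0, 5.0))
print(nearest_centre(*invert(85.0, 5.0)))
```

Counting the labels over all glycyl residues in a data-set would show whether the positive-φ population clusters nearer (90°, 0°) or (60°, 40°).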

Chapter 4 discusses an analysis of backbone mimicry performed by polar side-chains in protein structures. Backbone mimicry by the formation of closed loop C7, C10 and C13 conformations (mimics of γ-, β- and α-turns) through side-chain main-chain hydrogen bonds by polar groups is found to be a frequent observation in protein structures. A data-set of 250 non-homologous and high-resolution protein structures was used to analyse these conformations for their characteristic features. Seven out of the nine polar residues (Ser, Thr, Asn, Asp, Gln, Glu and His) have hydrogen bonding groups in their side-chains which can participate in such mimicry, and as many as 15% of all these polar residues engage in such conformations. The distributions of dihedral angles of these mimics indicate that only certain combinations of the involved dihedral angles aid the formation of these mimics. The observed examples have been categorised into various classes based on these combinations, resulting in well-defined motifs. Asn and Asp residues show a very high capability to perform such backbone secondary structural mimicry. The most highly mimicked backbone structure is the C10 conformation, mimicked by the Asx residues. The mimics formed by His, Ser, Thr and Glx residues are also discussed. The role of such conformations in initiating the formation of regular secondary structures during the course of protein folding seems significant.

Chapter 5 presents a description of deterministic features of side-chain main-chain hydrogen bonds as observed in protein structures. A total of 19,835 polar residues from the data set of 250 non-homologous and highly resolved protein crystal structures were used to identify side-chain main-chain (SC-MC) hydrogen bonds. The ratio of the total number of polar residues to the number of SC-MC hydrogen bonds is close to 2:1, indicating the ubiquitous nature of such hydrogen bonds. Close to 56% of the SC-MC hydrogen bonds are local, involving a side-chain acceptor/donor (residue ‘i’) and a main-chain donor/acceptor within the window i-5 to i+5. These short-range hydrogen bonds form well defined conformational motifs characterised by specific combinations of backbone and side-chain torsion angles.
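The local/non-local distinction above reduces to a test on the sequence separation between the residue contributing the side-chain atom and the residue contributing the main-chain atom. The following is a minimal sketch under stated assumptions: hydrogen bonds are supplied as hypothetical (side-chain residue index, main-chain residue index) pairs within a single chain, and the function names are illustrative rather than taken from the thesis.

```python
def is_local(side_chain_index, main_chain_index, window=5):
    """True if the main-chain partner lies within i-window .. i+window.

    A separation of zero (a bond within one residue) counts as local.
    """
    return abs(main_chain_index - side_chain_index) <= window


def local_fraction(hbonds, window=5):
    """Fraction of SC-MC hydrogen bonds that are local.

    hbonds: list of (side_chain_index, main_chain_index) pairs,
            both indices taken from the same chain.
    """
    if not hbonds:
        return 0.0
    n_local = sum(is_local(i, j, window) for i, j in hbonds)
    return n_local / len(hbonds)


# Made-up residue index pairs (not real data).
hbonds = [(10, 6), (10, 7), (25, 25), (40, 90)]
print(local_fraction(hbonds))
```

Here three of the four toy bonds fall inside the window, so the fraction is 0.75. Detecting the hydrogen bonds themselves (distance and angle criteria) is a separate step not covered by this sketch.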

Some of the salient features of such hydrogen bonds are as follows. (a) The Ser/Thr residues show the greatest preference in forming intra-helical hydrogen bonds between the atoms Oγ(i) and O(i-4). Such hydrogen bonds form motifs of the form αRαRαRαR(g-) and are most commonly observed at the middle of α-helices. (b) These residues also show great preference to form hydrogen bonds between Oγ(i) and O(i-3), which are closely related to the previous type and, though intra-helical, are more often found at the C-termini of helices than at the middle. The motif represented by αRαRαRαR(g+) is most preferred in these cases. (c) The Ser, Thr and Glu residues form intra-residue hydrogen bonds (between the side-chain and main-chain of the same residue). (d) The side-chain acceptor atoms of Asn/Asp and Ser/Thr residues show high preference to form hydrogen bonds with main-chain donors two residues ahead in the chain, which are characterised by the motifs β(tt’)αR and β(t)αR, respectively. These hydrogen bonded segments, referred to as Asx turns, are known to provide stability to type I and type I’ β-turns. (e) Ser/Thr residues often form a combination of SC-MC hydrogen bonds, with the side-chain donor hydrogen bonded to the carbonyl oxygen of its own peptide backbone and the side-chain acceptor hydrogen bonded to an amide hydrogen three residues ahead in the sequence. Such motifs are quite often seen at the beginning of α-helices and are characterised by the β(g+)αRαR motif.

A remarkable majority of all these hydrogen bonds are buried from the protein surface, away from the surrounding solvent. This strongly indicates the possibility of side-chains playing the role of the backbone in the protein interior, satisfying the potential hydrogen bonding sites and maintaining the network of hydrogen bonds which is crucial to the structure of the protein.

Chapter 6 provides a detailed characterisation of isolated β-strands. The formation of β-strands in proteins is often associated with the formation of β-sheets. However, β-strands that are not part of β-sheets commonly occur in proteins. This raises questions about the structural role and stability of such isolated β-strands. Using a data set consisting of 250 proteins, 518 isolated β-strands have been identified from 187 proteins. The two important features that distinguish isolated β-strands from β-strands occurring in β-sheets are (i) the high preponderance of prolyl residues in isolated β-strands and (ii) their high solvent exposure. It is shown that the high propensity for proline residues to occur in isolated β-strands is not due to the occurrence of polyproline type segments in the data-set. The propensities of other amino acids to occur in isolated β-strands follow the same trend as those for β-sheet forming β-strands. Isolated β-strands are often characterised by their main-chain amide and carbonyl groups being involved in hydrogen bonding with polar side-chains or water. They are often flanked by irregular loop structures, indicating that they are part of long loops. Analysis of the conservation of such strands among families of homologous protein structures indicates that a sizeable fraction of them are highly conserved. It is suggested that though the formation of isolated β-strands is driven by the intrinsic preferences of amino acid residues, they have many characteristics like loop segments but with repetitive (φ,ψ) values falling within the β-region of the Ramachandran map.
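The propensity comparison above can be illustrated with a common definition: the fraction of a residue type inside the structural class divided by its fraction in the whole data-set, so that values above 1 indicate enrichment. The abstract does not state which formula the thesis uses, so this definition, the function name and the toy residue counts below are all assumptions.

```python
from collections import Counter


def propensities(strand_residues, all_residues):
    """Propensity of each residue type to occur in a structural class.

    propensity(aa) = (fraction of aa among strand residues)
                     / (fraction of aa among all residues)

    strand_residues: residue names found in isolated beta-strands.
    all_residues:    residue names in the whole data-set
                     (the strand residues are a subset of these).
    """
    if not strand_residues or not all_residues:
        raise ValueError("both residue lists must be non-empty")
    strand_counts = Counter(strand_residues)
    total_counts = Counter(all_residues)
    n_strand = len(strand_residues)
    n_total = len(all_residues)
    return {
        aa: (strand_counts[aa] / n_strand) / (total / n_total)
        for aa, total in total_counts.items()
    }


# Made-up composition (not real data): 40 residues overall, 10 in strands.
all_residues = ["PRO"] * 10 + ["GLY"] * 10 + ["VAL"] * 20
strand_residues = ["PRO"] * 4 + ["VAL"] * 4 + ["GLY"] * 2
result = propensities(strand_residues, all_residues)
print(result)
```

In this toy composition proline is enriched in strands (propensity above 1) while glycine and valine are depleted. Computing the same quantity separately for isolated strands and for sheet-forming strands would allow the trend comparison described in the text.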

In addition to the material described in the six chapters above, the thesis also contains the details of work carried out on an aspect slightly different from the main theme of the thesis. This pertains to the comparative analysis of the members of a family of cytokine receptors to derive information to model new members of the family. The three-dimensional modelling of the leptin receptor has been used as a case study and the details are included as an appendix.

The Appendix describes the 3-dimensional model of the satiety factor receptor (the leptin receptor) built using the principles of homology modelling. Recessive mutations in the mouse obese (ob) and diabetes (db) genes result in obesity and diabetes in a syndrome resembling human obesity. Data from parabiosis (cross circulation) experiments suggested that the ob gene encoded a circulating factor, called leptin, which regulated energy balance, and that the db gene encoded the receptor for this factor. While the structure of leptin has been determined, that of its cognate receptor is as yet unknown. The leptin receptor shows low but clear sequence similarity to the members of the interleukin type 6 family of receptors. The structures of the members of this family are characterised by two β-sandwich like domains connected by a short 4-residue helical linker. The 3-dimensional models for the N- and C-terminal domains of the leptin receptor were generated using the corresponding structures of the signal transducing component of gp130, the erythropoietin receptor and the prolactin receptor. Further, using the evidence that leptin binds to its receptor with a stoichiometry of 1:1, the relative orientation of the two domains was modelled based on the structure of the human growth hormone receptor, which also binds its ligand with similar stoichiometry. The complex of leptin with its receptor was also modelled based on the structure of the human growth hormone/receptor complex. The final energy minimised model of the complex elucidates the mode of interaction between leptin and its receptor.

Bibliographical Information:

Advisor:Ramakrishnan, C

School:Indian Institute of Science

School Location:India

Source Type:Master's Thesis

Keywords:biochemistry protein structures proteins polypeptides backbone mimicry leptin receptor amino acid sequence


Date of Publication:04/01/2000
