RESEARCH ARTICLE


Sequence-Only Based Prediction of β-Turn Location and Type Using Collocation of Amino Acid Pairs



Kevin Campbell, Lukasz Kurgan*
Department of Electrical and Computer Engineering, University of Alberta, Canada





Creative Commons License
© 2008 Campbell and Kurgan

open-access license: This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: https://creativecommons.org/licenses/by/4.0/legalcode. This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

* Address correspondence to this author at the Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta, Canada T6G 2V4; E-mail: lkurgan@ece.ualberta.ca


Abstract

Accurate prediction of β-turn (beta-turn) types would contribute to the prediction of tertiary protein structure and would provide useful insights and inputs for fold recognition and drug design. Only one existing sequence-only method predicts beta-turn types (types I and II) for entire protein chains, whereas the proposed method predicts type I, II, IV, VII, and non-specific (NS) beta-turns, filling this gap. The proposed predictor, which is based solely on the protein sequence, is shown to perform similarly to other sequence-only methods for the prediction of beta-turns and beta-turn types. Its main advantage is the simplicity and interpretability of the underlying model. We developed novel sequence-based features that allow identifying beta-turn types and differentiating them from non-beta-turns. The features are computed over tetrapeptides (entire beta-turns) rather than over a window centered on the predicted residue, as in recent competing methods, and thus provide a more biologically sound model. They include 12 features based on the collocation of amino acid pairs, focusing on amino acids (Gly, Asp, and Asn) that are known to be predisposed to form beta-turns. The model also includes features geared towards the exclusion of non-beta-turns, which are based on amino acids known to be strongly detrimental to the formation of beta-turns (Met, Ile, Leu, and Val).

Keywords: Secondary protein structure, Beta-turns, Beta-turn types, Prediction, Collocation of amino acid pairs, Support vector machine.
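
The tetrapeptide-based collocation idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' feature set: the abstract does not define the 12 collocation features, so the function names (`tetrapeptides`, `collocation_features`), the choice of one binary flag per position pair, and the single count of turn-detrimental residues are all assumptions made for illustration. Only the two residue groups (Gly, Asp, Asn and Met, Ile, Leu, Val) are taken from the text.

```python
from itertools import combinations

# Residue groups named in the abstract (one-letter codes).
TURN_FORMERS = frozenset("GDN")    # Gly, Asp, Asn: predisposed to form beta-turns
TURN_BREAKERS = frozenset("MILV")  # Met, Ile, Leu, Val: detrimental to beta-turns


def tetrapeptides(sequence):
    """Yield (start_index, tetrapeptide) for every 4-residue window."""
    for i in range(len(sequence) - 3):
        yield i, sequence[i:i + 4]


def collocation_features(tetra):
    """Return a hypothetical feature vector for one tetrapeptide.

    The vector holds six binary flags, one per position pair
    (0-1, 0-2, 0-3, 1-2, 1-3, 2-3), set to 1 when both positions
    contain a turn-forming residue, followed by the number of
    turn-breaking residues in the tetrapeptide.
    This layout is an assumption, not the published feature definition.
    """
    if len(tetra) != 4:
        raise ValueError("expected exactly four residues")
    pair_flags = [
        int(tetra[a] in TURN_FORMERS and tetra[b] in TURN_FORMERS)
        for a, b in combinations(range(4), 2)
    ]
    breaker_count = sum(res in TURN_BREAKERS for res in tetra)
    return pair_flags + [breaker_count]


if __name__ == "__main__":
    # Print one feature vector per candidate tetrapeptide of a toy sequence.
    for start, tetra in tetrapeptides("AGDNGMILV"):
        print(start, tetra, collocation_features(tetra))
```

Vectors of this kind, computed for every tetrapeptide in a chain, could serve as input to a classifier such as the support vector machine listed in the keywords. The actual encoding and training setup would have to follow the definitions in the full article.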