by Michael Fernández, Julio Caballero, Leyden Fernández, José Abreu, Gianco Acosta

Abstract:

This work reports a novel 3D pseudofolding graph representation of protein sequences for modeling purposes. Amino acid Euclidean distance matrices (EDMs) encode primary structural information. Amino Acid Pseudo-Folding 3D Distances Count (AAp3DC) descriptors, calculated from the EDMs of a large data set of 1363 single protein mutants of 64 proteins, were tested for building a classifier for the sign of the change in thermal unfolding Gibbs free energy (ΔΔG) upon single mutations. An optimum support vector machine (SVM) with a radial basis function (RBF) kernel recognized stable and unstable mutants with accuracies over 70% in cross-validation tests. To the best of our knowledge, this result for stable mutant recognition is the highest ever reported for a sequence-based predictor with more than 1000 mutants. Furthermore, the model adequately classified mutations associated with diseases of human prion protein and human transthyretin. © 2007 Wiley-Liss, Inc.

Reference:

Classification of conformational stability of protein mutants from 3D pseudo-folding graph representation of protein sequences using support vector machines (Michael Fernández, Julio Caballero, Leyden Fernández, José Abreu, Gianco Acosta), In Proteins: Structure, Function and Genetics, volume 70, 2008. (http://www.scopus.com/inward/record.url?eid=2-s2.0-37349058879&partnerID=40&md5=4b7d6b5621bc1ab360cdc6ad4731b6bf http://onlinelibrary.wiley.com/doi/10.1002/prot.21524/abstract) (cited By (since 1996) 16)

Bibtex Entry:

@Article{Fernandez2008c,
Title = {Classification of conformational stability of protein mutants from 3D pseudo-folding graph representation of protein sequences using support vector machines},
Author = {Michael Fernández and Julio Caballero and Leyden Fernández and José Abreu and Gianco Acosta},
Journal = {Proteins: Structure, Function and Genetics},
Year = {2008},
Note = {cited By (since 1996) 16},
Number = {1},
Pages = {167-175},
Volume = {70},
Abstract = {This work reports a novel 3D pseudofolding graph representation of protein sequences for modeling purposes. Amino acid Euclidean distance matrices (EDMs) encode primary structural information. Amino Acid Pseudo-Folding 3D Distances Count (AAp3DC) descriptors, calculated from the EDMs of a large data set of 1363 single protein mutants of 64 proteins, were tested for building a classifier for the sign of the change in thermal unfolding Gibbs free energy ($\Delta\Delta G$) upon single mutations. An optimum support vector machine (SVM) with a radial basis function (RBF) kernel recognized stable and unstable mutants with accuracies over 70\% in cross-validation tests. To the best of our knowledge, this result for stable mutant recognition is the highest ever reported for a sequence-based predictor with more than 1000 mutants. Furthermore, the model adequately classified mutations associated with diseases of human prion protein and human transthyretin. 2007 Wiley-Liss, Inc.},
Affiliation = {Center for Biotechnological Studies, Faculty of Agronomy, University of Matanzas, 44740 Matanzas, Cuba; Centro de Bioinformática Y Simulación Molecular, Universidad de Talca, 2 Norte 685, Casilla 721, Talca, Chile; Artificial Intelligence Lab., Faculty of Informatics, University of Matanzas, 44740 Matanzas, Cuba; National Bioinformatics Center, 10200, Havana, Cuba},
Author_keywords = {Graph similarity; Kernel-based methods; Point mutations; Protein stability prediction},
Comment = {http://www.scopus.com/inward/record.url?eid=2-s2.0-37349058879&partnerID=40&md5=4b7d6b5621bc1ab360cdc6ad4731b6bf http://onlinelibrary.wiley.com/doi/10.1002/prot.21524/abstract},
Document_type = {Article},
Doi = {http://dx.doi.org/10.1002/prot.21524},
Owner = {2008_Proteins_70_167},
Source = {Scopus},
Url = {http://www.ncbi.nlm.nih.gov/pubmed/17654549}
}