by Leyden Fernández, Julio Caballero, José Abreu, Michael Fernández

Abstract:

Development of novel computational approaches for modeling protein properties from their primary structure is the main goal in applied proteomics. In this work, we report the extension of the autocorrelation vector formalism to amino acid sequences for encoding protein structural information for modeling purposes. Amino acid sequence autocorrelation (AASA) vectors were calculated by measuring the autocorrelations, at sequence lags ranging from 1 to 15 on the protein primary structure, of 48 amino acid/residue properties selected from the AAindex database. A total of 720 AASA descriptors were tested for building predictive models of the change in thermal unfolding Gibbs free energy (ΔΔG) of gene V protein upon mutation. To this end, ensembles of Bayesian-regularized genetic neural networks (BRGNNs) were used to obtain an optimum nonlinear model for the conformational stability. The ensemble predictor explained about 88% and 66% of the data variance in the training and test sets, respectively. Furthermore, the optimum AASA vector subset not only helped to successfully model unfolding stability but also distributed the wild-type and mutant gene V proteins well on a stability self-organized map (SOM) when used for unsupervised training of competitive neurons. © 2007 Wiley-Liss, Inc.
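The descriptor count in the abstract (48 properties × 15 lags = 720) can be illustrated with a minimal sketch. The abstract does not give the exact autocorrelation formula, so the averaged-product form used here is an assumption, and the names `TOY_SCALE`, `autocorrelation`, and `aasa_vector` are hypothetical; the toy property values are made up and are not AAindex entries.

```python
from typing import Dict, List

# Toy property scale with made-up values (NOT taken from AAindex).
TOY_SCALE: Dict[str, float] = {"A": 1.0, "G": 2.0, "V": 3.0}


def autocorrelation(values: List[float], lag: int) -> float:
    """Average product of property values for residues `lag` positions apart.

    Assumed form: (1 / (n - lag)) * sum_i p_i * p_(i + lag).
    The published definition may use a different normalization.
    """
    n_pairs = len(values) - lag
    if n_pairs <= 0:
        return 0.0  # sequence too short for this lag
    return sum(values[i] * values[i + lag] for i in range(n_pairs)) / n_pairs


def aasa_vector(
    sequence: str,
    scales: List[Dict[str, float]],
    max_lag: int = 15,
) -> List[float]:
    """Concatenate autocorrelations over all property scales and lags 1..max_lag."""
    descriptors: List[float] = []
    for scale in scales:
        # Map each residue to its property value on this scale.
        values = [scale[residue] for residue in sequence]
        for lag in range(1, max_lag + 1):
            descriptors.append(autocorrelation(values, lag))
    return descriptors


if __name__ == "__main__":
    # 48 scales (here 48 copies of the toy scale) and 15 lags give 720 descriptors.
    vector = aasa_vector("AGV" * 10, [TOY_SCALE] * 48)
    print(len(vector))
```

Under these assumptions, each protein sequence maps to one fixed-length vector regardless of sequence length, which is what allows it to be fed to a neural network or a self-organized map.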

Reference:

Amino acid sequence autocorrelation vectors and Bayesian-regularized genetic neural networks for modeling protein conformational stability: Gene V protein mutants (Leyden Fernández, Julio Caballero, José Abreu, Michael Fernández), In Proteins: Structure, Function and Genetics, volume 67, 2007. (http://www.scopus.com/inward/record.url?eid=2-s2.0-34248549553&partnerID=40&md5=f66dc66c5168e61dc16eabf35cb28829 http://onlinelibrary.wiley.com/doi/10.1002/prot.21349/abstract) (cited By (since 1996) 28).

Bibtex Entry:

@Article{Fernandez2007,
Title = {Amino acid sequence autocorrelation vectors and Bayesian-regularized genetic neural networks for modeling protein conformational stability: Gene V protein mutants},
Author = {Leyden Fernández and Julio Caballero and José Abreu and Michael Fernández},
Journal = {Proteins: Structure, Function and Genetics},
Year = {2007},
Note = {cited By (since 1996) 28},
Number = {4},
Pages = {834-852},
Volume = {67},
Abstract = {Development of novel computational approaches for modeling protein properties from their primary structure is the main goal in applied proteomics. In this work, we report the extension of the autocorrelation vector formalism to amino acid sequences for encoding protein structural information for modeling purposes. Amino acid sequence autocorrelation (AASA) vectors were calculated by measuring the autocorrelations, at sequence lags ranging from 1 to 15 on the protein primary structure, of 48 amino acid/residue properties selected from the AAindex database. A total of 720 AASA descriptors were tested for building predictive models of the change in thermal unfolding Gibbs free energy (ΔΔG) of gene V protein upon mutation. To this end, ensembles of Bayesian-regularized genetic neural networks (BRGNNs) were used to obtain an optimum nonlinear model for the conformational stability. The ensemble predictor explained about 88% and 66% of the data variance in the training and test sets, respectively. Furthermore, the optimum AASA vector subset not only helped to successfully model unfolding stability but also distributed the wild-type and mutant gene V proteins well on a stability self-organized map (SOM) when used for unsupervised training of competitive neurons. © 2007 Wiley-Liss, Inc.},
Affiliation = {Center for Biotechnological Studies, Faculty of Agronomy, University of Matanzas, 44740 Matanzas, Cuba; Artificial Intelligence Lab., Faculty of Informatics, University of Matanzas, 44740 Matanzas, Cuba},
Author_keywords = {Artificial neural networks; Bayesian regularization; Genetic algorithm; Point mutations; Protein stability prediction},
Comment = {http://www.scopus.com/inward/record.url?eid=2-s2.0-34248549553&partnerID=40&md5=f66dc66c5168e61dc16eabf35cb28829 http://onlinelibrary.wiley.com/doi/10.1002/prot.21349/abstract},
Document_type = {Article},
Doi = {10.1002/prot.21349},
Owner = {2007_Proteins_67_834},
Source = {Scopus},
Url = {http://www.ncbi.nlm.nih.gov/pubmed/17377990}
}