DNA sequences derived from genomes and metagenomes encode a wealth of information about protein structure and function. However, because of the large number of available sequences, computational and statistical methods are necessary to infer biological meaning. Here, three approaches are explored that infer protein structure or function from microbial genomes and metagenomes. First, the host-pathogen interaction between human macrophages and Mycobacterium leprae is investigated. By comparing human functional lipase domains upregulated in lepromatous lesions with the genomic repertoires of several mycobacteria, we find that host proteins may complement lipid-associated metabolic deficiencies of M. leprae. Second, function is inferred for protein families in an ocean metagenome by identifying conserved genomic neighbors with known functions. This approach correctly infers function for many well-annotated proteins and suggests high-confidence functions for several large novel protein families. Further scrutiny of the genomic neighbors reveals that many of the novel families are phage proteins, and that many other phage protein families are of bacterial origin. Finally, the information contained in large protein families derived from genome and metagenome sequences is exploited to infer residue pairs that are in contact in the three-dimensional structures of proteins. We integrate multiple lines of evidence via a Bayesian inference procedure to produce a posterior probability of contact for all residue pairs in a protein. We use these probabilistic predicted contacts to evaluate predicted 3D protein models, and find that the models that best satisfy the predicted contacts are those most similar to the correct protein structures.

... 2 University of Southern California, Los Angeles, California, United States of America, 3 Molecular Biology Institute, ... (GOS) revealed a high abundance of viral sequences, representing approximately 3% of the total predicted proteins.
Title: Computational Inference of Protein Structure and Function from Microbial Genomes and Metagenomes
Publisher: ProQuest, 2008