Nucleic Acids Research, Vol. 40, No. 14, 2012; Pages: xxx - xxx

Rapid identification of high-confidence taxonomic assignments for metagenomic data

Norman J. MacDonald, Donovan H. Parks and Robert G. Beiko

Faculty of Computer Science, Dalhousie University, 6050 University Avenue, PO BOX 15000, Halifax, NS B3H 4R2, Canada.

Abstract

Determining the taxonomic lineage of DNA sequences is an important step in metagenomic analysis. Short DNA fragments from next-generation sequencing projects and microbes that lack close relatives in reference databases of sequenced genomes pose significant problems for taxonomic attribution methods. Our new classification algorithm, RITA (Rapid Identification of Taxonomic Assignments), uses the agreement between composition and homology to accurately classify sequences as short as 50 nt in length by assigning them to different classification groups with varying degrees of confidence. RITA is much faster than the hybrid PhymmBL approach when comparable homology search algorithms are used, and achieves slightly better accuracy than PhymmBL on an artificial metagenome. RITA can also incorporate prior knowledge about taxonomic distributions to increase the accuracy of assignments in data sets with varying degrees of taxonomic novelty, and classifies sequences with higher precision than the current best rank-flexible classifier. The accuracy on short reads can be increased by exploiting paired-end information, if available, which we demonstrate on a recently published bovine rumen data set. Finally, we develop a variant of RITA that incorporates accelerated homology search techniques, and generate predictions on a set of human gut metagenomes that were previously assigned to different ‘enterotypes’. RITA is freely available in Web server and standalone versions.

Keywords: taxonomic lineage of DNA sequences; Rapid Identification of Taxonomic Assignments; rank-flexible classifier.
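
The central idea in the abstract, using agreement between a composition-based prediction and a homology-based prediction to sort reads into groups of differing confidence, can be illustrated with a small sketch. This is not the published RITA pipeline: the function name, the group labels, and the tie-breaking rule (preferring the homology prediction when the two disagree) are all assumptions made here for illustration only.

```python
from typing import NamedTuple, Optional


class Assignment(NamedTuple):
    taxon: Optional[str]  # predicted taxon, or None if nothing could be assigned
    group: str            # hypothetical confidence-group label


def classify_by_agreement(composition_taxon: Optional[str],
                          homology_taxon: Optional[str]) -> Assignment:
    """Toy agreement rule (illustrative assumption, not RITA's actual logic)."""
    # Both methods produced a prediction and they match: most trusted group.
    if composition_taxon is not None and composition_taxon == homology_taxon:
        return Assignment(composition_taxon, "agreement")
    # Both produced a prediction but they differ: arbitrarily prefer homology.
    if composition_taxon is not None and homology_taxon is not None:
        return Assignment(homology_taxon, "disagreement")
    # Only one method produced a prediction.
    if homology_taxon is not None:
        return Assignment(homology_taxon, "homology-only")
    if composition_taxon is not None:
        return Assignment(composition_taxon, "composition-only")
    # Neither method produced a prediction.
    return Assignment(None, "unclassified")
```

In such a scheme, downstream analyses could restrict themselves to the "agreement" group when precision matters most, and include the lower-confidence groups when broader coverage is needed.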


 

 

 