Performances of Bioinformatics Pipelines for the Identification of Pathogens in Clinical Samples with the De Novo Assembly Approaches: Focus on 2009 Pandemic Influenza A (H1N1)
Abstract
Diagnostic assays for pathogen detection are critical components of public-health monitoring efforts. In view of the limitations of methods that target specific agents, new approaches are required for the identification of novel, modified or ‘unsuspected’ pathogens in public-health monitoring schemes. A metagenomic approach is an attractive option for the rapid identification of these pathogens. The analysis of metagenomic libraries requires fast computation and appropriate algorithms to characterize sequences. In this paper, we compared the computational efficiency of different bioinformatic pipelines, established ad hoc and based on de novo assembly of pathogen genomes, using a data set generated with a 454 genome sequencer from respiratory samples of patients diagnosed with 2009 pandemic influenza A (H1N1). The results indicate high computational efficiency for the different bioinformatic pipelines, which reduce the number of alignments compared with identification based on the alignment of individual reads. The resulting computational time, added to the processing and sequencing time, is compatible with diagnostic needs. The pipelines described here are useful for the unbiased analysis of clinical samples from patients with infectious diseases, an analysis that may be relevant not only for the rapid identification but also for the extensive genetic characterization of viral pathogens without the need for culture amplification.
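The reduction in alignments can be illustrated with a minimal sketch. It assumes that each assembled contig is aligned once against a reference database and that reads left out of every contig (singletons) are still aligned individually; this treatment of singletons is an assumption for illustration, not a description of the pipelines compared here. The function name `alignment_counts`, the read and contig identifiers, and the toy data are hypothetical and are not results from this study.

```python
def alignment_counts(read_to_contig):
    """Compare how many database alignments each strategy needs.

    read_to_contig maps a read ID to the ID of the contig it was
    assembled into, or to None if the read stayed unassembled.
    Returns (per_read, per_assembly).
    """
    # Read-based identification: one alignment per read.
    per_read = len(read_to_contig)

    # Assembly-based identification: one alignment per distinct contig,
    # plus one per unassembled read (assumption: singletons are kept).
    contigs = {c for c in read_to_contig.values() if c is not None}
    singletons = sum(1 for c in read_to_contig.values() if c is None)
    per_assembly = len(contigs) + singletons

    return per_read, per_assembly


# Hypothetical toy input: six reads, two contigs, one singleton.
toy = {
    "r1": "c1", "r2": "c1", "r3": "c1",
    "r4": "c2", "r5": "c2",
    "r6": None,
}
per_read, per_assembly = alignment_counts(toy)
print(per_read, per_assembly)  # 6 3
```

In this toy case the assembly step halves the number of alignments; the saving grows with the number of reads that each contig absorbs.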