Performances of Bioinformatics Pipelines for the Identification of Pathogens in Clinical Samples with the De Novo Assembly Approaches: Focus on 2009 Pandemic Influenza A (H1N1)

Tommaso Biagini1, Barbara Bartolini2, Emanuela Giombini2, Maria R. Capobianchi2, Fabrizio Ferrè1, Giovanni Chillemi3, *, Alessandro Desideri1, *
1 Department of Biology, University of Rome "Tor Vergata", Via della Ricerca Scientifica, 00133, Rome, Italy and Molecular Digital Diagnostics (MDD), Via S. Camillo de Lellis 01100 Viterbo, Italy
2 “L. Spallanzani” National Institute for Infectious Diseases, Via Portuense 292, 00149 Rome, Italy
3 Inter-University Consortium for the Application of Super-Computing for Universities and Research-CASPUR, Via dei Tizii 6, Rome 00185, Italy

© 2013 Biagini et al.

open-access license: This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

* Address correspondence to this author at the Department of Biology, University of Rome "Tor Vergata", Via della Ricerca Scientifica 1, 00133, Roma Italia; Tel: +39.06.72594376; Fax +39.06.2022798; E-mails: and


Diagnostic assays for pathogen detection are critical components of public-health monitoring efforts. Given the limitations of methods that target specific agents, new approaches are required to identify novel, modified or ‘unsuspected’ pathogens in public-health monitoring schemes. The metagenomic approach is an attractive option for the rapid identification of these pathogens. The analysis of metagenomic libraries requires fast computation and appropriate algorithms to characterize sequences. In this paper, we compared the computational efficiency of different bioinformatic pipelines, established ad hoc and based on de novo assembly of pathogen genomes, using a data set generated with a 454 genome sequencer from respiratory samples of patients diagnosed with 2009 pandemic influenza A (H1N1). The results indicate high computational efficiency for the different bioinformatic pipelines, which reduce the number of alignments with respect to identification based on the alignment of individual reads. The resulting computational time, added to the processing/sequencing time, is compatible with diagnostic needs. The pipelines described here are useful for the unbiased analysis of clinical samples from patients with infectious diseases, an analysis that may be relevant not only for the rapid identification but also for the extensive genetic characterization of viral pathogens without the need for culture amplification.

Keywords: Bioinformatics pipeline, de novo assembly, genome reconstruction, metagenomics, pathogen detection, pyrosequencing.
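
The abstract's central claim, that assembling reads de novo before database searching reduces the number of alignments compared with aligning every read individually, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function name `alignment_queries`, the read-to-contig mapping format, the choice to align unassembled reads (singletons) one by one, and the toy data, which are not taken from the study.

```python
from typing import Dict, Optional, Tuple


def alignment_queries(read_to_contig: Dict[str, Optional[str]]) -> Tuple[int, int]:
    """Count database queries for two hypothetical identification strategies.

    read_to_contig maps a read ID to the contig it was assembled into,
    or to None if the assembler left the read unassembled (a singleton).
    Returns (queries when every read is aligned individually,
             queries when each contig and each singleton is aligned once).
    """
    per_read = len(read_to_contig)
    # One query per distinct contig produced by the assembler.
    contigs = {c for c in read_to_contig.values() if c is not None}
    # Assumption: singletons are still aligned individually.
    singletons = sum(1 for c in read_to_contig.values() if c is None)
    return per_read, len(contigs) + singletons


# Toy assembler output: six reads, two contigs, one singleton (made-up data).
toy_assembly = {
    "read_1": "contig_A",
    "read_2": "contig_A",
    "read_3": "contig_A",
    "read_4": "contig_B",
    "read_5": "contig_B",
    "read_6": None,
}

per_read, assembled = alignment_queries(toy_assembly)
print(f"per-read queries: {per_read}, post-assembly queries: {assembled}")
print(f"reduction factor: {per_read / assembled:.1f}x")
```

In this toy case six per-read alignments collapse to three. In a real run the saving would depend on how many reads the assembler manages to merge into contigs, which in turn depends on the sample and on sequencing depth.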