Extraordinary Command Line: Basic Data Editing Tools for Biologists Dealing with Sequence Data

The Open Bioinformatics Journal 31 December 2020 LETTER DOI: 10.2174/1875036202013010137


The command line is a standard way of working with the Linux operating system. It offers many features essential for efficient data editing and analysis, which makes it very useful in bioinformatics applications. Commands allow rapid manipulation of large ASCII files or large numbers of files, making basic command line programming skills a critical component of modern life science research. This article is not a general guide to Linux commands. In contrast to the many available Linux manuals, it presents basic command line tools that are helpful in handling biological sequence data. It provides a collection of simple and popular hacks intended for users with very basic experience of the Linux command line, and it includes a description of data formats and examples of editing four types of data formats popular in bioinformatics applications.
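
As a rough illustration of the kind of manipulation described above, the following minimal sketch counts the records and residues in a sequence file using only standard tools (grep, tr, wc). It assumes FASTA as the input format, which this abstract does not name explicitly. The file name, the variable names, and the sequences themselves are hypothetical and serve only as an example; this is not taken from the article.

```shell
# Create a small example FASTA file (hypothetical data, for illustration only).
tmpdir=$(mktemp -d)
fasta="$tmpdir/example.fasta"
printf '>seq1\nACGTACGT\n>seq2\nGGCCTTAA\nACGT\n>seq3\nTTTT\n' > "$fasta"

# Count sequences: every FASTA record starts with a '>' header line.
n_seqs=$(grep -c '^>' "$fasta")

# Count residues: drop header lines, remove newlines, count the remaining characters.
# The final tr strips the padding that some wc implementations add.
n_bases=$(grep -v '^>' "$fasta" | tr -d '\n' | wc -c | tr -d ' ')

echo "sequences: $n_seqs"
echo "residues: $n_bases"

# Clean up the temporary directory.
rm -r "$tmpdir"
```

The same two pipelines work unchanged on files that are too large to open comfortably in an editor, because grep, tr, and wc process their input as a stream.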

Keywords: Bash, Command line, Data manipulation, DNA, Linux, Sequence data.