REVIEW ARTICLE

The Development and Progress in Machine Learning for Protein Subcellular Localization Prediction

The Open Bioinformatics Journal, 06 Oct 2022. DOI: 10.2174/18750362-v15-e2208110

Abstract

Protein subcellular localization prediction is a promising research area that aims to identify where a protein resides inside the cell, such as in the nucleus, in the cytoplasm, or on the cell membrane. With the rapid development of next-generation sequencing technology, new protein sequences are being discovered at an ever-increasing rate. Traditional wet-lab experimental methods alone are no longer sufficient to determine the subcellular localization of these new proteins. High-throughput computational methods that predict protein subcellular localization quickly and accurately are therefore urgently needed. This review summarizes the development of prediction methods for protein subcellular localization over the past decades, describes the application of various machine learning methods in this field, and compares the properties and performance of well-known predictors. The review is organized around three main types of methods: sequence-based methods, knowledge-based methods, and fusion methods. Special attention is given to the gene ontology (GO)-based methods and the PLoc series of methods. Finally, the review discusses future directions for protein subcellular localization prediction.

Keywords: Protein subcellular localization, Machine learning, Gene ontology, Deep learning, mGOASVM, PLoc-Deep-mHum.