Review
doi: 10.1016/j.sbi.2022.102518.
Epub 2023 Jan 3.
Curr Opin Struct Biol.
2023 Feb.
Abstract
Machine and deep learning approaches can leverage the increasingly available massive datasets of protein sequences, structures, and mutational effects to predict variants with improved fitness. Many different approaches are being developed, but systematic benchmarking studies indicate that, although the specifics of the machine learning algorithms matter, the more important constraint is the availability and quality of the data used for training. When little experimental data is available, unsupervised and self-supervised pre-training on generic protein datasets can still perform well after subsequent refinement via hybrid or transfer learning approaches. Overall, recent progress in this field has been staggering, and machine learning approaches will likely play a major role in future breakthroughs in protein biochemistry and engineering.
Keywords:
Convolutional neural network; Deep learning; Hybrid learning; Mutational effect; Protein engineering; Transformer.
Copyright © 2022 Elsevier Ltd. All rights reserved.
Conflict of interest statement
Declaration of competing interest: D.J.D. is a cofounder of Intelligent Proteins, LLC, which uses machine learning for problems of protein engineering. A.V.K., A.D.E., and C.O.W. declare that they have no conflict of interest.
Similar articles
-
Neural networks to learn protein sequence-function relationships from deep mutational scanning data.
Proc Natl Acad Sci U S A. 2021 Nov 30;118(48):e2104878118. doi: 10.1073/pnas.2104878118.
Proc Natl Acad Sci U S A. 2021. PMID: 34815338
Free PMC article.
-
Deep neural network models for computational histopathology: A survey.
Med Image Anal. 2021 Jan;67:101813. doi: 10.1016/j.media.2020.101813. Epub 2020 Sep 25.
Med Image Anal. 2021. PMID: 33049577
Free PMC article. Review.
-
3-D Deconvolutional Networks for the Unsupervised Representation Learning of Human Motions.
IEEE Trans Cybern. 2022 Jan;52(1):398-410. doi: 10.1109/TCYB.2020.2973300. Epub 2022 Jan 11.
IEEE Trans Cybern. 2022. PMID: 32149670
-
Deep Dive into Machine Learning Models for Protein Engineering.
J Chem Inf Model. 2020 Jun 22;60(6):2773-2790. doi: 10.1021/acs.jcim.0c00073. Epub 2020 May 5.
J Chem Inf Model. 2020. PMID: 32250622
-
Unsupervised and self-supervised deep learning approaches for biomedical text mining.
Brief Bioinform. 2021 Mar 22;22(2):1592-1603. doi: 10.1093/bib/bbab016.
Brief Bioinform. 2021. PMID: 33569575
Review.
Cited by
-
Advances in ligand-specific biosensing for structurally similar molecules.
Cell Syst. 2023 Dec 20;14(12):1024-1043. doi: 10.1016/j.cels.2023.10.009.
Cell Syst. 2023. PMID: 38128482
Review.
-
Semantic search using protein large language models detects class II microcins in bacterial genomes.
bioRxiv. 2023 Nov 15:2023.11.15.567263. doi: 10.1101/2023.11.15.567263. Preprint.
bioRxiv. 2023. PMID: 38014091
Free PMC article.
-
Ensemble Learning with Supervised Methods Based on Large-Scale Protein Language Models for Protein Mutation Effects Prediction.
Int J Mol Sci. 2023 Nov 18;24(22):16496. doi: 10.3390/ijms242216496.
Int J Mol Sci. 2023. PMID: 38003686
Free PMC article.
-
Machine Learning-Guided Protein Engineering.
ACS Catal. 2023 Oct 13;13(21):13863-13895. doi: 10.1021/acscatal.3c02743. eCollection 2023 Nov 3.
ACS Catal. 2023. PMID: 37942269
Free PMC article. Review.
-
Two sequence- and two structure-based ML models have learned different aspects of protein biochemistry.
Sci Rep. 2023 Aug 16;13(1):13280. doi: 10.1038/s41598-023-40247-w.
Sci Rep. 2023. PMID: 37587128
Free PMC article.