Baylor College of Medicine

Analytical tool predicts genes that can cause disease by producing altered proteins

AI reveals how protein modifications link mutations to disease

Molly Chiu

713-798-4710

Houston, TX -

Researchers at Baylor College of Medicine have developed an artificial intelligence (AI) model that reveals how protein modifications link genetic mutations to disease. The method, called DeepMVP and published in Nature Methods, significantly outperforms previously published models and has implications for the development of novel therapeutics.

“Proteins are responsible for all the functions of the body, from growing tissues to regulating metabolism or fighting disease. Their functions are often regulated by modifications that take place after proteins are made through a process called post-translational modification (PTM),” said corresponding author Dr. Bing Zhang, professor at the Lester and Sue Smith Breast Center and of molecular and human genetics at Baylor. He also is a McNair scholar and a member of Baylor’s Dan L Duncan Comprehensive Cancer Center.

The modifications include the addition of chemical groups, such as phosphates or sugars, that influence how a protein behaves, where it goes in the cell or how long it lasts. When PTMs go wrong, the proteins may not perform as expected and may contribute to diseases like cancer, heart conditions or neurological disorders.

Understanding where PTMs happen can help predict how mutations in these locations may change a protein’s function in ways that affect a person’s health. For instance, PTMs can be disrupted by DNA mutations that can remove a PTM site in a protein, create a new site or affect nearby regions, altering the protein’s function.
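The three cases above can be illustrated with a small Python sketch. It is a toy illustration only, not DeepMVP and not the authors' method: it assumes phosphorylation as the PTM (which occurs on serine, threonine and tyrosine), and the names `Mutation`, `classify_mutation` and the `window` flank size are hypothetical choices made for this example.

```python
from dataclasses import dataclass

# Residues that can carry a phosphate group: serine, threonine, tyrosine.
PHOSPHO_RESIDUES = {"S", "T", "Y"}


@dataclass
class Mutation:
    position: int  # 1-based position in the protein sequence
    ref: str       # original amino acid
    alt: str       # substituted amino acid


def classify_mutation(sequence, known_sites, mutation, window=7):
    """Label how a single substitution relates to known phosphorylation sites.

    known_sites: set of 1-based positions annotated as phosphorylated.
    window: hypothetical flank size (in residues) treated as "nearby".
    """
    if sequence[mutation.position - 1] != mutation.ref:
        raise ValueError("reference residue does not match the sequence")

    at_known_site = mutation.position in known_sites

    # Case 1: the modified residue is replaced by one that cannot be phosphorylated.
    if at_known_site and mutation.alt not in PHOSPHO_RESIDUES:
        return "removes_site"

    # Case 2: a residue that could not be phosphorylated becomes one that could.
    if (not at_known_site
            and mutation.ref not in PHOSPHO_RESIDUES
            and mutation.alt in PHOSPHO_RESIDUES):
        return "may_create_site"

    # Case 3: the change falls in the region flanking an existing site.
    if any(abs(mutation.position - site) <= window
           for site in known_sites if site != mutation.position):
        return "near_site"

    return "no_obvious_effect"


# Made-up sequence with one annotated site at position 4 (a serine).
sequence = "MKRSPAGLDE"
known_sites = {4}
print(classify_mutation(sequence, known_sites, Mutation(4, "S", "A")))
print(classify_mutation(sequence, known_sites, Mutation(6, "A", "T")))
print(classify_mutation(sequence, known_sites, Mutation(5, "P", "L")))
```

A rule-based check like this only sorts mutations by position and residue type. Whether a given mutation actually changes a modification depends on the surrounding sequence, which is the kind of pattern a trained model is meant to capture.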

“We developed DeepMVP, a computational model to predict where in a protein PTMs happen and which mutations in those locations can affect PTMs,” said co-first author Dr. Chenwei Wang, a postdoc in the Zhang lab. “To train the model to recognize patterns in protein sequences that indicate PTM sites, we created the PTMAtlas, a curated compendium of 397,524 known PTM sites generated through systematic reprocessing of 241 public datasets. We focused on six common PTMs.”

PTMAtlas includes nearly 400,000 PTM sites across thousands of human proteins, and it is more comprehensive and accurate than other databases. Trained on this resource, DeepMVP can predict PTM sites on all human proteins and even on viral proteins like those from SARS-CoV-2, making it a powerful resource for studying protein modifications.

DeepMVP outperformed eight existing similar tools. When its ability to predict how mutations affect PTMs was tested on a curated set of 235 known mutation-PTM pairs from the scientific literature, DeepMVP correctly predicted the PTM site in 81% of cases and the direction of change (increase or decrease) in 97% of cases.

“We anticipate that DeepMVP can be applied to cancer, neurological conditions and cardiovascular diseases and accelerate discoveries in genetics, cancer biology and drug development,” Zhang said. “The tool is freely available to researchers worldwide at https://deepmvp.ptmax.org/.”

Other contributors to this work include co-first authors Bo Wen and Kai Li, as well as Ping Han, Matthew V. Holt, Sara R. Savage, Jonathan T. Lei, Yongchao Dou, Zhiao Shi and Yi Li. All are affiliated with Baylor College of Medicine.

This study was supported by the National Cancer Institute (NCI) CPTAC award U24CA271076, the Cancer Prevention and Research Institute of Texas award RR160027, funding from the McNair Medical Institute at the Robert and Janice McNair Foundation and NCI grant R01 CA271588. Additional support was provided by NVIDIA Corp. with the donation of the Titan Xp GPU used for this research.
