AI Speeds Rare Disease Diagnosis With Evolutionary Data
- 6 days ago

Researchers from the Center for Genomic Regulation (CRG) in Barcelona and Harvard Medical School (HMS) have developed a sophisticated artificial intelligence (AI) model called popEVE. This new tool is designed to support rare disease diagnosis by identifying which genetic mutations in human proteins are most likely to cause illness, even when those mutations have never been observed in a patient before.
The findings, published in Nature Genetics, introduce popEVE as a potential solution for the estimated one in two people with a rare disease who never receive a clear diagnosis.
PopEVE was constructed using a vast evolutionary record drawn from hundreds of thousands of different species. This "tree of life" data, combined with genetic variation information from large human databases like the UK Biobank and gnomAD, allows the model to learn which parts of the roughly 20,000 human proteins are critical for life and which can tolerate change.
The model represents an important advancement over its predecessor, EVE (Evolutionary model of Variant Effect), which was introduced in 2021. While EVE could classify mutations within a single gene, popEVE is the first model that can meaningfully rank the predicted severity of missense mutations across the entire human proteome. This means a variant in one gene can be directly compared to a variant in another on the same severity scale, allowing doctors to focus immediately on the potentially most damaging variants.
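To illustrate what a single proteome-wide severity scale makes possible, here is a minimal sketch in Python. The gene names, protein changes, and scores are invented for illustration, and the convention that a higher score means a more damaging variant is an assumption of this sketch, not popEVE's actual score format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str        # hypothetical gene label
    change: str      # hypothetical missense change
    severity: float  # made-up score; higher = more damaging in this sketch


def rank_by_severity(variants):
    """Return variants sorted most-damaging first, regardless of gene."""
    return sorted(variants, key=lambda v: v.severity, reverse=True)


# Illustrative values only; these are not real popEVE scores.
patient_variants = [
    Variant("GENE_A", "p.Arg12Cys", 0.31),
    Variant("GENE_B", "p.Gly250Asp", 0.92),
    Variant("GENE_C", "p.Leu77Pro", 0.58),
]

ranked = rank_by_severity(patient_variants)
for position, v in enumerate(ranked, start=1):
    print(position, v.gene, v.change, v.severity)
```

Because every score sits on the same scale, one sort across all genes is enough to bring the most concerning variant to the top of the list.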
In clinical terms, popEVE detects disease-causing mutations with high accuracy. When validated on genetic data from more than 31,000 families affected by severe developmental disorders, popEVE ranked the established causal mutation as the most damaging in 98% of cases. It also outperformed state-of-the-art competitors, including DeepMind's AlphaMissense.
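A figure like "ranked the causal mutation as the most damaging in 98% of cases" can be read as a top-rank rate. The sketch below shows one way such a rate might be computed. It is not the study's evaluation procedure: the case data, variant labels, and the higher-is-worse score convention are all hypothetical.

```python
def top_rank_rate(cases):
    """Fraction of cases in which the known causal variant scores highest.

    Each case is a pair: (dict of variant -> score, causal variant label).
    """
    hits = 0
    for scores, causal in cases:
        top_variant = max(scores, key=scores.get)
        if top_variant == causal:
            hits += 1
    return hits / len(cases)


# Invented toy data: four cases, each with a labelled causal variant.
cases = [
    ({"v1": 0.9, "v2": 0.2}, "v1"),
    ({"v1": 0.4, "v2": 0.7, "v3": 0.1}, "v2"),
    ({"v1": 0.3, "v2": 0.8}, "v1"),  # causal variant is not ranked first
    ({"v1": 0.6, "v2": 0.5}, "v1"),
]

rate = top_rank_rate(cases)
print(f"Top-rank rate: {rate:.0%}")
```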
Furthermore, the analysis uncovered 123 new candidate disease genes, many of which had been observed in just one or two patients. This highlights popEVE's utility in "one-off cases", where traditional methods fail due to a lack of case histories.
The tool makes diagnosis faster, simpler, and potentially cheaper, especially in healthcare systems with limited resources, because it can work with the patient's genetic information alone, without requiring access to parental DNA.
Crucially, popEVE also addresses a significant limitation of genetic tools: ancestry bias. Many existing tools incorrectly flag mutations as disease-causing simply because they have not been seen in predominantly European-ancestry databases. PopEVE avoids this by treating all human variants equally, regardless of their frequency in specific populations, and so produces fewer false positives.
Researchers are currently working to make popEVE scores accessible to clinicians and to integrate them into existing protein databases globally. They believe the model may also help identify new drug targets for genetic conditions.