Overview of participant selection and RF model performance. a, Participant and clinical information was extracted, filtered, and prepared for time points prior to the index time from the UCSF EHR and UCSF Memory and Aging Center (MAC) databases. All extracted clinical features were one-hot encoded and used to train random forest (RF) models to predict future risk of AD diagnosis. The models were evaluated on a 30% held-out evaluation set to calculate AUROC/AUPRC and interpreted based on feature importance using a heterogeneous knowledge network (SPOKE). The top features were then further validated in an external database. b, Filtering of a consistent set of patients with AD and controls from the UCSF EHR for model training and testing. The filtered participant cohorts are described in Table 1, with a 30% held-out set split off for testing. c, Bootstrapped performance of the RF models on the held-out evaluation set (n = 300 bootstrap iterations of 1,000 participants, AD prevalence in the held-out set = 0.003). Also shown is the bootstrapped AUROC performance of models trained and tested on the female and male strata. The boxes display the quartiles (25th, 50th, and 75th percentiles), the whiskers extend to 1.5 times the interquartile range, and the remaining points are outliers. Credit: Nature Aging (2024). DOI: 10.1038/s43587-024-00573-8
Scientists at the University of California, San Francisco have discovered a way to predict Alzheimer's disease up to seven years before symptoms appear by using machine learning to analyze patient records.
The conditions most predictive of Alzheimer's disease were high cholesterol and, in women, osteoporosis, a disease that weakens bones.
This study demonstrates that artificial intelligence (AI) can be used to identify patterns in clinical data, and that those patterns can then be used to sift through large genetic databases to determine what is driving the risk. Researchers hope this will one day speed the diagnosis and treatment of Alzheimer's disease and other complex diseases.
“This is a first step toward using AI on routine clinical data, not only to identify risk as early as possible but also to understand the biology behind it,” said the study's lead author, Alice Tang, an MD/PhD student in the Sirota Lab at UCSF. “The power of this AI approach comes from identifying risk based on combinations of diseases.”
The findings appear in Nature Aging.
The power of clinical data and predictions
Scientists have long sought to discover biological factors and early predictors of Alzheimer's disease, a progressive and ultimately fatal dementia that destroys memory. Alzheimer's disease affects approximately 6.7 million Americans, nearly two-thirds of whom are women. Although the risk of developing the disease increases with age and women tend to live longer than men, this alone does not fully explain why more women than men get the disease.
Using UCSF's clinical database of more than 5 million patients, researchers looked at co-occurring conditions in patients diagnosed with Alzheimer's disease at UCSF's Memory and Aging Center compared with those in people without Alzheimer's disease. They found that they could identify, with 72% predictive power, who would develop the disease up to seven years before diagnosis.
Several factors, such as high blood pressure, high cholesterol, and vitamin D deficiency, were predictive in both men and women. For men, erectile dysfunction and prostate enlargement were also predictive. For women, however, osteoporosis was a particularly important predictor.
This does not mean that everyone with the bone disease, which is common in older women, will develop Alzheimer's disease.
“It is the combination of diseases that allows our model to predict the onset of Alzheimer's disease. Our finding that osteoporosis is one of the predictors in women highlights the biological interplay between bone health and dementia risk,” said Tang.
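The figure caption describes performance as a bootstrapped AUROC on a held-out set (300 iterations of 1,000 participants). The following is a minimal sketch of that style of evaluation, not the study's code: the labels and risk scores are synthetic, the 5% prevalence is an assumption chosen so every resample contains cases (the caption reports 0.003), and the function names `auroc` and `bootstrap_auroc` are hypothetical.

```python
import random
import statistics


def auroc(labels, scores):
    """AUROC via the rank-sum (Mann-Whitney U) formula; ties share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        return None  # AUROC is undefined without both classes
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auroc(labels, scores, n_iter=300, sample_size=1000, seed=0):
    """Resample participants with replacement and collect one AUROC per iteration."""
    rng = random.Random(seed)
    n = len(labels)
    results = []
    for _ in range(n_iter):
        idx = [rng.randrange(n) for _ in range(sample_size)]
        value = auroc([labels[i] for i in idx], [scores[i] for i in idx])
        if value is not None:
            results.append(value)
    return results


# Synthetic held-out set (assumption): cases score somewhat higher than controls.
rng = random.Random(42)
labels = [1 if rng.random() < 0.05 else 0 for _ in range(5000)]
scores = [rng.gauss(0.6 if y else 0.4, 0.2) for y in labels]

aurocs = bootstrap_auroc(labels, scores)
q1, median, q3 = statistics.quantiles(aurocs, n=4)  # the box in a box plot
print(f"AUROC median {median:.2f} (IQR {q1:.2f} to {q3:.2f})")
```

The three quartiles correspond to the box edges and center line that the caption describes; at a prevalence as low as 0.003, some resamples of 1,000 would contain no cases at all, which is why the sketch returns `None` for those and skips them.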
Precision medicine approach
To understand the biology underlying the model's predictive power, the researchers turned to public molecular databases and a specialized tool developed at UCSF called SPOKE (Scalable Precision Medicine Oriented Knowledge Engine). SPOKE was developed in the laboratory of Dr. Sergio Baranzini, professor of neurology and member of the UCSF Weill Institute for Neurosciences.
SPOKE is essentially a database of databases that researchers can use to identify patterns and potential molecular targets for therapy. It picked up the well-known link between Alzheimer's disease and high cholesterol through APOE4, a variant of the apolipoprotein E gene. When combined with genetic databases, however, it also identified a link between osteoporosis and Alzheimer's disease in women through a variant in a lesser-known gene called MS4A6A.
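The idea of linking two conditions through a shared gene can be illustrated with a toy graph. This is only a sketch under stated assumptions: the edges below are limited to the associations named in this article, the helper names `build_graph` and `shared_neighbors` are hypothetical, and nothing here reflects SPOKE's actual data model or interface.

```python
from collections import defaultdict

# Toy disease-gene edges taken from the article's text; this is not SPOKE data.
EDGES = [
    ("Alzheimer's disease", "APOE"),
    ("high cholesterol", "APOE"),
    ("Alzheimer's disease", "MS4A6A"),
    ("osteoporosis", "MS4A6A"),
]


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def shared_neighbors(graph, node_a, node_b):
    """Return nodes directly connected to both inputs, sorted for stable output."""
    return sorted(graph.get(node_a, set()) & graph.get(node_b, set()))


graph = build_graph(EDGES)
print(shared_neighbors(graph, "Alzheimer's disease", "osteoporosis"))      # ['MS4A6A']
print(shared_neighbors(graph, "Alzheimer's disease", "high cholesterol"))  # ['APOE']
```

A real knowledge network holds many node and edge types with supporting evidence, so a shared neighbor is a lead for further investigation rather than a demonstrated mechanism.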
Ultimately, the researchers hope this approach could be used for other difficult-to-diagnose diseases, such as lupus and endometriosis.
“This is a great example of how patient data can be used with machine learning to predict which patients are more likely to develop Alzheimer's disease and to understand why,” said Dr. Marina Sirota, the study's senior author and an associate professor at the Bakar Computational Health Sciences Institute at UCSF.
For more information:
Alice S. Tang et al., Leveraging electronic health records and knowledge networks for Alzheimer's disease prediction and sex-specific biological insights, Nature Aging (2024). DOI: 10.1038/s43587-024-00573-8
Journal information:
Nature Aging