European scientists have created a GPT-based model that can predict the risk of more than a thousand diseases on par with single-disease tools and biomarker-based models [1].
Tell me the future!
For millennia, people have wanted to know what health events await them in the future. Models have been created that can forecast the onset of a single disease reasonably well. However, predicting several health outcomes simultaneously has proven tricky. According to a new study published in Nature by an international group of researchers, the immense power of AI can be harnessed to solve this problem.
The researchers created a model based on the GPT architecture, which powers chatbots such as ChatGPT. The model was fed data from UK Biobank, a huge repository of longitudinal health data on some half a million British citizens.
The team notes that health events, biomarkers, and risk factors create an interconnected network that somewhat resembles language. Just like a large language model, such as GPT, predicts the next word, a model trained on health data can predict the next outcome. Age, sex, BMI, and risk factors such as tobacco use were also included in the model.
“Here we demonstrate,” the paper says, “that attention-based transformer models, similar to LLMs, can be extended to learn lifetime health trajectories and accurately predict future disease rates for more than 1,000 diseases simultaneously on the basis of previous health diagnoses, lifestyle factors and further informative data.”
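As a rough illustration of this analogy (and not the authors' actual encoding scheme), a health trajectory might be turned into a token sequence in much the same way text is tokenized for a GPT model. The vocabulary and event codes below are invented for the example:

```python
# Illustrative sketch only: a hypothetical encoding of a health trajectory
# as a token sequence, analogous to tokenizing text for a GPT-style model.
# The vocabulary is invented here and is not Delphi-2M's actual scheme.

VOCAB = {
    "<pad>": 0,
    "sex:female": 1,
    "smoker:yes": 2,
    "dx:hypertension": 3,
    "dx:type2_diabetes": 4,
    "dx:pneumonia": 5,
}

def encode_trajectory(events):
    """Turn a list of (age_in_years, event_name) pairs into parallel lists
    of token IDs and ages, the kind of input a transformer could consume."""
    token_ids, ages = [], []
    for age, name in events:
        token_ids.append(VOCAB[name])
        ages.append(age)
    return token_ids, ages

# One patient's simplified, made-up history: a model would be trained to
# predict the next token, i.e. the next health event, from sequences like this.
history = [(45, "sex:female"), (50, "smoker:yes"), (58, "dx:hypertension")]
print(encode_trajectory(history))  # ([1, 2, 3], [45, 50, 58])
```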
Big results from big data
The model accepts the patient’s previous health history as a prompt. It then predicts the probability of the next health event in their life (for example, “pneumonia 18%, heart failure 9%, death 3%”, and so on) and the time to that event, with accuracy comparable to that of single-disease tools. Consequently, it can “generate entire future health trajectories,” study co-author Moritz Gerstung, a data scientist at the German Cancer Research Center in Heidelberg, said to Nature. “A health care professional would have to run dozens of them to deliver a comprehensive answer,” he added.
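A minimal sketch of this prompt-in, prediction-out loop is below; `predict_next` is a stubbed stand-in rather than Delphi-2M's actual interface, and the probabilities and times are invented, echoing the illustrative figures above:

```python
# Sketch of how a next-event predictor could be rolled forward to produce an
# entire future health trajectory. The stub below is hypothetical; a real run
# would query the trained transformer instead.
import random

def predict_next(history):
    """Stand-in for the trained model: given a history (the 'prompt'), return
    (event -> probability of being next) plus a rough time-to-event in years."""
    return {"pneumonia": 0.18, "heart failure": 0.09, "death": 0.03,
            "no major event": 0.70}, 2.0

def sample_trajectory(history, max_steps=10):
    """Sample one possible future by repeatedly drawing the next event."""
    trajectory = list(history)
    for _ in range(max_steps):
        probs, years = predict_next(trajectory)
        events, weights = zip(*probs.items())
        nxt = random.choices(events, weights=weights)[0]
        trajectory.append(f"{nxt} (+{years:.0f}y)")
        if nxt == "death":
            break
    return trajectory

print(sample_trajectory(["hypertension@58", "type 2 diabetes@61"]))
```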
The model, called Delphi-2M (after the two million parameters it uses), also generally outperformed a multi-disease predictor trained on 67 UK Biobank biomarkers. However, its predictive power trailed some strong lab markers, such as HbA1c for diabetes, underscoring the value of biomarkers for some endpoints.
The model was validated and tested on UK Biobank data that was not used for training. Then, it was also tested on a separate Danish dataset of about 1.9 million health trajectories, where it showed only slightly reduced accuracy.
Delphi-2M can be used not only for flagging potential health concerns in individuals but also for population-level modeling. For example, it can model thousands of health trajectories for a given region and demographic mix, producing forward-looking estimates of incidence, hospitalizations, deaths, and years lived with disease. Because it preserves competing risks, it also enables ‘what-if’ analysis; for instance, it can estimate the gain in average life expectancy from eliminating cancer.
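Here is a toy illustration of such a ‘what-if’ analysis under competing risks. The hazard rates are invented and constant with age, and the crude year-by-year simulation stands in for sampling full trajectories from the trained model:

```python
# Toy Monte Carlo 'what-if' analysis under competing risks. The hazards are
# invented, age-independent numbers; a real analysis would sample trajectories
# from the trained generative model instead of from these rates.
import random

HAZARDS = {"cancer": 0.010, "cardiovascular": 0.015, "other": 0.010}  # per year

def simulate_lifespan(start_age=40, max_age=110, exclude=None):
    """Simulate one lifetime year by year; death occurs when any active cause
    fires. Optionally exclude a cause to model its elimination."""
    causes = {c: h for c, h in HAZARDS.items() if c != exclude}
    for age in range(start_age, max_age):
        if any(random.random() < hazard for hazard in causes.values()):
            return age
    return max_age

def mean_lifespan(n=10000, exclude=None):
    return sum(simulate_lifespan(exclude=exclude) for _ in range(n)) / n

baseline = mean_lifespan()
no_cancer = mean_lifespan(exclude="cancer")
print(f"estimated gain from eliminating cancer: {no_cancer - baseline:.1f} years")
```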
Delphi-2M, named after the legendary Greek oracle, cannot actually predict death with complete certainty. What it can do, by simulating many futures, is draw a survival curve showing how your risk of death changes with age. This is risk stratification and planning, with uncertainty bands and competing risks baked in, rather than fortune-telling.
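Such a survival curve falls straight out of many simulated futures. A minimal sketch, with death ages drawn from a toy distribution in place of the model's sampled trajectories:

```python
# Sketch: turning many simulated futures into an empirical survival curve.
# sample_death_age() is a toy placeholder for "simulate one future with the
# model and record the age at death".
import random

def sample_death_age():
    return min(110, max(40, random.gauss(82, 10)))  # invented distribution

def survival_curve(n=10000, ages=range(40, 111)):
    """Fraction of simulated futures still alive at each age."""
    deaths = [sample_death_age() for _ in range(n)]
    return {age: sum(d > age for d in deaths) / n for age in ages}

curve = survival_curve()
for age in (50, 70, 90):
    print(f"P(alive at {age}) ≈ {curve[age]:.2f}")
```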
The data bottleneck
The researchers acknowledge several potential problems with their design. One is related to the data they used: UK Biobank recruited participants aged 40-70, so people who died before reaching that age window are absent from the data, which creates bias. Follow-up time is also limited, so the dataset contains little information on people older than 80, meaning that very old-age dynamics are not represented.
Scarcity of data is a major bottleneck for most large AI models in biology. Health systems in various countries sit on mountains of health data spanning from birth to death. Figuring out ways to free this data, while addressing safety concerns and maintaining anonymity, can help accelerate progress in this area. Data from wearables, which are becoming increasingly popular, is another promising source.
Yet another way to augment data for large models is by using synthetic data: that is, data simulated by the model itself. Delphi-2M was asked to generate fake patient histories, and then a new model was trained only on those synthetic records without any real people’s data. Surprisingly, the model trained exclusively on synthetic data was almost as accurate in its predictions as the original model that used UK Biobank data. This approach can help solve the anonymity problem by minimizing the use of real patients’ data.
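To make the idea concrete, here is a toy version of that synthetic-data pipeline. A simple next-event frequency model (a stand-in for the transformer) is fitted on "real" sequences, used to generate synthetic histories, and a second model is then fitted on the synthetic records alone:

```python
# Toy illustration of training on synthetic data only. Event names, sequences,
# and the first-order 'model' are all invented stand-ins for the
# transformer-based pipeline described in the paper.
import random
from collections import Counter, defaultdict

def fit(sequences):
    """Count event -> next-event transitions."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return counts

def generate(model, start, length=5):
    """Sample a synthetic history from the fitted transition counts."""
    seq = [start]
    for _ in range(length - 1):
        options = model.get(seq[-1])
        if not options:
            break
        events, weights = zip(*options.items())
        seq.append(random.choices(events, weights=weights)[0])
    return seq

real = [["hypertension", "diabetes", "heart_failure"],
        ["hypertension", "stroke"],
        ["diabetes", "heart_failure", "death"]] * 100

teacher = fit(real)                                   # fitted on "real" data
synthetic = [generate(teacher, "hypertension") for _ in range(300)]
student = fit(synthetic)                              # trained on synthetic only
print(student["hypertension"])  # transition statistics resemble the teacher's
```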
Literature
[1] Shmatko, A., Jung, A. W., Gaurav, K., Brunak, S., Mortensen, L. H., Birney, E., … & Gerstung, M. (2025). Learning the natural history of human disease with generative transformers. Nature, 1-9.