May 13, 2025

The UK just trained a health AI on 57 million people to predict disease

Given enough of the right data, Foresight could predict patients’ health trajectories. Credit: Wikimedia Commons

The NHS (National Health Service) is the publicly funded healthcare system of the United Kingdom. Its main aim is to provide free medical care to residents at the point of use. But the NHS is also working on a very ambitious research project.

Deep within secure data servers, a powerful new artificial intelligence model has been quietly learning from the lives of nearly every person in the country. The model, called Foresight, has been fed 10 billion fragments of medical history — from hospital visits and COVID-19 vaccinations to deaths — all drawn from the anonymized records of 57 million people.

Its goal is to forecast future illness, anticipate hospitalizations, and guide a sweeping shift from reactive to preventative healthcare.

“This is the first time an AI model has been used within health research on 57 million people,” said Angela Wood, a health-data scientist at the University of Cambridge, during a press briefing. “This is a real step forward.”

What Can Foresight See?

Foresight builds on the same principles as ChatGPT, using a large language model to learn patterns — but instead of completing sentences, it completes health trajectories.
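To make that analogy concrete, here is a deliberately tiny sketch of the idea of "completing health trajectories": treat each coded health event like a word, and predict the most likely next event from the events that came before. This is purely illustrative (a simple bigram counter, not Foresight's architecture, and the event codes are invented), but it shows how next-token prediction transfers from sentences to medical timelines:

```python
from collections import Counter, defaultdict

# Toy patient timelines: each is an ordered list of coded health events.
# (Invented event codes for illustration -- not real NHS data or Foresight's format.)
timelines = [
    ["gp_visit", "diagnosis:hypertension", "rx:amlodipine", "gp_visit"],
    ["gp_visit", "diagnosis:hypertension", "rx:ramipril", "hospital_admission"],
    ["vaccination:covid19", "gp_visit", "diagnosis:hypertension", "rx:amlodipine"],
]

# Count which event follows which, just as a bigram language model
# counts which word follows which.
follow_counts = defaultdict(Counter)
for events in timelines:
    for current, nxt in zip(events, events[1:]):
        follow_counts[current][nxt] += 1

def predict_next(event):
    """Return the most frequently observed event after `event`, or None."""
    counts = follow_counts[event]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("diagnosis:hypertension"))  # -> 'rx:amlodipine'
```

A real model like Foresight replaces the bigram counts with a transformer trained on billions of events, but the framing is the same: given a patient's history so far, predict what comes next.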

Initially developed in 2023 using GPT-3 and just 1.5 million NHS records from London, Foresight has massively grown in scope and sophistication. Its newest version is based on Meta’s open-source LLaMA 2 model and trained on eight national datasets spanning five years of health events.


Dr. Chris Tomlinson, a health-data scientist at University College London and one of the model’s creators, called it “the world’s first national-scale generative AI model of health data.” Speaking at the launch event, he emphasized its transformative potential: “The real potential of Foresight is to predict disease complications before they happen, giving us a valuable window to intervene early, and enabling a shift towards more preventative healthcare at scale.”

For example, he said, it might someday allow clinicians to predict a patient’s risk of unscheduled hospitalization (a common precursor to serious deterioration) and take action before that decline begins. That action might include adjusting medications or targeting interventions based on subtle patterns in the data.


The pilot study currently limits Foresight’s use to COVID-19-related research. But even within this narrow scope, researchers are pushing the boundaries. For now, the model is still being tested: researchers want to see whether it can predict more than 1,000 conditions using past health records from 2018 to 2022.

“That allows us to actually get as close to a ground truth as is possible,” Tomlinson explained.

Privacy, Power, and the Public

The sheer scale of Foresight is both its greatest strength and its greatest liability: no other health model has drawn on so much data from so many people.

Researchers have “de-identified” all the data used to train the model, removing names, birthdates, and addresses. Yet experts caution that anonymity at this scale can never be guaranteed.
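The de-identification described above amounts to dropping direct identifiers before a record is used for research. A toy sketch of that step (the field names here are hypothetical, not the NHS schema) makes the limitation clear: stripping these fields removes obvious identifiers, but the remaining event history can still be distinctive enough to re-identify someone, which is exactly the concern experts raise.

```python
# Hypothetical direct identifiers to strip before research use.
DIRECT_IDENTIFIERS = {"name", "date_of_birth", "address", "nhs_number"}

def de_identify(record):
    """Return a copy of `record` with direct identifiers removed."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

record = {
    "name": "Jane Doe",
    "date_of_birth": "1962-04-01",
    "address": "1 High Street, London",
    "nhs_number": "485 777 3456",
    "events": ["gp_visit", "diagnosis:hypertension", "rx:amlodipine"],
}

# The identifiers are gone, but the rich event sequence remains --
# and that sequence is what makes re-identification hard to rule out.
print(de_identify(record))
```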

“Building powerful generative AI models that protect patient privacy is an open, unsolved scientific problem,” Luc Rocher, a data-privacy researcher at the University of Oxford, told New Scientist. “The very richness of data that makes it valuable for AI also makes it incredibly hard to anonymize. These models should remain under strict NHS control where they can be safely used.”

Patients cannot fully opt out. Those who have declined to share their GP records are excluded, but other data sources, such as hospital visits, vaccination records, and national registries, are not covered by the opt-out. And once a model like Foresight is trained, it is impossible to remove an individual’s record from the model’s memory.

Michael Chapman, director of data access at NHS England, acknowledged the concern. “It’s very hard with rich health data to give 100 per cent certainty that somebody couldn’t be spotted in that dataset,” he said. Still, the AI is confined to a secure environment and supervised by NHS researchers. Even cloud providers like Amazon Web Services and Databricks, which supply the computing infrastructure, cannot access the data.

Even Foresight’s legal status remains a gray area. Under the UK’s interpretation of GDPR, anonymized data isn’t covered. But the Information Commissioner’s Office warns against conflating “de-identified” with truly anonymous data.

“Even if it is being anonymised, it’s something that people feel very strongly about from an ethical point of view… the humans and the ethics need to be the starting point,” said Caroline Green, a digital ethics researcher at Oxford. Credit: Wikimedia Commons

A Microcosm of AI in Society

Foresight is a microcosm of how AI and society interact: enormous promise bound up with questions of consent, privacy, and trust.

If Foresight performs as its developers hope, it could mark a turning point in the way national health systems are managed. It could help clinicians personalize care with unprecedented precision and flag patients on the brink of crisis. But it puts a lot of private data at risk, and we’re not entirely sure whether it will perform as hoped.

“This technology is transforming what’s possible in tackling a host of debilitating diseases,” UK technology secretary Peter Kyle told The Independent. “From diagnosis, to treatment, to prevention.”

But the work is still in its early stages. Foresight is not yet making real-time predictions for patients. Researchers are still testing its accuracy across different demographics and disease types. Its ability to avoid privacy breaches is still unproven.

Still, the long-term success of Foresight may depend less on its code and more on public trust. If people believe their data is being used without consent, the project could lose the social license it depends on. In the race to harness AI in medicine, can the urgency of innovation be reconciled with the imperatives of ethics and accountability?

That is a question we are yet to have any Foresight on.