Research
This interview is part of a broader editorial project by Imminent, featuring conversations with expert professionals collaborating on Physical AI — which begins when machines learn and interact with the real world in real time — within the DVPS project.
DVPS is among the most ambitious projects funded by the European Union in the field of artificial intelligence, backed by an initial investment of €29 million. It brings together 20 organizations from 9 countries to shape the next frontier of AI — one rooted in the interaction of machines with the real world. Building on the success of large language models, DVPS explores the future of AI through multimodal foundation models. Unlike current systems, which learn from representations of the world through text, images, and video, these next-generation models are designed to acquire real-time empirical knowledge via direct interaction with the physical world. By integrating linguistic, visual, and sensor data, they develop a deeper contextual awareness, enhancing human capabilities in situations where trust, precision, and adaptability are critical. The overall initiative is led by Translated, which coordinates the project’s vision and implementation. The team brings together 70 of Europe’s leading AI scientists. The potential applications span several domains, including language, healthcare, and the environment.
Prof. Dr. Sandy Engelhardt has joined the DVPS project, representing Heidelberg University Hospital in Germany and working on healthcare applications in the field of cardiology.
Prof. Dr. Sandy Engelhardt
Sandy Engelhardt, PhD, is Full Professor and Director of the Institute for Artificial Intelligence in Cardiovascular Medicine at Heidelberg University Hospital. Her research focuses on applying AI to cardiovascular precision medicine, integrating multimodal clinical data and real-world datasets to improve diagnosis, treatment, and surgical outcomes. She leads a multicentric Federated Learning initiative within DZHK, contributes to the EU Horizon DVPS Project on Foundation Models, and serves in leadership roles in the ESC and EACTS committees, including Programme Chair of the ESC Digital and AI Summit 2025.
You’ve built a career at the intersection of artificial intelligence and cardiovascular medicine, a unique and complex space. What initially drew you to this field?
My background is in computer science, but early in my PhD I was introduced to clinical cases involving cardiac patients. That experience was both fascinating and deeply rewarding, as I discovered how technical expertise could be applied to solve problems with a direct and meaningful impact on people’s lives. Over time, as I witnessed the complexity and urgency of cardiovascular care, it became clear to me just how much potential AI holds for supporting critical clinical decisions.
The heart is dynamic and multifunctional by nature—delivering oxygen-rich blood to body tissues, collecting oxygen-poor blood, sustaining blood pressure, directing blood through the lungs for oxygenation, and regulating circulation by adjusting the heart rate. This intricate system functions smoothly only when multiple subsystems work in perfect coordination, and this is where multimodality comes into play: diagnosing abnormalities involves time-series signals, medical imaging, and electronic health records. For AI, this represents the ideal challenge: to merge diverse streams of data, identify patterns invisible to the human eye, and transform fragmented information into clear, actionable insights. It’s precisely this intersection of medical complexity, technological capability, and real-world impact that continues to inspire my work in the field.
How can AI-driven integration of multiple clinical data types improve disease detection and patient care?
Accuracy is fundamental in AI-driven disease detection, particularly because relying on a single type of clinical data often introduces significant uncertainty. In cardiovascular medicine, for example, a medical scan might appear normal on its own, yet when combined with ECG readings and laboratory results, AI can identify subtle abnormalities that might otherwise go unnoticed. This multimodal integration harnesses the power of correlations across diverse data types, allowing AI to uncover complex patterns and deliver far more precise and reliable diagnoses than any single modality could provide.
One of the most promising advancements in this field is modality conversion, which enables AI to infer or simulate one form of clinical data from another—such as generating a detailed 3D heart model from a 2D image or reconstructing missing ECG signals. This capability not only enhances diagnostic accuracy but also holds tremendous potential for improving care in low-resource environments and ensuring continuity when data is incomplete.

Clinical data types can be seen as different “languages” describing the same underlying condition, and AI functions as an interpreter, bridging these languages while preserving their medical meaning. By synthesizing multiple uncertain pieces of information into a cohesive and accurate whole, this approach is transforming our understanding and diagnosis of disease, and it remains a central focus of our ongoing research.
How are modern AI architectures, like Transformers, transforming the way patients and clinicians access complex insights in cardiovascular care?
One of the most promising frontiers in cardiovascular AI is the ability to extract rich diagnostic insights and risk predictions from low-cost, widely available signals like ECGs. This has the potential to revolutionize traditional diagnostic pathways, making early detection more accessible and cost-effective across healthcare systems. Yet generating insights is only half the battle—translating these complex findings into clear, understandable language for patients of diverse backgrounds and health literacy levels is equally critical. Effective patient communication is key to transforming clinical data into meaningful information, empowering individuals to make informed, prevention-focused decisions rather than acting out of fear or confusion. Here, linguistic technologies and language models play a pivotal role by bridging the gap between AI output and public trust, converting raw risk scores into human terms that enhance comprehension and agency.
At the heart of these advances are Transformer architectures and their attention mechanisms, which are reshaping medical AI across multiple domains. Initially celebrated for their performance in language tasks, these models are equally adept at integrating multimodal data—from imaging and time-series lab results to patient histories. By processing diverse inputs in parallel, Transformers can uncover subtle, cross-disciplinary relationships spanning radiology, cardiology, and genomics, enabling a truly holistic approach to diagnosis and risk assessment.
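As a rough illustration of the attention mechanism described above — a generic sketch, not the models used in DVPS or at Heidelberg — the following shows how scaled dot-product attention lets tokens from different modalities (here, hypothetical imaging and ECG encoders) weigh one another in a shared embedding space:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: each query token computes a
    relevance score against every key token, so a token from one
    modality can attend to tokens from another."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)        # pairwise relevance scores
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
# Toy token sequence: imagine 4 tokens from an imaging encoder and
# 3 from an ECG encoder, all projected into a shared 8-dim space.
tokens = rng.normal(size=(7, 8))
output, weights = attention(tokens, tokens, tokens)
```

Because every token attends to every other token in parallel, the mechanism has no built-in preference for where its inputs came from — which is what makes it a natural fit for mixing imaging, time-series, and text-derived tokens in one model.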
FEDERATED LEARNING
Federated learning is a decentralized, collaborative paradigm for training machine learning models. Centered on the goal of privacy preservation, the core idea is to train one model across many clients without moving their respective data. Each client keeps its data locally while training and updating the model parameters on its end; only the parameter updates from these potentially heterogeneous clients are aggregated centrally, preventing any transfer of raw data. At the start of the process, a central coordinator sends a model to all participating clients. Over the course of learning, the coordinator combines the parameter updates it receives from the sites to update the global model, then broadcasts the updated copy for the next round of training. With this framework, a model can learn from diverse real-world populations while each data source respects its local privacy regulations.
Additional safeguards often used in this framework include cryptographic protocols that prevent anyone from viewing a single site’s update, and differential privacy, which adds calibrated noise to limit what can be inferred about any individual. Depending on the nature of the clients, a federated learning deployment may be cross-device (millions of intermittently connected phones, wearables, or IoT devices contributing small updates) or cross-silo (dozens of organizations with sizable datasets collaborating under shared governance). Examples of federated learning applications in the medical domain include training models on radiology scans, ECGs, and EEGs; outcome prediction from structured medical records such as EHRs; variant-effect prediction in genomics across cohorts that cannot share raw genetic data; and digital health models for personal well-being built from wearables data.
When trained on large, linked datasets, these models evolve into foundation models—versatile systems capable of tackling multiple complex tasks simultaneously, such as diagnosing a thickened heart muscle while predicting the likelihood of needing a pacemaker post-procedure. This is particularly exciting because fine-tuning foundation models for specific clinical questions requires far less data than their initial pre-training, thanks to the broad features they’ve already learned.
Ultimately, the transformative power lies in how modern AI architectures aren’t just advancing the technical integration of diverse clinical data—they’re also empowering clinicians and patients by translating that knowledge into accessible, trustworthy insights. This dual progress promises to democratize cardiovascular care and improve global outcomes. However, these breakthroughs—from multimodal integration to foundation models—depend on one critical factor: access to large, diverse, and representative clinical datasets. In medicine, data is often constrained by privacy regulations, institutional barriers, and national laws. That’s where federated learning comes in.
Federated learning has been central to your work. How would you explain what it is, and what potential does it have in an international setting?
Federated learning is a technique that allows AI models to be trained across multiple hospitals without sharing any sensitive patient data. Rather than transferring all data to a central location, the AI model travels to each institution, learns from the local data, and only sends back the learned patterns—never the raw data itself. These updates are combined to build a more accurate, robust model, while patient privacy remains fully protected.

This method holds great potential for international collaboration in healthcare. By keeping data within each country’s or hospital’s control, federated learning respects privacy laws like the European GDPR, and overcomes legal and technical barriers to data sharing. It enables researchers and clinicians across borders to work together on training AI systems, improving model accuracy and generalizability without compromising data security.
For instance, in our work with eight hospitals at the German Centre for Cardiovascular Research (DZHK), we successfully integrated diverse, partially annotated data across sites and pushed transformer AI models to function effectively in this complex setting. Looking ahead, scaling federated learning internationally could revolutionize how clinical AI is developed and deployed worldwide, offering safer, more inclusive, and more effective healthcare solutions.
If you had to explain in one sentence—to someone completely outside your field—what makes your work impactful, what would you say?
I would say that my group at Heidelberg University Hospital develops novel AI approaches that “listen” to the heart in ways doctors can’t, and then help explain what they hear to the people who need to understand it most. I basically work at the intersection of AI, medicine, and law to ensure that data access can be provided safely, fairly, and meaningfully across borders to gain life-saving insights.
