To date, this remains Russia’s only implementation of federated learning for medical ML models.

Federated learning of neural networks
We explain what federated learning is, how it enables training ML models on sensitive data, and what role cloud technology plays in making it possible.
Sechenov University (Russia’s Sechenov First Moscow State Medical University) is a strategic healthcare partner of Yandex Cloud for disease diagnostics.
ISP RAS is the Institute for System Programming of the Russian Academy of Sciences.
Federated learning is a method for training ML models where original data physically remains with its owner and is never shared with any external contractors. Compliant with the Russian Federal Law 152-FZ on personal data, this approach is particularly suitable for handling such sensitive information as medical records.
A key advantage is that you do not need to keep the data structure identical across sources. Different healthcare institutions may collect different patient data, yet the technology enables efficient model training even on such heterogeneous datasets.
Federated learning allows pooling resources from multiple organizations without centralized data exchange: the models train locally on each contributor’s data, regardless of format differences.
In an experiment by Yandex Cloud in collaboration with Sechenov University and ISP RAS, the method demonstrated its real-world viability by successfully detecting cardiac pathologies in ECG data. This partnership aims to validate federated learning’s efficiency in healthcare, where data privacy is paramount.
In this article, we share the key highlights of our experiment and show how federated learning helped address both data heterogeneity and data scarcity. We also discuss the results of applying this approach and explore its prospects for healthcare in the near future.
How federated learning works
Typically, when training AI models, you collect data from a wide range of sources, such as archives from different medical institutions, healthcare information systems, or public datasets. Then you send all this data to a single central server, where your model trains on it.
This traditional approach, however, poses serious privacy risks, especially in sensitive domains such as healthcare. These risks include data leaks and unauthorized access to patients' personal medical information.
Federated learning works differently. Instead of gathering all data in one place, your model trains locally on the data owner’s server, which acts as a client within a distributed learning system. The data remains with its owner, and only the updated model parameters are sent to the central federated server, where they are combined with updates from other data owners.
This approach preserves data privacy because raw data never leaves its owner’s system. Medical institutions can therefore collaborate and pool their resources without compromising patient privacy, which ultimately improves the performance of the AI models.

Federated learning architecture. To refine and enhance the overall model, the central federated server aggregates depersonalized model updates from all contributing clients.
Unlike traditional centralized systems, this central federated server does not process or store any source data. Instead, it coordinates the learning process by aggregating model parameters to improve the overall model, while ensuring data privacy.
Once the model parameters from all clients are combined, the updated version of the model is sent back to the clients, which continue training it on their local data.
By adapting to changes and unique features of each client, the model gets more accurate and efficient over time. The process repeats until it reaches the desired performance metrics, such as prediction accuracy and general stability.
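The train, aggregate, and redistribute cycle described above can be sketched in a few lines. This is a minimal illustration, not the setup used in the experiment: it assumes sample-count-weighted averaging (the FedAvg scheme), a toy one-parameter model, and synthetic data, and every function name here is hypothetical.

```python
import random

def local_train(weight, data, lr=0.01, epochs=5):
    """Client side: a few epochs of gradient descent on private data.

    The toy model is y = w * x. Only the updated weight and the sample
    count are returned; the data itself never leaves this function.
    """
    for _ in range(epochs):
        grad = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
        weight -= lr * grad
    return weight, len(data)

def aggregate(updates):
    """Server side: average the reported values, weighted by sample count."""
    total = sum(n for _, n in updates)
    return sum(value * n for value, n in updates) / total

def mean_squared_error(weight, data):
    return sum((weight * x - y) ** 2 for x, y in data) / len(data)

def federated_training(clients, target_loss=0.05, max_rounds=100):
    """Repeat rounds until the weighted loss reaches the target."""
    global_weight = 0.0
    for round_no in range(1, max_rounds + 1):
        updates = [local_train(global_weight, data) for data in clients]
        global_weight = aggregate(updates)
        # Clients report only their local loss, again without raw data.
        losses = [(mean_squared_error(global_weight, d), len(d)) for d in clients]
        loss = aggregate(losses)
        if loss <= target_loss:
            break
    return global_weight, round_no, loss

def make_client(n, seed):
    """Synthetic private dataset; the true weight is 2.0 (an assumption)."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 3) for _ in range(n)]
    return [(x, 2.0 * x + rng.gauss(0, 0.1)) for x in xs]

clients = [make_client(300, seed=1), make_client(170, seed=2)]
weight, rounds, loss = federated_training(clients)
print(f"w={weight:.3f} after {rounds} rounds, loss={loss:.4f}")
```

Weighting by sample count means a client with more records has more influence on the shared model; other aggregation rules are possible.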
The federated server sends tasks to its clients; each task includes the model code and configuration settings. Not every user can submit a task, since different roles carry different permissions. Once created, a task is validated on the server side and again on each client. Clients run only the tasks that pass these checks. After training, the client runs a further check to determine whether the results can be safely sent back to the server. These layered validation steps help ensure the security and integrity of the data throughout the process.

Federated learning architecture. Each federated server independently checks both task code and configuration to ensure they meet security requirements and comply with established standards.
Records: individual data about objects, such as information on each specific patient.
Features: properties or attributes that describe these records, such as age or lab results.
IMPRUV IT: a certified Yandex Cloud partner.
NVFlare: a framework that enables model training within a distributed system.
Once a task passes validation, the model begins training on the local data; if validation fails, the task is rejected. This prevents data leaks and ensures that learning happens in a secure and trusted environment.
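The layered checks described above might look roughly like the sketch below. The article does not describe the actual rules, so the roles, the configuration whitelist, and the naive substring scan are all hypothetical stand-ins chosen only to show the order of the gates.

```python
from dataclasses import dataclass, field

# Hypothetical roles and rules, for illustration only.
ROLES_ALLOWED_TO_SUBMIT = {"lead_researcher", "project_admin"}
ALLOWED_CONFIG_KEYS = {"epochs", "learning_rate", "batch_size"}
FORBIDDEN_CODE_PATTERNS = ("open(", "socket", "subprocess")

@dataclass
class Task:
    author_role: str
    code: str
    config: dict = field(default_factory=dict)

def server_validate(task):
    """Gate 1 (server): role permission plus configuration whitelist."""
    if task.author_role not in ROLES_ALLOWED_TO_SUBMIT:
        return False
    return set(task.config) <= ALLOWED_CONFIG_KEYS

def client_validate(task):
    """Gate 2 (client): each data owner re-checks the code on its own."""
    return not any(p in task.code for p in FORBIDDEN_CODE_PATTERNS)

def result_is_safe(result, max_values=10_000):
    """Gate 3 (client): only a bounded list of floats may leave the client."""
    return len(result) <= max_values and all(isinstance(v, float) for v in result)

def client_run(task, train_fn):
    """Return parameters for the server, or None if a client check fails."""
    if not client_validate(task):
        return None  # rejected: the task never touches local data
    result = train_fn(task.config)
    return result if result_is_safe(result) else None

def submit(task, client_train_fns):
    """The server validates first, then every client validates independently."""
    if not server_validate(task):
        return {}
    return {name: client_run(task, fn) for name, fn in client_train_fns.items()}

train = lambda cfg: [0.1, 0.2]  # stand-in for real local training
good = Task("lead_researcher", "model.fit(x, y)", {"epochs": 3})
bad = Task("lead_researcher", "import socket", {"epochs": 3})
print(submit(good, {"clinic_a": train}))
print(submit(bad, {"clinic_a": train}))
```

A real deployment would rely on code signing, sandboxing, or static analysis rather than substring matching; the point here is only that a task must clear every gate before any result is returned.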
Federated learning types
Federated learning comes in three types: horizontal, vertical, and transfer.
| Federated learning type | Description | Example |
|---|---|---|
| Horizontal FL | A model trains on data from multiple sources that have different records but share the same features. | Different hospitals (sources) may store information about various patients (records) of different genders and ages who show similar symptoms (features). |
| Vertical FL | A model trains on data from multiple sources that share the same records but provide different features. | An employee uses private medical insurance to visit multiple clinics and receive different healthcare services. Each clinic collects its own set of data about that person, such as lab results or screening records, leading to diverse feature sets. |
| Transfer FL | A model trains on data from multiple sources where both the records and features partially overlap. | A patient might visit both public clinics and private medical centers. Public facilities may have basic check-up data, while private ones offer specialized service records. The model uses the overlapping parts of this information for efficient training. |
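The three types differ only in how records and features overlap across sources, which can be expressed as a small check. This is an illustrative sketch: the function name, the source names, and the sample IDs are hypothetical, and real systems match records with privacy-preserving techniques rather than by comparing raw ID sets.

```python
def classify_fl_setup(sources):
    """Classify a federated setup by how record IDs and features overlap.

    `sources` maps a source name to a (record_ids, feature_names) pair of sets.
    """
    record_sets = [records for records, _ in sources.values()]
    feature_sets = [features for _, features in sources.values()]

    same_records = all(r == record_sets[0] for r in record_sets)
    same_features = all(f == feature_sets[0] for f in feature_sets)
    shared_records = set.intersection(*record_sets)
    shared_features = set.intersection(*feature_sets)

    if same_features and not same_records:
        return "horizontal"  # different patients, same attributes
    if same_records and not same_features:
        return "vertical"    # same patients, different attributes
    if shared_records and shared_features and not (same_records or same_features):
        return "transfer"    # partial overlap on both axes
    return "unclassified"

hospitals = {
    "hospital_a": ({1, 2, 3}, {"age", "ecg"}),
    "hospital_b": ({4, 5}, {"age", "ecg"}),
}
clinics = {
    "clinic_a": ({1, 2}, {"lab_results"}),
    "clinic_b": ({1, 2}, {"screening"}),
}
mixed = {
    "public": ({1, 2, 3}, {"age", "checkup"}),
    "private": ({2, 3, 4}, {"age", "mri"}),
}
for name, setup in [("hospitals", hospitals), ("clinics", clinics), ("mixed", mixed)]:
    print(name, "->", classify_fl_setup(setup))
```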
In a federated learning setup, the model trains directly on the data owner’s infrastructure, while the costs are typically covered by the customer or organization commissioning the project. If a team lacks servers equipped with graphics processing units (GPUs), which significantly accelerate the training of complex models such as neural networks, it can rent the required infrastructure from a cloud provider. A hybrid approach is also possible, where some data is stored and processed locally and the rest is handled in the cloud.
Cloud providers offer per-second billing for computing resources, allowing developers to pay only for what they actually use and avoid unnecessary expenses.
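To see why per-second billing matters for short training runs, compare it with billing rounded up to whole hours. The hourly rate and the run length below are made-up numbers for illustration, not actual prices of any provider.

```python
def per_second_cost(seconds_used, hourly_rate):
    """Pay for exactly the seconds consumed."""
    return seconds_used * hourly_rate / 3600

def hourly_rounded_cost(seconds_used, hourly_rate):
    """Pay for every started hour (the comparison baseline)."""
    hours = -(-seconds_used // 3600)  # ceiling division
    return hours * hourly_rate

GPU_HOURLY_RATE = 3.0      # hypothetical price per GPU hour
run_seconds = 75 * 60      # a 1 h 15 min training run

exact = per_second_cost(run_seconds, GPU_HOURLY_RATE)
rounded = hourly_rounded_cost(run_seconds, GPU_HOURLY_RATE)
print(f"per-second: {exact:.2f}, hourly-rounded: {rounded:.2f}")
```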
Training an ML model on medical data: a case by Yandex Cloud, ISP RAS, and Sechenov University
The experiment aimed to train an ML model to recognize cardiac pathologies from ECG data using the NVFlare framework.
The project ran on a hybrid infrastructure: part of the resources was deployed in ISP RAS’s own data center, while Sechenov University, lacking sufficient in-house computing power, relied on Yandex Cloud.
Software engineers from IMPRUV IT developed a setup to deploy the framework in the cloud and configure it with a single click, ensuring that computing resources were used only during training, and not wasted otherwise.

The flowchart below shows the main components of the system used in the experiment. The FL server and Sechenov University’s FL client are hosted within the Yandex Cloud infrastructure, while ISP RAS deployed its FL client on its on-premises infrastructure. Additionally, the framework employs a special client, Admin Console, for cluster administration and training management. The FL server coordinates all FL clients, and all their connections use gRPC; each client accesses its own private dataset via S3. Furthermore, the system uses an S3 bucket containing one-time keys to securely distribute configurations to the FL clients.
Twelve-lead ECG: the standard form of electrocardiography, which records 12 leads from 10 electrodes placed on the patient’s limbs and chest. Each lead captures the heart’s electrical activity from a different angle, giving healthcare professionals a more complete and accurate picture of how the heart is functioning. This type of ECG is particularly useful for detecting a wide range of pathologies, including arrhythmia and ischemia.
Sensitivity: the model’s ability to correctly identify cases of pathology.
Specificity: the model’s ability to avoid false positives when there is no disease to detect.
Armed with this approach, the team successfully tested the infrastructure using low-cost VMs, only bringing in expensive GPU resources when actually needed for model training. This significantly reduced overall computing costs.
The experiment involved two datasets: a publicly available open dataset and a proprietary dataset provided by Sechenov University. Both datasets were labeled in a unified format.
ISP RAS adapted its existing ECG classifier so that it could function within the federated learning framework, and then initiated the training process. The training dataset consisted of 47,000 twelve-lead ECG recordings: 30,000 from ISP RAS and 17,000 from Sechenov University.
During the experiment, engineers trained the model to diagnose atrial fibrillation, achieving a sensitivity of 99% and a specificity of 95%. To verify the accuracy and clinical applicability of the model, its results were reviewed by three functional diagnostics physicians, each with over a decade of experience.
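Sensitivity and specificity are both derived from the counts of correct and incorrect predictions. The sketch below shows the standard definitions on a tiny made-up label set; the labels are illustrative only and are unrelated to the experiment's data.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = pathology)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp, fn, tn, fp

def sensitivity(tp, fn):
    """Share of actual pathology cases that the model detects."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Share of healthy cases that the model correctly leaves unflagged."""
    return tn / (tn + fp)

# Made-up labels: 4 recordings with pathology, 6 without.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

tp, fn, tn, fp = confusion_counts(y_true, y_pred)
print(f"sensitivity={sensitivity(tp, fn):.2f}, specificity={specificity(tn, fp):.2f}")
```

In screening, high sensitivity limits missed cases, while high specificity limits false alarms that physicians would otherwise have to review.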
Prospects of federated learning in modern healthcare
Today, federated learning is seen globally as a promising approach to AI development in domains involving sensitive data. Healthcare has emerged as one of the most interested sectors, accounting for 36% of the market. This high level of interest stems from the technology’s potential in areas such as medical image analysis and drug development. Federated learning can also be applied to train predictive models on patient electronic medical records (EMRs), and even to develop large language models (LLMs) based on medical data, expanding AI-driven healthcare innovation.
Cardiovascular diseases: primarily, rhythm disorders.
HR: heart rate.
“Neural networks have not yet achieved complete perfection in medical diagnostics, and that is not even necessary. After all, the goal is to use AI as a tool to assist doctors, not replace them. Clinical decision support systems (CDSS’s) take over tasks that can be automated, especially routine procedures that consume time and resources. However, the final diagnosis remains in the hands of the physician, who can always manually review key indicators, account for factors the system may have missed, and form a final, well-informed opinion.”
The Institute of Personalized Cardiology at Sechenov University, together with researchers from ISP RAS and the Institute for Design-Technological Informatics of the Russian Academy of Sciences (IDTI RAS), is developing devices that integrate with CDSS’s to aid in remote heart monitoring.
These devices detect cardiovascular diseases and help process ECG data, as well as HR and pulse wave measurements. The model assists doctors in interpreting the collected data and provides a preliminary diagnosis; if needed, the doctor can share the results with colleagues for a second opinion.
“At ISP RAS, developing federated learning algorithms is part of our broader effort to build trusted AI systems. With support from the Ministry of Science and Higher Education of Russia, we have launched a youth lab this year, focused on training large-scale models for healthcare-related tasks. We also prioritize data protection and preventing threats that could compromise the quality of learning. This research has significant scientific and practical value, confirmed by our successful collaboration with Sechenov University and Yandex Cloud.”
Federated learning is not limited to healthcare: it offers extensive prospects in other domains as well. For example, in finance, it can be used for fraud detection and credit risk assessment. Importantly, customer data remains secure, which is crucial when handling sensitive information.
Throughout the project to train an ML model on medical data, the team was supported by Yandex Social Tech.
In projects related to science and education, healthcare, the environment, and culture, Yandex Cloud acts as a technology partner. In particular, it assesses project feasibility, develops the IT architecture, provides free access to its technology stack and expert advice, and offers marketing and PR support. To apply for a partnership with Yandex Cloud, visit the Yandex Social Tech website.