Open problems in medical federated learning

DOI: https://doi.org/10.1108/IJWIS-04-2022-0080
Published date: 20 September 2022
Date: 20 September 2022
Pages: 77-99
Subject Matter: Information & knowledge management, Information & communications technology, Information systems, Library & information science, Information behaviour & retrieval, Metadata, Internet
Author: Joo Hun Yoo, Hyejun Jeong, Jaehyeok Lee, Tai-Myoung Chung
Joo Hun Yoo
Department of Artificial Intelligence, College of Computing and Informatics,
Sungkyunkwan University, Suwon, Republic of Korea, and
Hyejun Jeong, Jaehyeok Lee and Tai-Myoung Chung
Department of Computer Science and Engineering, College of Computing and
Informatics, Sungkyunkwan University, Suwon, Republic of Korea
Abstract
Purpose: This study aims to summarize the critical issues in medical federated learning and
applicable solutions. Also, detailed explanations of how federated learning techniques can be
applied to the medical field are presented. About 80 reference studies described in the field were
reviewed, and the federated learning framework currently being developed by the research team is
provided. This paper will help researchers to build an actual medical federated learning
environment.
Design/methodology/approach: Since machine learning techniques emerged, more efficient analysis
has been possible with a large amount of data. However, data regulations have been tightened worldwide, and the
usage of centralized machine learning methods has become almost infeasible. Federated learning techniques
have been introduced as a solution. Even with its powerful structural advantages, there still exist unsolved
challenges in federated learning in a real medical data environment. This paper aims to summarize those by
category and presents possible solutions.
Findings: This paper provides four critical categorized issues to be aware of when applying the federated
learning technique to the actual medical data environment, then provides general guidelines for building a
federated learning environment as a solution.
Originality/value: Existing studies have dealt with issues such as heterogeneity problems in the
federated learning environment itself, but they lacked discussion of how these issues cause problems in actual
working tasks. Therefore, this paper helps researchers understand the federated learning issues through
examples of actual medical machine learning environments.
Keywords: Heterogeneity, Data security, Data privacy, Federated learning, Incentive mechanism,
Medical application
Paper type: Research paper
© Joo Hun Yoo, Hyejun Jeong, Jaehyeok Lee and Tai-Myoung Chung. Published by Emerald
Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0)
licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for
both commercial and non-commercial purposes), subject to full attribution to the original publication
and authors. The full terms of this licence may be seen at
http://creativecommons.org/licences/by/4.0/legalcode
This research is supported by Institute of Information & Communications Technology Planning &
Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2019-0-00421, AI Graduate School
Support Program (Sungkyunkwan University)) and funded by Institute of Information & Communications
Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2020-0-00990,
Platform Development and Proof High Trust & Low Latency Processing for Heterogeneous·Atypical·Large
Scaled Data in 5G-IoT Environment).
Received 15 April 2022
Revised 14 June 2022
Accepted 14 June 2022
International Journal of Web
Information Systems
Vol. 18 No. 2/3, 2022
pp. 77-99
Emerald Publishing Limited
1744-0084
DOI 10.1108/IJWIS-04-2022-0080
The current issue and full text archive of this journal is available on Emerald Insight at:
https://www.emerald.com/insight/1744-0084.htm
1. Introduction
Machine learning has been widely studied in various research fields for its powerful
performance in data analysis. It was possible to derive better results through machine learning
methods by learning the hidden multi-dimensional characteristics of given data that were
difficult for humans to distinguish. This structure of machine learning has been very helpful in
the medical imaging field, where it is crucial to capture fine features in images, in
strengthening the existing diagnostic approaches. For example, support vector machines, deep
neural networks, convolutions and clustering techniques have been applied in the medical field
to effectively search for those human-unidentifiable correlations in medical data.
Through the active use of machine learning approaches, the medical field was able to expand
its scope to specific medical fields such as radiology, pathology, neuroscience, genetics and even
mental disorders. However, the biggest issue in the field of medical artificial intelligence (AI) is
not the accuracy of diagnosis, but the protection of patients' personal information.
Federated learning, a machine learning algorithm based on the distributed data
environment, has emerged under stricter data regulation laws around the world. When the
concept of federated learning was first introduced, data privacy regulations such as the EU's
General Data Protection Regulation, California's (CA's) Privacy Rights Act and China's
Personal Information Protection were representative rules, but now more countries around the
world are implementing efficient regulations, such as Brazil's Lei Geral de Proteção de Dados,
Canada's Digital Charter Implementation and Singapore's Personal Data Protection Act, to
protect their citizens' personal information. Thus, centralized machine learning methods, which
collect a sufficient amount of data and learn from it, are no longer applicable under the
personal data protection regulations. In particular, for medical data, researchers and business
providers should follow the Health Insurance Portability and Accountability Act (HIPAA),
which comprehensively protects the medical records and individually identifiable health
information of patients and medical information providers. With the increasing number of data
regulations, researchers have applied various solutions to prevent invasion of privacy.
First, the most common solution to data privacy issues is to process the variables
that can identify individual users when collecting and importing their data. Primary
information leakage can be prevented through measures such as secure aggregation,
pseudonymization, data reduction, data suppression and data masking in normal data
environments. However, these security methods are difficult to apply to personal health
information (PHI), as it contains information in any format that can identify the data owner. PHI is a
wider concept than personal identifiable information (PII), which means sensitive information
such as health insurance records, medical numbers, health status, medical images and
mental health records is included on top of basic user variables. Therefore, data security
methods alone can hardly protect PHI completely; to use medical data
efficiently and safely, structural solutions are required that allow learning without privacy violations.
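As a rough illustration of the record-level measures named above, the following Python sketch applies suppression, pseudonymization and masking to one record. It is a minimal sketch under stated assumptions, not a method taken from the paper: the record fields, the key `SECRET_KEY` and the helper names `pseudonymize`, `mask` and `suppress` are all hypothetical, and HMAC-SHA256 is just one possible choice of keyed hash.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be stored apart from the data.
SECRET_KEY = b"example-key-not-for-production"


def pseudonymize(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def mask(value: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters of a value."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def suppress(record: dict, fields: set) -> dict:
    """Return a copy of the record without the listed fields."""
    return {k: v for k, v in record.items() if k not in fields}


# Hypothetical patient record, for illustration only.
record = {
    "patient_name": "Jane Doe",
    "insurance_number": "1234567890",
    "age": 47,
    "diagnosis": "hypertension",
}

protected = suppress(record, {"patient_name"})
protected["patient_id"] = pseudonymize(record["patient_name"])
protected["insurance_number"] = mask(record["insurance_number"])
```

Note that `age` and `diagnosis` pass through unchanged in this sketch. Such remaining health variables are exactly the kind of content that can still point to a data owner, which is why these measures are hard to apply fully to PHI.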
Federated learning is a structural solution for the existing data privacy violation
problems of machine learning methods. It has a unique structure and characteristics
compared to centralized machine learning. Traditional machine learning approaches require
a large volume of training data collected from local data owners to the server for model
generation. Federated learning, a decentralized learning structure, generates and develops
deep neural network models without local data collection to the server. The core concepts
used in the neural network model learning process are as below.
Individual clients' data stored in the local environment does not move, and the server
generates the initial training model and delivers it to each participating client. The transferred
initial training model goes through a model update process through learning within each
client's data environment, and the server collects all of the corresponding results to create a
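
The round structure described above (the server creates an initial model, each client updates it on data that never leaves the client, and the server collects the results) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions rather than the authors' framework: the one-dimensional linear model, the synthetic client data and the helper names `local_update`, `aggregate` and `make_client_data` are hypothetical, and the data-size-weighted average used to combine the client results is a FedAvg-style choice that the text itself does not specify.

```python
import random


def local_update(weights, data, lr=0.01, epochs=5):
    """Train a copy of the model on one client's private data (plain SGD)."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return (w, b)


def aggregate(updates, sizes):
    """Combine client models by a data-size-weighted average (FedAvg-style)."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)


def make_client_data(n, rng):
    """Hypothetical private dataset following y = 2x + 1 plus small noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.05)))
    return data


rng = random.Random(0)
# Each list stands for data held by one client; it is never sent to the server.
clients = [make_client_data(n, rng) for n in (30, 50, 20)]

global_model = (0.0, 0.0)  # the server creates the initial model
for _ in range(50):        # communication rounds
    # Only model parameters travel between the server and the clients.
    updates = [local_update(global_model, data) for data in clients]
    global_model = aggregate(updates, [len(d) for d in clients])
```

In this sketch only the pair of model parameters crosses the client boundary in each round, which is the structural property that distinguishes the approach from centralized training.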