Inference attacks based on GAN in federated learning
DOI: https://doi.org/10.1108/IJWIS-04-2022-0078
Published: 30 August 2022
Pages: 117-136
Subjects: Information & knowledge management, Information & communications technology, Information systems, Library & information science, Information behaviour & retrieval, Metadata, Internet
Authors: Trung Ha, Tran Khanh Dang
Inference attacks based on GAN in federated learning
Trung Ha
Department of Information System, University of Information Technology,
Vietnam National University, Ho Chi Minh City, Vietnam, and
Tran Khanh Dang
Faculty of Information Technology, Ho Chi Minh City University of Food Industry,
Ho Chi Minh City, Vietnam
Abstract
Purpose – In the digital age, organizations want to build more powerful machine learning models that can serve the increasing needs of people. However, enhancing privacy and data security is one of the challenges for machine learning models, especially in federated learning. Parties want to collaborate with each other to build a better model, but they do not want to reveal their own data. This study aims to introduce threats and defenses against privacy leaks in the collaborative learning model.
Design/methodology/approach – In the collaborative model, the attacker is either the central server or a participant. In this study, the attacker is on the side of a participant, who is honest but curious. Attack experiments run on the participant's side, which performs two tasks: the first is to train the collaborative learning model; the second is to build a generative adversarial network (GAN) model, which performs the attack to infer more information from the parameters received from the central server. There are three typical types of attack: white box, black box without auxiliary information and black box with auxiliary information. The experimental environment is set up with PyTorch on the Google Colab platform, running on a graphics processing unit, with the Labeled Faces in the Wild and Canadian Institute For Advanced Research-10 data sets.
Findings – The paper assumes that the privacy leakage attack resides on the participant's side, and that the parameters held by the central server contain much of the knowledge used to train the collaborative machine learning model. This study compares the success level of inference attacks from model parameters based on GAN models. Three GAN models are used in this method: conditional GAN, control GAN and Wasserstein generative adversarial network (WGAN). Of these three models, the WGAN has proven to obtain the highest stability.
Originality/value – Concern about privacy and security for machine learning models is growing in importance, especially for collaborative learning. The paper contributes experimentally to privacy attacks on the participant's side in the collaborative learning model.
Keywords Federated learning, Privacy, GAN, Threat, Defense
Paper type Research paper
1. Introduction
Large amounts of data are produced every day by computing devices that have become more and more popular. Collecting data into centralized storage facilities is expensive and time-consuming. Another important concern is data privacy and user security, as the data often contain sensitive information (Abadi et al., 2016). Sensitive data such as facial images, health information or location-based services could be targeted for advertising or recommendation on social networks, posing immediate or latent privacy risks. Therefore, private data should not be shared directly without any privacy considerations. As society has become more and more aware of the protection of privacy, legal restrictions such as the General Data Protection Regulation (GDPR) were put in place that made data aggregation activities less viable (Yang et al., 2019a, 2019b).

This research is funded by the University of Information Technology, Vietnam National University Ho Chi Minh City, under grant number D1202210.

International Journal of Web Information Systems, Vol. 18 No. 2/3, 2022, pp. 117-136. Received 14 April 2022; revised 20 June 2022; accepted 29 July 2022. © Emerald Publishing Limited, ISSN 1744-0084.
Machine learning models face two major challenges (Li et al., 2018). First, in most industries, data exists in separate, isolated silos. The second is enhancing privacy and data security. One possible solution to these challenges is secure federated learning. In addition to the federated learning framework first proposed by Google in 2016, a complete secure federated learning framework was introduced by Qiang Yang, including horizontal federated learning, vertical federated learning and federated transfer learning.
At the same time, with increasing awareness of large companies infringing on data security and user privacy, data privacy and security have become a major issue worldwide.
Data leaks are causing great concern in society and among governments. In response to data security and user privacy concerns, some countries are stepping up legislation to protect data security and privacy. The GDPR (Yang et al., 2019a, 2019b), enforced by the European Union since 25 May 2018, is intended to protect privacy and data security. It also requires businesses to use clear and transparent wording in their user agreements and grants users a "right to be forgotten", meaning that users can have their personal data deleted or withdrawn. Companies that violate the regulation must pay the prescribed penalties. Similar privacy and security regulations are being enacted in the USA and China. For example, China's Cybersecurity Law and the General Principles of Civil Law, enacted in 2017, require internet businesses not to leak or forge the personal information they collect; when conducting data transactions with third parties, they need to ensure that the proposed contracts are subject to legal data protection obligations (Yang et al., 2019a, 2019b). Establishing these regulations will obviously help build a more civilized society, but it will also introduce new challenges to the data transaction procedures commonly used in machine learning.
This article gives an overview of a new approach, called federated learning, as a possible solution to these challenges, namely, the leakage of data privacy in the centralized learning model and compliance with the regulations and laws of the local countries. This study presents the definition and classification of federated learning, along with threats and defenses concerning privacy leaks in federated learning. Finally, it demonstrates an attack experiment using a generative adversarial network (GAN) in federated learning.
2. Background of federated learning
2.1 Federated learning
Consider N data owners {F_1, ..., F_N}, all of whom want to train a machine learning model by merging their data {D_1, ..., D_N}. A traditional method is to gather all the data together and use D = D_1 ∪ ... ∪ D_N to train a model M_SUM. A federated learning system is a training process in which the data owners cooperatively train a model M_FED that processes data belonging to any owner F_i while not exposing the data D_i to the others. In addition, the accuracy of M_FED, denoted V_FED, should be close to that of the traditional learning model M_SUM, denoted V_SUM. Formally, supposing δ is a nonnegative real number, if (Yang et al., 2019a, 2019b):

|V_FED − V_SUM| < δ  (1)

then we say that the federated learning algorithm has δ-accuracy loss.
