Ten Questions for Future Regulation of Big Data: A Comparative and Empirical Legal Study

Authors: Bart van der Sloot and Sascha van Schendel

by Bart van der Sloot and Sascha van Schendel*
Keywords: Big Data; Empirical; Comparative; Survey; Data Protection Authorities; Comparative Legal Research
Abstract: Much has been written about Big
Data from a technical, economical, juridical and ethical
perspective. Still, very little empirical and comparative
data is available on how Big Data is approached
and regulated in Europe and beyond. This contribution
makes a first effort to fill that gap by presenting
the reactions to a survey on Big Data from the
Data Protection Authorities of fourteen European
countries and a comparative legal research of eleven
countries. This contribution presents those results,
addressing 10 challenges for the regulation of Big Data.
A. Introduction
Big Data is a buzzword used frequently in both
the private and the public sector, the press, and
online media. Large amounts of money are being
invested to make companies Big Data-proof, and
governmental institutions are eager to experiment
with Big Data applications in the elds of crime
prevention, intelligence, and fraud, to name but a
few areas. Though the exact nature and delineation
of Big Data is still unclear, it seems likely that Big
Data will have an enormous impact on our daily lives.
Much of that impact will undoubtedly be positive, but
Big Data applications also carry inherent risks, as they
might result in discrimination, privacy violations, and chilling
effects. The ideal situation would be to have an
adequate framework in place that will ensure that
the beneficial uses of Big Data are promoted and
facilitated, while the negative effects are mitigated
or sanctioned. This contribution provides building
blocks for developing such a framework, by giving an
overview of the experience in the use and regulation
of Big Data in 23 countries, focusing in particular on the
use of Big Data by governments.
The research presented in this article was conducted
in two phases. The rst phase involved desk research
and looked at Big Data policies, legislation and
regulation in a number of countries. Second, a
questionnaire was sent to several European DPAs.
The desk research examined eleven countries. These
countries were selected on the basis of three criteria.
The rst was global coverage – the research sought
to be as representative as possible to provide a full
picture of global developments in relation to Big Data,
which is by nature an international phenomenon.
Therefore, at least one country from each continent
(with the exception of Antarctica) was examined. The
second criterion was an estimation of the potential
value of the expected outcomes of the research –
some countries are more innovative and ambitious
than others in terms of technological developments
such as Big Data. Thirdly, the role a country plays
in international politics was taken into account;
on that basis, China rather than South Korea was
studied, even though the latter country is often at
the forefront of technological developments. Based
on these three criteria, Australia, Brazil, China,
France, Germany, India, Israel, Japan, South Africa,
the United Kingdom and the United States were
selected. The desk research focused on two issues in
particular. First, government policy decisions were
analyzed, as were initiatives related to this topic,
such as governments using Big Data themselves or
stimulating the use of Big Data in the private sector,
either through nancial support or by engaging in
partnerships. Second, research was carried out on
legislation and case law revolving around Big Data in
the selected countries. It should, again, be noted that
this study is not exhaustive – there is, undoubtedly, a
myriad of relevant laws, court cases and DPA reports
that are not discussed here.
In studying the eleven countries, almost exclusive use
was made of ofcial sources, especially government
websites. The reason for this is that it is often difcult
to establish the reliability of foreign sources. This
choice does, however, imply that this article mainly
presents a picture of the governmental view of Big
Data and of governmental regulation. Criticism of
those initiatives and autonomous processes in the
private sector remain largely undiscussed. This bias
was accepted as a tradeoff in order to guarantee the
reliability of the sources studied. When discussing
Israel, however, use was made of online newspaper
articles from Israeli news sources and a published
online interview, because these provided vital
information and because the news sources were
regarded as reliable. The information from these
sources was not available on government websites,
but was nonetheless considered essential.
Publications on government websites and in press
releases about new initiatives were selected by
using terms related to Big Data, both in the official
language of the country concerned and in English,
such as ‘data mining’, ‘data analytics’, ‘data projects’,
‘Big Data initiatives’, etc. Several countries have a
Ministry of Science and Technology, or a similar
ministry. Those ministries were taken as the starting
point of the research in those countries. General
search engines were also used to scan government
initiatives related to Big Data, by limiting the
search to the national public domain of the country
concerned. For case law and legislation, the official
national search engines and general search engines
were used. The search terms entered here were
related to Big Data, privacy and data protection,
such as ‘data protection’, ‘privacy’, ‘surveillance’, etc.
This process yielded a list of government initiatives,
legislation and relevant jurisprudence. The sources
consulted and the full list of references used for
this article are listed in a published working paper.1
The results of the comparative desk research can be
found in Appendix I and the results of the survey in
Appendix II to this contribution. It has to be stressed
that not all governments and governmental agencies
use the term Big Data when creating, operating on,
or using large-scale databases. That is why this study
primarily covers those initiatives that the government
itself has labelled as Big Data, or has described
using terms that are related to it. This
means that many uses of large-scale databases by
governmental agencies are not included in this
study. When analyzing the countries, six questions
were kept in mind: ‘Is a specific definition of Big Data
used?’, ‘Is Big Data used within the government?’, ‘Is
there a public-private partnership?’, ‘To what goal
is Big Data used by the government?’, ‘Which laws
are especially relevant for Big Data?’ and ‘Are there
judicial decisions relating to Big Data?’
A relatively short and simple questionnaire was
designed for the survey, so as to increase the
potential response of the DPAs. The accompanying
email, as well as the introduction to the survey,
briey explained the goal of the survey. The survey
comprised six questions: 1. Are you familiar with
the debate on Big Data? If so, how would you dene
Big Data? (max. 500 words) 2. Are there prominent
examples of the use of Big Data in your country,
especially in the law enforcement sector, by the
police or by intelligence services? (max. 500 words)
3. Have you issued any decisions/reports/opinions
on the use of Big Data? If so, could you provide us
with a reference and your main argument? (max.
500 words) 4. Are there any legal cases/judgements
by a court with regard to (privacy/data protection)
violations following from Big Data practices in your
country? If so, could you provide us with a reference
and the main consideration of the court? (max.
500 words) 5. Which legal regimes are applied to
Big Data/ is there a special regime for Big Data in
your country? Are there any discussions/plans in
parliament to introduce new legislation to regulate
Big Data practices? (max. 500 words) 6. Are there any
nal remarks you want to make/suggestions you
have for further research? (max. 500 words)
The reason for choosing these questions for the desk
research and the survey is that the background of
this study is a project by the Netherlands Scientific
Council for Government Policy (WRR). The WRR
is an independent advisory body for the Dutch
government. The task of the WRR is to advise the
government on issues that are of great importance
for society in the intermediate and longer term. The
reports of the WRR are not tied to one policy sector
but rather touch on various terrains and policy
sectors; they are concerned with the direction of
government policy for the longer term. The members
of the WRR are established university professors
who have often worked on policy-related subjects
and/or have gained experience in public administration
themselves. The Dutch government had requested
the WRR to advise on the regulation of Big Data,
taking into account how privacy and security
should be assessed in the deployment of Big Data
analytics in security-related policies. Questions that
were suggested to be addressed include whether
a distinction needs to be made between access to
and use of data, how transparency and individual
rights can be guaranteed in Big Data practices and
what the likely impact of the emergence of quantum
computing will be. In addition to the policy advice,
published in the form of a report for the Dutch
government,2 a scientific book was delivered3 and
a number of working papers were written to do
indicative research,4 which were used as building
blocks for the report to the government. This article
is based on one of those working papers.5
1 <http://www.wrr.nl/leadmin/en/publicaties/PDF-
Legal_Study_on_Big_Data.pdf>. The literature studied
for this article can be found here. <http://www.wrr.nl/
2 <http://www.wrr.nl/leadmin/nl/publicaties/PDF-
3 <http://www.wrr.nl/leadmin/en/publicaties/
4 <http://www.wrr.nl/publicaties/working-papers/>.
5 <http://www.wrr.nl/leadmin/en/publicaties/PDF-
The DPAs in all 28 EU Member States were emailed
with a request to complete the survey. Requests were
also sent to the DPAs in three non-EU countries,
namely Norway, Serbia and Switzerland, because a
short preliminary study had shown that they might
have specific expertise in relation to Big Data. DPAs
that did not respond within the period specified in
the initial request were sent a reminder; those that
did not respond to this reminder either were sent a final
reminder. In most cases, the questionnaire was sent
to the general contact address posted on the DPA’s
website. However, since the French website lists
no general email address, personal contacts were
used to email two specific employees of the CNIL.
For three other DPAs (Germany, the Netherlands
and Norway), in addition to an email to the general
email address, an email was also sent to a specific
individual employee. For other DPAs, either no such
personal contacts existed or they existed but it was
not necessary to use them because a response had
been received. Eventually, of the 31 DPAs included in
the survey, 18 responded: Austria, Belgium, Croatia,
Denmark, Estonia, Finland, France, Hungary, Ireland,
Latvia, Lithuania, Luxembourg, the Netherlands,
Norway, Slovakia, Slovenia, Sweden and the United
Kingdom. Four of these (Austria, Denmark, Finland
and Ireland) were negative responses, stating that
the DPA in question would not participate in the
study. Consequently, 14 of the 31 DPAs invited
to join the survey, slightly under half, actually took part. The
results found in this study can, therefore, not be seen
as determinative but as indicative of possible trends,
feelings and attitudes towards Big Data. It should be
taken into account that those DPAs that have already
dealt with Big Data projects would be more likely to
respond to such a survey than those that have not.
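The tallies reported above can be checked with a few lines of Python. This is only an illustrative sketch: the variable names are ours, the country lists are copied from the text, and treating the four negative replies as non-participation is our reading of the description rather than a definition given by the study.

```python
# Illustrative tally of the survey response; all names here are hypothetical.
invited = 31  # 28 EU Member States plus Norway, Serbia and Switzerland

# DPAs that replied at all, as listed in the text.
responded = {
    "Austria", "Belgium", "Croatia", "Denmark", "Estonia", "Finland",
    "France", "Hungary", "Ireland", "Latvia", "Lithuania", "Luxembourg",
    "Netherlands", "Norway", "Slovakia", "Slovenia", "Sweden",
    "United Kingdom",
}
# Replies stating that the DPA would not participate.
declined = {"Austria", "Denmark", "Finland", "Ireland"}

# Countries covered by the desk research.
desk_research = {
    "Australia", "Brazil", "China", "France", "Germany", "India",
    "Israel", "Japan", "South Africa", "United Kingdom", "United States",
}

# Assumption: a DPA "participated" if it replied and did not decline.
participating = responded - declined
all_countries = desk_research | participating

response_rate = len(responded) / invited
participation_rate = len(participating) / invited

print(len(responded), len(participating), len(all_countries))
print(f"{response_rate:.0%} replied, {participation_rate:.0%} participated")
```

On this reading, the 23 countries mentioned in the introduction are the union of the eleven desk-research countries and the fourteen participating DPAs, with France and the United Kingdom appearing in both groups.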
Rather than presenting the bare facts, listing the
regulatory initiatives in the various countries studied
and the answers from the DPAs, this article uses the
insights gained from those results to shed light on
some of the most difficult questions regulators have
to answer when deciding on future regulation of
Big Data. These questions are partly based on those
asked in the survey and partly follow from the desk
research. Additional questions have been added in
order to present the most interesting findings from
both the desk research and the survey in an orderly
fashion. Ten issues/questions are discussed in more
detail: (1) What is the definition of Big Data? (2) Is
Big Data an independent phenomenon? (3) Big Data:
fact or fiction? (4) What is the scope of Big Data? (5)
What are the opportunities for Big Data? (6) What
are the dangers of Big Data? (7) Are the current laws
and regulations applicable to Big Data? (8) Is there
a need for new legislation for Big Data? (9) What
concept should be central to Big Data regulation?
(10) How should the responsibilities be distributed?
These questions will be discussed in the subsequent
sections. The article will conclude with a short
summary of the main findings.
B. What is the definition of Big Data?
The rst choice when it comes to regulating Big
Data is to determine a denition and delineation
of Big Data. Three denitions were encountered
a number of times in both the desk research and
in the survey. First, the Article 29 Working Party
holds that Big Data refers to the exponential growth,
both in the availability and in the automated use
of information. It refers to gigantic digital datasets
held by corporations, governments and other large
organizations, which are then extensively analyzed
using computer algorithms. Big Data can, according
to the Working Party, be used to identify more
general trends and correlations, but it can also be

