Exploring the Interfaces Between Big Data and Intellectual Property Law

Author	Daniel Gervais
Position	Dr. Gervais is Professor of Information Law at the University of Amsterdam and the Milton R. Underwood Chair in Law at Vanderbilt University. The author is grateful to Drs. Balász Bodó, João Quintais, and to Svetlana Yakovleva of the Institute for Information Law (IvIR), to participants at the University of Lucerne conference on Big Data and ...
Pages	3-19

2019

by Daniel Gervais*

Everybody may disseminate this article by electronic means and make it available for download under the terms and conditions of the Digital Peer Publishing Licence (DPPL). A copy of the license text may be obtained at http://nbn-resolving.de/urn:nbn:de:0009-dppl-v3-en8.

Recommended citation: Daniel Gervais, Exploring the Interfaces Between Big Data and Intellectual Property Law, 10 (2019) JIPITEC 3 para 1

Keywords: Copyright; patent; data exclusivity; artificial intelligence; big data; trade secret

Abstract: This article reviews the application of several IP rights (copyright, patent, sui generis database right, data exclusivity and trade secret) to Big Data. Beyond the protection of software used to collect and process Big Data corpora, copyright’s traditional role is challenged by the relatively unstructured nature of the non-relational (noSQL) databases typical of Big Data corpora. This also impacts the application of the EU sui generis right in databases. Misappropriation (tort-based) or anti-parasitic behaviour protection might apply, where available, to data generated by AI systems that has high but short-lived value. Copyright in material contained in Big Data corpora must also be considered. Exceptions for Text and Data Mining (TDM) are already in place in a number of legal systems and likely to emerge to allow the creation and use of corpora of literary and artistic works, such as texts and images. In the patent field, AI systems using Big Data corpora of patents and scientific literature can be used to expand patent applications. They can also be used to “guess” and disclose future incremental innovation. These developments pose serious doctrinal and normative challenges to the patent system and the incentives it creates in a number of areas, though data exclusivity regimes can fill certain gaps in patent protection for pharmaceutical and chemical products. Finally, trade secret law, in combination with contracts and technological protection measures, can protect data corpora and sets of correlations and insights generated by AI systems.

A. Introduction

1 The interfaces between “Big Data” (as the term is defined below) and IP matter both because of the impact of Intellectual Property (IP) rights in Big Data, and because IP rights might interfere with the generation, analysis and use of Big Data. This Article looks at both sides of the interface coin, focusing on several IP rights, namely copyright, patent, data exclusivity and trade secret/confidential information.1 The paper does not discuss trade marks in any detail, although the potential role of Artificial Intelligence (AI), using Big Data corpora,2 in designing and selecting trade marks certainly seems a topic worthy of further discussion.3

B. Defining Big Data

2 The term “Big Data” can be defined in a number of ways. A common way to define it is to enumerate its three essential features, a fourth that, though not essential, is increasingly typical, and a fifth that is derived from the other three (or four). Those features are: volume, veracity, velocity, variety, and value.4 “Volume” or size is, as the term Big Data suggests, the first characteristic that distinguishes Big Data from other (“small data”) datasets. Because Big Data corpora are often generated automatically, the question of the quality or trustworthiness of the data (“veracity”) is crucial. “Velocity” refers to “the speed at which corpora of data are being generated, collected and analyzed”.5 The term “variety” denotes the many types of data and data sources from which data can be collected, including Internet browsers, social media sites and apps, cameras, cars, and a host of other data-collection tools.6 Finally, if all previous features are present, a Big Data corpus likely has significant “value”.

3 The way in which “Big Data” is generated and used can be separated into two phases.7 First, the creation of a Big Data corpus requires processes to collect data from sources such as those mentioned in the previous paragraph. Second, the corpus is analysed, a process that may involve Text and Data Mining (TDM).8 TDM is a process that uses an Artificial Intelligence (AI) algorithm. It allows the machine to learn from the corpus, which is why the term “machine learning” (ML) is sometimes used as a synonym of AI in the press.9 As it analyses a Big Data corpus, the machine learns and gets better at what it does. This process often requires human input to assist the machine in correcting errors or faulty correlations derived from, or decisions based on, the data.10 This processing of corpora of Big Data is done to find correlations and generate predictions or other valuable analytical outcomes. These correlations and

* Dr. Gervais is Professor of Information Law at the University of Amsterdam and the Milton R. Underwood Chair in Law at Vanderbilt University. The author is grateful to Drs. Balász Bodó, João Quintais, and to Svetlana Yakovleva of the Institute for Information Law (IvIR), to participants at the University of Lucerne conference on Big Data and Trade Law (November 2018), to Ole-Andreas Rognstad and other participants at the Data as a Commodity workshop at the University of Oslo (December 2018), and to the anonymous reviewers at JIPITEC for most useful comments on earlier versions of this Article.

1 The Article considers IP rights applied by all or almost all countries, namely those contained in the Agreement on Trade-related Aspects of Intellectual Property Rights, Annex 1C of the Agreement Establishing the World Trade Organization, 15 April 1994. As of January 2019, it applied to the 164 members of the WTO, including all EU member States and the EU itself.

2 The use of the term “corpus” in this context is an extension of its original meaning as either a “body or complete collection of writings or the like; the whole body of literature on any subject”, or the “body of written or spoken material upon which a linguistic analysis is based”. Oxford English Dictionary Online (accessed 21 December 2018). There is a debate about the proper form of the plural. Both Oxford and Merriam-Webster indicate that “corpora” is the proper form, although the author has encountered the form “corpuses” in the literature discussing Big Data. See e.g., the 2014 White House report to the President from the President’s Council of Advisors on Science and Technology titled “Big Data and Privacy: A Technological Perspective”, at x. “Corpora” is the form chosen here, although the predictable future is that the perhaps more intuitive form “corpuses” will win this linguistic tug-of-war.

3 For example, AI systems can create correlations between trademark features (look, sound etc.) and their appeal, thus allowing the creation and selection of “better” marks.

4 Jenn Cano, ‘The V’s of Big Data: Velocity, Volume, Value, Variety, and Veracity’, XSNet (March 11, 2014), <https://www.xsnet.com/blog/bid/205405/the-v-s-of-big-data-velocity-volume-value-variety-and-veracity> (accessed 10 December 2018).

5 Ibid.

6 The list includes “cars” because cars, as personal vehicles, are one of the main sources of (personal) data: up to 25 Gigabytes per hour of driving. The data are fed back to the manufacturer. See Uwe Rattay, ‘Untersuchung an vier Fahrzeugen - Welche Daten erzeugt ein modernes Auto?’, ADAC, <https://www.adac.de/infotestrat/technik-und-zubehoer/fahrerassistenzsysteme/daten_im_auto/default.aspx> (accessed 11 December 2018).

7 The two components are not necessarily sequential. They can and often do proceed in parallel.

8 See Maria Lillà Montagnani, ‘Il text and data mining e il diritto d’autore’ (2017) 26 AIDA 376.

9 Cassie Kozyrkov, ‘Are you using the term ‘AI’ incorrectly?’, Hackernoon (26 May 2018), <https://hackernoon.com/are-you-using-the-term-ai-incorrectly-911ac23ab4f5>.

10 How IP will apply to the work involved in the human training function of machine learning is one of the interesting questions at the interface of Big Data and IP. The term “training data” is used in this context to suggest that the machine training is supervised (by humans). See Brian D Ripley, Pattern Recognition and Neural Networks (Cambridge: Cambridge University Press, 1996) 354.
