Semiautomated process for generating knowledge graphs for marginalized community doctoral-recipients

DOI	https://doi.org/10.1108/IJWIS-02-2022-0046
Published date	13 October 2022
Date	13 October 2022
Pages	413-431
Subject Matter	Information & knowledge management, Information & communications technology, Information systems, Library & information science, Information behaviour & retrieval, Metadata, Internet
Author	Neha Keshan, Kathleen Fontaine, James A. Hendler

Neha Keshan, Kathleen Fontaine and James A. Hendler

Department of Computer Science, Tetherless World Constellation, Rensselaer Polytechnic Institute, Troy, NY

Abstract

Purpose – This paper describes the “InDO: Institute Demographic Ontology” and demonstrates the InDO-based semiautomated process for both generating and extending a knowledge graph to provide a comprehensive resource for marginalized US graduate students. The knowledge graph currently consists of instances related to the semistructured National Science Foundation Survey of Earned Doctorates (NSF SED) 2019 analysis report data tables. These tables contain summary statistics of an institute’s doctoral recipients based on a variety of demographics. Incorporating institute Wikidata links ultimately produces a table of unique, clearly readable data.

Design/methodology/approach – The authors use a customized semantic extract, transform and loader (SETLr) script to ingest data from 2019 US doctoral-granting institute tables and preprocessed NSF SED Tables 1, 3, 4 and 9. The generated InDO knowledge graph is evaluated using two methods. First, the authors compare the SPARQL results of the competency questions from both the semiautomatically and manually generated graphs. Second, the authors expand the questions to provide a better picture of an institute’s doctoral-recipient demographics within study fields.

Findings – With some preprocessing and restructuring of the highly interlinked NSF SED tables into a more parsable format, one can build the required knowledge graph using a semiautomated process.

Originality/value – The InDO knowledge graph allows the integration of US doctoral-granting institutes’ demographic data based on NSF SED data tables and its presentation in machine-readable form using a new semiautomated methodology.

Keywords Semiautomation process, Knowledge graphs, Institute demographics, Graduate mobility, NSF doctoral recipients survey data

Paper type Research paper

This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at http://creativecommons.org/licences/by/4.0/legalcode

This work is part of the vision of “Building a Social Machine for Graduate Mobility”. We would like to thank Dean Stanley Dunn, Dean of Graduate Education, who provided expert insights into this issue. We would like to thank the members of the Tetherless World Constellation Lab at Rensselaer Polytechnic Institute, especially John S. Erickson and Jamie P. McCusker, who provided insights and expertise that greatly assisted this research. This work was funded in part by the RPI-IBM AI Research Collaboration, a member of the IBM AI Horizons network.

Received 28 February 2022
Revised 1 June 2022
Accepted 5 July 2022

International Journal of Web Information Systems
Vol. 18 No. 5/6, 2022
pp. 413-431
Emerald Publishing Limited
1744-0084
DOI 10.1108/IJWIS-02-2022-0046

The current issue and full text archive of this journal is available on Emerald Insight at: https://www.emerald.com/insight/1744-0084.htm

1. Introduction

Newly minted doctoral students face a long-standing problem of “what comes next?”. This transition from being a student toward their chosen career path is referred to as graduate mobility (Keshan, 2021). The preparation for graduate mobility does not start when one is approaching graduation but rather much earlier, perhaps even as early as the time of program selection. To help students have a smooth transition from graduate school to their career, it is important for them to have adequate information when selecting a doctoral graduate school. The information should include the demographics of past doctoral recipients and the career paths they chose. Students can use this information along with the general program ranking to make an informed decision about which graduate program to join. Therefore, the question of “what comes next” is connected to the question of “where is the best for me?” during doctoral program selection (Keshan et al., 2021). In general, doctoral programs are challenging for all students but can be especially challenging for students from marginalized communities – groups of students traditionally under-represented based on ethnicity, race, language, gender identity, age, physical ability and/or immigration status (Gay, 2004; Sevelius et al., 2020). It has been shown that marginalized students have to go the extra mile to prove their worth.

Previous work (Keshan et al., 2021) proposed an Institute Demographic Ontology (InDO) designed to help with this problem. The ontology was mainly generated manually using a traditional methodology (Kendall and McGuinness, 2019). This paper builds on that work by describing a new, semiautomated process for generating an Institute Demographic knowledge graph, based on InDO, to integrate the statistical data from the various NSF SED survey results (Foley, 2021). Notably, the National Science Foundation (NSF) recently (December 2021) launched the “Survey of Earned Doctorates Restricted Data Analysis System” (SED RDAS), which allows users to create their own tables for SED data from 2017 to 2020. In this restrictive model, security protocols in the NSF system do not allow the user to acquire institute-specific demographics with respect to the year. However, the institute-specific data is available through the NSF website as part of their SED analysis results across multiple tables for the years 1958 to 2020. These tables (Figure 1) can be integrated with one another using semantic techniques without compromising privacy to make the statistical data more machine-readable and, therefore, more accessible, providing a more comprehensive picture of any US doctorate-granting institute’s demographics. This system integrates the available institute data from the provided results tables without compromising student privacy.
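As a rough illustration of what integrating separate summary tables around a shared institute node might look like, the following Python sketch merges rows from two tables into one set of subject-predicate-object triples. It is not the authors' SETLr script: the namespace, the IRI scheme, the predicate and column names and the row values are all hypothetical placeholders, and the numbers are invented rather than taken from NSF SED.

```python
from urllib.parse import quote

# Hypothetical namespace; the real InDO IRIs are not given in this section.
BASE = "https://example.org/indo/"


def institute_iri(name: str) -> str:
    """Mint a stable IRI from an institute name (illustrative scheme only)."""
    slug = name.strip().lower().replace(" ", "-")
    return BASE + "institute/" + quote(slug)


def table_to_triples(rows, predicate_for_column):
    """Turn summary-table rows into (subject, predicate, object) triples.

    rows: iterable of dicts with an "institute" key plus statistic columns.
    predicate_for_column: maps a column name to a (hypothetical) predicate IRI.
    """
    triples = set()
    for row in rows:
        subject = institute_iri(row["institute"])
        for column, predicate in predicate_for_column.items():
            if row.get(column) is not None:
                triples.add((subject, predicate, row[column]))
    return triples


# Placeholder rows: invented values, not NSF SED figures.
table_a = [{"institute": "Example University", "total_doctorates": 100}]
table_b = [{"institute": "example university ", "female_doctorates": 40}]

graph = table_to_triples(table_a, {"total_doctorates": BASE + "totalDoctorates"})
graph |= table_to_triples(table_b, {"female_doctorates": BASE + "femaleDoctorates"})

# Both tables now describe the same node, so one lookup returns both statistics.
subject = institute_iri("Example University")
facts = {p: o for s, p, o in graph if s == subject}
```

Because both tables resolve to the same subject IRI, statistics that were spread across tables can be retrieved together, and only the aggregate figures already published in the tables are carried over.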

In this paper, we describe a semiautomated linked-data representation of the NSF SED statistical data, the knowledge representation of this statistical demographic data and the usefulness of linking it with Wikidata [1]. Wikidata is a free and open knowledge base that can be processed by both humans and machines. The content of Wikidata, available under a free Creative Commons license, is interlinkable to other open data sets on the linked data Web. Our current InDO-based semiautomatically generated knowledge graph includes data points from Tables 1, 3, 4 and 9 of the published NSF SED 2019 analysis results. One hundred and ninety-four of the 448 doctoral-granting US institutes have their respective Wikidata nodes added to allow users to access our resources in conjunction with other linked data already available on the Web. Finally, as part of the evaluation, we compared Blazegraph workbench results obtained from the semiautomatically generated knowledge graph and the manually generated knowledge graph. We also added new competency questions to provide a better picture of an institute’s demographic based on broad study
