A machine learning-based methodology to predict learners’ dropout, success or failure in MOOCs

Date: 02 December 2019
Published date: 02 December 2019
Pages: 489-509
DOI: https://doi.org/10.1108/IJWIS-11-2018-0080
Authors: Youssef Mourdi, Mohamed Sadgal, Hamada El Kabtane, Wafaa Berrada Fathi
Subject matter: Information & knowledge management, Information & communications technology, Information systems, Library & information science, Information behaviour & retrieval, Internet
Youssef Mourdi, Mohamed Sadgal, Hamada El Kabtane and Wafaa Berrada Fathi
Department of Computer Science, Faculté des Sciences Semlalia Marrakech, Cadi Ayyad University, Marrakech, Morocco
Abstract
Purpose: Although MOOCs (massive open online courses) are becoming a trend in distance learning, they suffer from a very high learner dropout rate; as a result, on average, only 10 per cent of enrolled learners manage to obtain their certificates of achievement. This paper aims to give tutors a clearer vision for an effective and personalized intervention as a solution to retain each type of learner at risk of dropping out.
Design/methodology/approach: This paper presents a methodology to provide predictions on learners' behaviors. This work, which uses a Stanford data set, was divided into several phases, namely, data extraction, an exploratory study and then a multivariate analysis to reduce dimensionality and to extract the most relevant features. The second step was the comparison between five machine learning algorithms. Finally, the authors used the principle of association rules to extract similarities between the behaviors of learners who dropped out from the MOOC.
Findings: The results of this work show that deep learning ensures the best predictions in terms of accuracy, which averages 95.8 per cent, with comparable results on other measures such as precision, AUC, recall and F1 score.
Originality/value: Many research studies have tried to tackle the MOOC dropout problem by proposing different dropout predictive models. The present proposal belongs to the same context, but the authors have tried to predict not only learners at risk of dropping out of MOOCs but also those who will succeed or fail.
Keywords: MOOC, Association rules, Machine learning
Paper type: Research paper
This research was done through Stanford University's Center for Advanced Research through Online Learning (CAROL); the authors are thankful for all the facilities provided. They also wish to express their full gratitude to Kathy Mirzaei for her responsiveness and collaboration. The authors wish to warmly thank Mitchell Stevens, Director of Digital Research and Planning, as well as all in the CAROL Commission for the trust shown to the authors.

Received 23 November 2018
Revised 7 January 2019
Accepted 28 January 2019

International Journal of Web Information Systems, Vol. 15 No. 5, 2019
© Emerald Publishing Limited, 1744-0084
The current issue and full text archive of this journal is available on Emerald Insight at: www.emeraldinsight.com/1744-0084.htm

1. Introduction
Massive open online courses (MOOCs) are a new model that has revolutionized the field of distance learning since its introduction in 2008 by George Siemens (El-Hmoudova, 2014). The principle is very simple: open the training to anyone regardless of their academic level or their prerequisites (Sanchez-Gordon and Luján-Mora, 2016). Currently, universities are increasingly integrating MOOCs, not only to provide training to geographically distant learners but also to address issues related to the exponential growth in student numbers, as is the case in developing countries such as Morocco and Tunisia. In this context, MOOCs can also be a tool to complement initial training, part of which is provided face to face (hybrid education). MOOCs are not confined to universities: training companies such as Udemy, Coursera, Udacity and many others offer free or paid courses in the form of certified MOOCs (Yuan and Powell, 2013).
Despite their successes, the growth in the number of courses offered and the efforts made to develop dedicated platforms, MOOCs have a problem of their own that cannot be neglected: the huge number of learners who drop out (Onah et al., 2014). According to Liyanagunawardena et al. (2014), only 10 per cent of enrolled learners are able to complete courses, and therefore, MOOCs far exceed traditional online courses in terms of dropout rates. For example, a software engineering course offered by MIT and Berkeley University received 50,000 registrations, but just 7 per cent of registrants completed the course (Yuan and Powell, 2013). Further evidence comes from the Open Learning Design Studio of The Open University of UK (Cross, 2013), which studied a course in which 2,420 students enrolled and only 50 per cent consulted at least one page of the course during the first week. The study also noted that no more than 30 participants were active learners, and only 22 people completed the course, 50 per cent of whom were able to achieve the course's objective. Onah et al. (2014) cited the experience of Duke University, which launched a Bioelectricity MOOC that received 12,175 registrations. Despite this huge number of registrations, only 7,761 learners (representing 64 per cent of all learners) followed at least one video, 26 per cent answered a quiz and only 2.6 per cent completed the course.
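
The percentages quoted above follow directly from the reported counts. As a minimal sketch (the helper name `rate` and the variable names are illustrative assumptions, not taken from the cited studies), the Duke engagement share and the Open University completion share can be recomputed as follows:

```python
def rate(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Counts quoted in the text (Onah et al., 2014; Cross, 2013).
duke_registered = 12_175
duke_watched_video = 7_761
ou_enrolled = 2_420
ou_completed = 22

# Share of Duke registrants who followed at least one video.
duke_video_rate = rate(duke_watched_video, duke_registered)
# Share of Open University enrollees who completed the course.
ou_completion_rate = rate(ou_completed, ou_enrolled)

print(f"Duke, followed at least one video: {duke_video_rate:.1f}%")
print(f"Open University, completed the course: {ou_completion_rate:.1f}%")
```

Rounded to the nearest whole number, the first figure matches the 64 per cent cited above; the second shows that fewer than 1 per cent of the Open University enrollees completed the course.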
To cope with this problem, a large number of research studies and surveys have been launched with the ultimate goal of identifying the reasons that drive learners to quit MOOCs. Barak et al. (2016) concluded that language and the free nature of MOOCs can be important factors that drive learners to drop out. Hone and El Said (2016), for their part, found that the lack of self-motivation among learners and the low quality of learner/instructor interaction may also cause learners not to complete their courses. Kizilcec et al. (2017) indicated that the lack of support and orientation for learners in MOOCs causes a very high dropout rate.
MOOC platforms generate ample data whose analysis can yield relevant indicators of student dropout and can therefore serve as a predictive barometer. As this is an important research topic, several researchers have developed a variety of predictive models by adopting supervised, unsupervised and semi-supervised machine learning architectures and algorithms (Fei and Yeung, 2015).
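
Predictive models of this kind are typically compared using the measures named in the abstract, such as accuracy, precision, recall and F1 score. The following is a minimal sketch of how those measures are computed for a three-class outcome; the label names, the helper functions and the toy predictions are assumptions made for illustration and are not the authors' data or code.

```python
from collections import Counter

# Hypothetical label names for the three learner outcomes.
LABELS = ("dropout", "fail", "success")


def accuracy(y_true, y_pred):
    """Fraction of learners whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_scores(y_true, y_pred, labels=LABELS):
    """Precision, recall and F1 for each label, treated one-vs-rest."""
    pairs = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    scores = {}
    for label in labels:
        tp = pairs[(label, label)]
        fp = sum(n for (t, p), n in pairs.items() if p == label and t != label)
        fn = sum(n for (t, p), n in pairs.items() if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


# Toy ground truth and predictions for eight imaginary learners.
y_true = ["dropout", "dropout", "dropout", "fail",
          "fail", "success", "success", "success"]
y_pred = ["dropout", "dropout", "fail", "fail",
          "success", "success", "success", "dropout"]

scores = per_class_scores(y_true, y_pred)
print("accuracy:", accuracy(y_true, y_pred))
for label in LABELS:
    print(label, scores[label])
```

Reporting the scores per class rather than as a single number makes it visible when a model does well on one outcome while performing poorly on another.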
Predicting which learners are at risk of quitting a MOOC is an important task, as it allows MOOC leaders and tutors to offer a personalized intervention to this category of learners. In this context, a MOOC also contains other types of learners, namely, those at risk of failing and those who will succeed, categories that the majority of researchers have either not considered in their proposals or taken into account only separately (Moreno-Marcos et al., 2018). Classifying learners into three categories will provide tutors with a more refined view and therefore enable a rational and effective intervention that helps these learners follow the MOOC easily. Taking these three categories of learners into consideration, this paper presents the results of research carried out on the data set of a MOOC offered by Stanford University that had 3,585 enrolled learners. This research was divided into four major phases: the first comprised the extraction, preparation and exploration of data; the second looked for the most predictive features to use when generating the predictive model; the third phase concerns a comparison of five machine learning