Fake news detection on Twitter

DOI: https://doi.org/10.1108/IJWIS-02-2022-0044
Published date: 19 September 2022
Pages: 388-412
Subject Matter: Information & knowledge management, Information & communications technology, Information systems, Library & information science, Information behaviour & retrieval, Metadata, Internet
Authors: Srishti Sharma, Mala Saraswat, Anil Kumar Dubey
Srishti Sharma
Department of Computer Science and Engineering, The NorthCap University,
Gurugram, Haryana, India
Mala Saraswat
School of Computing Science and Engineering, Galgotias University,
Greater Noida, India, and
Anil Kumar Dubey
ABES Engineering College, Ghaziabad, India
Abstract
Purpose: Owing to the increased accessibility of the internet and related technologies, more and more
individuals across the globe now turn to social media for their daily dose of news rather than traditional
news outlets. With the global nature of social media and hardly any checks in place on the posting of
content, an exponential increase in the spread of fake news is easy. Businesses propagate fake news to
improve their economic standing and to influence consumers and demand, and individuals spread fake
news for personal gains like popularity and life goals. The content of fake news is diverse in terms of
topics, styles and media platforms, and fake news attempts to distort truth with diverse linguistic styles
while simultaneously mocking true news. All these factors together make fake news detection an arduous
task. This work tries to check the spread of disinformation on Twitter.
Design/methodology/approach: This study carries out fake news detection using user characteristics
and tweet textual content as features. For categorizing user characteristics, this study uses the XGBoost
algorithm. To classify the tweet text, this study uses various natural language processing techniques to
pre-process the tweets and then applies a hybrid convolutional neural network-recurrent neural network
(CNN-RNN) and the state-of-the-art Bidirectional Encoder Representations from Transformers (BERT)
transformer.
Findings: This study uses a combination of machine learning and deep learning approaches for fake news
detection, namely, XGBoost, hybrid CNN-RNN and BERT. The models have also been evaluated and
compared with various baseline models to show that this approach effectively tackles this problem.
Originality/value: This study proposes a novel framework that exploits news content and social
contexts to learn useful representations for predicting fake news. This model is based on a transformer
architecture, which facilitates representation learning from fake news data and helps detect fake news easily.
This study also carries out an investigative study on the relative importance of content and social context
features for the task of detecting false news and whether the absence of one of these categories of features
hampers the effectiveness of the resultant system. This investigation can go a long way in aiding further
research on the subject and for fake news detection in the presence of extremely noisy or unusable data.
Keywords: Fake news, Transfer learning, Classification, Transformers, Gradient boosting,
Text classification, Twitter
Paper type: Research paper
International Journal of Web Information Systems, Vol. 18 No. 5/6, 2022, pp. 388-412
© Emerald Publishing Limited, 1744-0084
Received 28 February 2022; Revised 27 June 2022; Accepted 1 August 2022
The current issue and full text archive of this journal is available on Emerald Insight at:
https://www.emerald.com/insight/1744-0084.htm
The authors thank Kai Shu, Deepak Mahudeswaran, Suhang Wang, Dongwon Lee and Huan Liu
from FakeNewsNet for making their data available, enabling this research.
1. Introduction
The easy accessibility of the internet, combined with the increasing number of social media
channels, is perhaps the main reason behind the propagation of fake news at an unparalleled and
unimaginable rate. Content makers across the globe now have new channels to fabricate and
propagate imaginary articles to build readership or as part of psychological warfare, for financial
and political gain. News of questionable credibility erodes the confidence that everyone has in all
news sources, both legitimate and illegitimate. The growing use of algorithms in automated news
circulation and creation has made it easy and inexpensive to provide news online at a fast pace.
A Gartner analysis predicts that by the year 2022, an overwhelming number of individuals in
technologically advanced countries will devour more false knowledge than real information [1].
Social media can be a dangerous outlet if mishandled, abused or attacked. The spread of any kind
of information through social media platforms, whether false or not, is so quick that it can lead to
grave and irreversible damage in practically no time. Notably, one of the most well-known fake
news stories diffused considerably more rapidly on a popular social networking giant than the
most renowned credible mainstream news story at the time of the 2016 elections in the USA
(Allcott and Gentzkow, 2017). Approximately 62% of people in the USA reported getting news via
social networks in 2016, whereas in 2012 only 49% reported obtaining updates via social media.
It has also been revealed that social networking sites on the internet now outrank TV as a
significant news source [2].
The issue of spreading disinformation through the internet warrants immediate consideration.
Any endeavours to misinform or troll on the internet via the spread of disinformation or
disingenuous content sources are now considered grave matters that warrant sincere efforts from
researchers in this field. Therefore, there is an urgent need to develop a system for detecting and
filtering false content. Building such a system is paramount because it can assist both
newsreaders and tech companies. As the dynamic nature and different styles of false news are
major hurdles, we aim to propose a fake news detection scheme that takes into account user
characteristics, content and social conditions. This hybrid approach should provide a robust and
effective system for efficiently combating fake news epidemics in the early stages of proliferation.
This article provides an overview of a fake news detection system that uses tools to detect and
eliminate fake websites from results returned by search giants or news applications. It can be
downloaded and added to the user's browser or any app that the user uses to get the news feed.
Our major contributions are outlined as follows:
• We propose a novel framework that exploits news content and social contexts to
  learn useful representations for predicting fake news. Our model is based on a
  transformer architecture, which facilitates representation learning from fake news
  data and helps us detect fake news easily. We also use the side information
  (metadata) from the news content and the social contexts to help our model
  classify the truth better. Through this work, we propose a fake news architecture
  comprising deep learning and machine learning (ML) components (Bidirectional Encoder
  Representations from Transformers [BERT] and XGBoost) that can be applied to a
  wide range of scenarios, languages and platforms.
• We use both news content and social context-based features and conduct an investigative
  study into which of these categories of features plays the most important role in detecting
  fake news, and whether the effectiveness of the resultant system is affected if one of these
  categories is unavailable or inadequate in the early stage of news propagation.
  Many researchers have used multiple categories of features but, to the best of our
  knowledge, they have not carried out a similar investigation. This investigation can go a
  long way in aiding further research on the subject and for fake news detection in the
  presence of extremely noisy or unusable data.
• In our study, we attempt to develop a model for fake news classification using deep
  learning and ML that produces better outcomes when compared with previous
  studies.
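The feature-category investigation described in the second contribution can be pictured as an ablation loop: train and score the same classifier on content features only, on social context features only, and on both. The sketch below is a minimal illustration under stated assumptions, not the pipeline used in this study. The feature names, the toy rows and the helper functions are all hypothetical, and a simple nearest-centroid classifier stands in for the XGBoost and BERT models so that the example needs only the Python standard library.

```python
from typing import Dict, List, Sequence

Row = Dict[str, float]

# Hypothetical feature groups (illustrative names, not taken from the paper).
# All values are assumed to be pre-scaled to the range [0, 1].
FEATURE_GROUPS: Dict[str, List[str]] = {
    "content": ["exclamation_ratio", "uppercase_ratio"],
    "social_context": ["follower_rank", "account_age_rank"],
}


def select(rows: Sequence[Row], columns: Sequence[str]) -> List[List[float]]:
    """Project each row onto the chosen feature columns."""
    return [[row[c] for c in columns] for row in rows]


def fit_centroids(X: List[List[float]], y: Sequence[int]) -> Dict[int, List[float]]:
    """Nearest-centroid stand-in for the real classifiers: one mean vector per class."""
    centroids: Dict[int, List[float]] = {}
    for label in sorted(set(y)):
        members = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def predict(centroids: Dict[int, List[float]], X: List[List[float]]) -> List[int]:
    """Assign each sample to the class whose centroid is closest."""
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(centroids, key=lambda lab: sq_dist(centroids[lab], x)) for x in X]


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def run_ablation(train_rows: Sequence[Row], train_y: Sequence[int],
                 test_rows: Sequence[Row], test_y: Sequence[int]) -> Dict[str, float]:
    """Score the same classifier on each feature category alone and on both together."""
    settings = {
        "content_only": FEATURE_GROUPS["content"],
        "context_only": FEATURE_GROUPS["social_context"],
        "both": FEATURE_GROUPS["content"] + FEATURE_GROUPS["social_context"],
    }
    scores: Dict[str, float] = {}
    for name, columns in settings.items():
        model = fit_centroids(select(train_rows, columns), train_y)
        scores[name] = accuracy(test_y, predict(model, select(test_rows, columns)))
    return scores


def make_row(a: float, b: float, c: float, d: float) -> Row:
    """Build one synthetic sample; the values are invented for illustration only."""
    return {"exclamation_ratio": a, "uppercase_ratio": b,
            "follower_rank": c, "account_age_rank": d}


# Synthetic toy data (label 1 = fake, 0 = real); not drawn from FakeNewsNet.
train_rows = [make_row(0.9, 0.8, 0.2, 0.1), make_row(0.8, 0.9, 0.3, 0.2),
              make_row(0.1, 0.2, 0.8, 0.9), make_row(0.2, 0.1, 0.9, 0.8)]
train_y = [1, 1, 0, 0]
test_rows = [make_row(0.7, 0.9, 0.1, 0.3), make_row(0.3, 0.1, 0.7, 0.6),
             make_row(0.9, 0.9, 0.9, 0.9)]
test_y = [1, 0, 1]

results = run_ablation(train_rows, train_y, test_rows, test_y)
print(results)
```

The scores printed for this toy data carry no empirical meaning. The point is the structure: holding the classifier and the train/test split fixed while varying only the feature category makes any change in accuracy attributable to the features that were removed.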