Forward Context-Aware Clickbait Tweet Identification System


Rajesh Kumar Mundotiya, Naina Yadav
Copyright: © 2021 | Pages: 12
DOI: 10.4018/IJACI.2021040102

Abstract

Clickbait, which misleads readers into clicking on headlines, is an elusive challenge given the prevalence of social media such as Facebook and Twitter. Limited annotated data makes it onerous to design an accurate clickbait identification system. The authors address this problem by proposing a deep learning-based architecture with external knowledge, which is trained on social media posts and descriptions. The pre-trained ELMO and BERT models provide sentence-level contextual features as knowledge; moreover, an LSTM layer helps to capture word-level contextual features. Training was carried out in different experiments (model with ELMO, model with BERT) with different regularization techniques such as dropout, early stopping, and finetuning. The forward context-aware clickbait tweet identification system (FCCTI) with BERT finetuning and the model with ELMO using GloVe pre-trained embeddings is the best model and achieves a clickbait identification accuracy of 0.847, improving on the previous baseline for this task.

Introduction

Clickbait is a kind of false notification, designed as a thumbnail or hyperlink, that leads users to click on a link. Web applications use clickbait fundamentally to increase the number of online readers, viewers, and listeners for their content. Clickbait is a kind of fake news, which poses a variety of threats to democracy, broadcasting, and freedom of composition.

As news consumption continues to move online, the media landscape is changing. This change can be ascribed to two main factors. First, in classical offline media, the reader's choice was very limited, and loyalty to a printed newspaper rarely changed. Second, online media offers readers a range of alternatives, from international, national, and local media outlets to a vast number of blogs, which readers choose according to their grasp of and interest in particular topics. Most online media websites are free from subscription charges, and their revenue comes mostly from the advertisements on their web pages.

According to the Oxford English Dictionary, clickbait is defined as “content on the internet whose main objective is to attract attention and encourage users to click on a link to a particular web page.” Clickbait is used to lead the user to a link that may require registration to store user information or some payment, which drives page views and can enable other fraudulent activities. Clickbait works by exploiting the curiosity-gap principle. When reading such messages, a reader may sense that something about them is odd or not right: they hint at something left unnamed, promising an emotional reaction that is never delivered or knowledge that is not described anywhere. A compelling clickbait headline increases the reader's curiosity and thus gets them to click the link to the web page.
For example, consider an entertainment website. A non-clickbait headline might say, “See how a particular actor/actress lost 10 pounds last month.” Clickbait headlines for this same news might be:

  • You'll never believe how much weight this celebrity lost in a month!

  • Actor/Actress weight loss secrets finally revealed!

  • Unexpected details of this actor's latest diet!

These types of headlines are used solely for social media marketing. Numerous commercial news and entertainment websites present clickbait alongside genuine web articles. While many websites claim that they do not employ clickbait and present enough information, general audiences consider it a widely used strategy, as many social media sites are loaded with such hyperlinks. Research on fact-checking and message filtering has been carried out, but clickbait identification is still at an early stage, and it is drawing more attention because of the growing spread of clickbait in social media and news. Researchers use many different techniques to identify clickbait, including algorithms based on machine learning and deep learning.

Approaches used for clickbait identification range from feature-engineering-based machine learning methods such as Naive Bayes, Random Forest, Support Vector Machine (SVM), Logistic Regression, and Decision Tree to deep learning techniques that extract features automatically, such as Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), and Long Short-Term Memory (LSTM), as well as ensemble and hybrid techniques. After the development of contextual word embeddings in natural language understanding and processing, performance on classification problems such as fact-checking and sentiment analysis has also improved. The most common and successful contextual word embeddings are generated by transformers or LSTMs. BERT (Bidirectional Encoder Representations from Transformers), designed by Google, is based on multi-head self-attention, whereas ELMO (Embeddings from Language Models) is built on a bidirectional LSTM. These embeddings serve diverse roles, including (masked) language model generation, sentence generation, feature extraction, and sense disambiguation.
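
To make the idea of a contextual embedding concrete, the following is a minimal sketch of single-head scaled dot-product self-attention in plain Python. It is an illustration only, not the authors' model and not BERT itself: BERT runs several such heads in parallel with learned query, key, and value projections, which are omitted here. The token list and the two-dimensional "static" vectors are made-up values chosen for readability.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Single-head scaled dot-product self-attention.

    Assumption for illustration: queries, keys, and values are the input
    vectors themselves (no learned projection matrices).
    """
    dim = len(vectors[0])
    scale = math.sqrt(dim)
    outputs = []
    for query in vectors:
        # Similarity of this token to every token in the sequence.
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale for key in vectors
        ]
        weights = softmax(scores)
        # Each output is a weighted average of all input vectors.
        mixed = [
            sum(w * value[j] for w, value in zip(weights, vectors))
            for j in range(dim)
        ]
        outputs.append(mixed)
    return outputs


# Hypothetical static embeddings for a three-token headline fragment.
tokens = ["weight", "loss", "secrets"]
static = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contextual = self_attention(static)

for token, vector in zip(tokens, contextual):
    print(token, [round(x, 3) for x in vector])
```

Each output vector mixes information from every token in the sequence, so the representation of a word depends on its neighbours. This is the property that distinguishes contextual embeddings from static ones such as GloVe.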
