Noise-Regularized Bidirectional Gated Recurrent Unit With Self-Attention Layer for Text and Emoticon Classification

Mohan Kumar A. V., Nandakumar A. N.
Copyright: © 2022 | Pages: 22
DOI: 10.4018/IJeC.299007

Abstract

Emoji can express emotion beyond the meaning of the text by displaying visual cues, which makes the content more distinct. Emoji and text prediction has recently gained significance, since it is hard to choose the appropriate emoji from thousands of candidates. Small datasets provide a poor description of features, which leads to overfitting and underfitting problems in classification. Therefore, a Noise-Regularized Bidirectional Gated Recurrent Unit (Bi-GRU) with a Self-Attention Layer (SAL) is proposed for the classification of text and emoji. The proposed Noise-Regularized Bi-GRU, an aspect-based sentiment analysis model, is evaluated in a series of experiments on Twitter data to predict the sentiment of a tweet. The proposed Noise-Regularized Bi-GRU with SAL obtained an accuracy of 87.77%, compared with the 86.27% obtained by the existing deep learning model.
Introduction

Emoticons play an important role in social media, especially in text-based interactions, which otherwise lack affective cues. For example, a study has shown that emoticons are used more commonly in socio-emotional contexts than in task-oriented contexts (Urabe et al. 2021). One reason emoji have become such a sensation is that communicating by text has become an increasingly large part of people's lives (Yang et al. 2020). Social media is the biggest platform for sharing information in social settings through multimodal content such as videos, emoji, and text (Ullah et al. 2020). The communication channel plays an important role, as it acts as a tool for feedback and social listening in social media. Online social media applications such as YouTube, Twitter, and Facebook provide tools for expressing one's feedback on any event (Lei et al. 2020). The key step in sentiment analysis is unlocking user opinion. Machine learning techniques are applied to user opinions because such big data are difficult to analyze for decision making (Liao et al. 2020). Generic text classification is important and relies on how human emotions and human language are expressed in social media posts. The various types of human emotion provide an effective spectrum (Zhao et al. 2020). Outward expressions of feelings or emotions provide an opinion based on texts or comments (Tellez et al. 2018). Unrelated features present in the models created overfitting problems, which lowered the performance of the results (Venkataramaiah and Achar 2020).

The contributions of the research work are as follows:

  • To develop an aspect-based feature extraction process that extracts Word2vec, Continuous Bag-Of-Words (CBOW), Skip-gram, Smooth Inverse Frequency, and cosine similarity features to determine the opinions of users.

  • To add a Gaussian noise layer at the network input to improve performance, as it creates more data samples and hence makes the data distribution much smoother.

  • To develop a Bi-GRU model that learns general features and hence lowers the generalization error during sentiment analysis.
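
The Gaussian noise contribution listed above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names add_gaussian_noise and augment, the standard deviation of 0.1, and the number of noisy copies are all assumptions chosen for illustration, since the preview does not state these values. The idea shown is only that zero-mean Gaussian noise is added to each input feature vector during training, which yields additional, slightly perturbed samples.

```python
import random


def add_gaussian_noise(vector, stddev=0.1, training=True, rng=None):
    """Return a copy of `vector` with zero-mean Gaussian noise on each component.

    Noise is applied only in training mode; at inference the input is
    returned unchanged. `stddev` is an illustrative value, not one taken
    from the paper.
    """
    if not training or stddev == 0:
        return list(vector)
    rng = rng or random.Random()
    return [x + rng.gauss(0.0, stddev) for x in vector]


def augment(vectors, copies=3, stddev=0.1, seed=0):
    """Create `copies` noisy variants of every feature vector (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [
        add_gaussian_noise(v, stddev=stddev, training=True, rng=rng)
        for v in vectors
        for _ in range(copies)
    ]


if __name__ == "__main__":
    features = [[0.0, 0.0], [1.0, 1.0]]  # toy stand-ins for tweet embeddings
    for row in augment(features, copies=2):
        print(row)
```

In a deep learning framework the same effect is usually obtained with a built-in noise layer placed before the recurrent layers, so that the perturbation is active during training and disabled at inference.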

The proposed research uses an aspect-based feature extraction process that extracts Word2vec, Smooth Inverse Frequency, and cosine similarity features (Mohankumar and Nandkumar 2020). The feature selection process uses only relevant features that overcome the overfitting problem (Siddappa and Kampalappa 2019). This research study uses the aspect-based sentiment analysis on Twitter data to determine the opinions of users. In general, a raw tweet contains stop words, URLs, and emoji that are reduced in the stage of pre-processing (Kumar et al. 2020). The polarities of pre-processed tweets are identified, and then two important feature extraction techniques are used to extract the useful information (Shi 2019). Specific sentiments with various aspects are extracted and given as an input using aspect-based feature extraction (Vijayaragavan et al. 2020; Jayashree and Anitha 2020). The results of the proposed method are evaluated in terms of accuracy, precision, recall, F-score, Mean Square Error (MSE), and Root Mean Square Error (RMSE).
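
The evaluation measures named above have standard definitions, which the following sketch computes for a binary sentiment task. The function name evaluate, the choice of 1 as the positive label, and the toy labels are assumptions for illustration; the preview does not say how the authors computed MSE and RMSE, so here they are taken over the predicted and true class labels.

```python
import math


def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, F-score, MSE and RMSE.

    Precision, recall and F-score are computed for the `positive` class.
    MSE/RMSE are computed on the numeric labels, which is an assumption.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_score = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    mse = sum((t - p) ** 2 for t, p in pairs) / len(pairs)

    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "mse": mse,
        "rmse": math.sqrt(mse),
    }


if __name__ == "__main__":
    true_labels = [1, 0, 1, 1, 0, 0]  # toy data, not from the paper
    predicted = [1, 0, 0, 1, 1, 0]
    for name, value in evaluate(true_labels, predicted).items():
        print(f"{name}: {value:.4f}")
```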

The paper is organized as follows: Section 2 reviews the literature on existing methods. Section 3 explains the proposed Bidirectional GRU methodology and the steps it involves. Section 4 presents the results and discussion, evaluated quantitatively and comparatively. Section 5 gives the conclusion and future work.

Literature Review

Existing research on emoticon classification is as follows:
