Introduction
Cyberbullying is the use of technology to harass, intimidate, shame, or target another individual. It includes online threats as well as deliberately offensive or disrespectful emails, comments, blog posts, and notifications. Posting personal information, photos, or videos with the intent to harm or shame another person also counts as cyberbullying. Cyberbullying often involves images, tweets, or web pages that are not taken down until a user requests their removal. When discussing adolescence, one must recognize that it is a particularly sensitive period of one’s life, in which a person is vulnerable to external pressure. Discrimination, described as intimidation or derogatory remarks directed at a person's gender, sexuality, sexual identity, ethnicity, or physical distinctions, is illegal in several states. As a result, the police may become involved, and bullies may face serious consequences (Ben-Joseph, 2018). Although several scholars have studied the impact of cyberbullying on teens and attempted to develop automated tools for detecting it, those techniques fail to account for the social media world that teenagers currently live in, which differs greatly from the one that existed even five or ten years ago. Teenagers are well known for their prolific use of image- and video-sharing applications and limited-time tweets. Visual content, in particular, accounts for more than 70% of all online traffic. At the same time, the use of images and videos for cyberbullying has increased significantly, with some claiming that “cyberbullying grows bigger and meaner with images and videos.” Indeed, the growing prevalence of image and multimodal content in cyberbullying was one of the major themes found in recent cyberbullying reports.
Although it is widely recognized that decoding multimodal content is critical for cyberbullying detection, the detection literature is still based primarily on (sophisticated) text processing, and its accuracy is limited. Few projects currently use visual features to detect cyberbullying. Understanding cyberbullying trends and countering them with suitable machine learning algorithms could help many school students lead better lives and make better decisions, which in turn helps them grow and flourish into capable future leaders. Hence, this paper focuses on adolescent girls, using tools and techniques such as text analytics and image analytics (Reynolds et al., 2011). Hate speech is an offensive form of interaction in which a hate agenda is expressed through misconceptions. It targets protected characteristics such as gender, sexuality, race, and disability. Hate speech can leave a person or a group of people disheartened, and crimes can result. Real-world data can be mined with appropriate data mining algorithms to find hidden patterns and then analyzed to understand the psychology of girls and boys and the tonality and voice of tweets and posts. Understanding psychology, color, and personality traits will help draw insights from the expressions collected. The authors will study the user bios, likes, and comments in the sample using a lexical and syntactic approach. Since the data is extracted from Twitter, i.e., a secondary data source, the authors will address the gap in current psychological analyses. They examined the extracted database and ensured that the analysis covered textual data and focused heavily on geospatial locations and images. Images are reported to make up 70% of the content on web-based social media sites.
Hence, it is essential to focus not just on posts or captions but also on images to get a clear picture of the online scenario. Girls are much more likely to take negative comments seriously and personally, which can harm their mental health and hinder them from reaching their full potential. Cyberbullying has been around for a while, yet very little of the literature addresses the research gap taken up by the authors. In this article, the authors will comprehensively examine and scrape the respondents' profiles. They will gather the information available about users from their profiles, assess textual, social, and visual cues to form their analysis, and finally flag a tweet when it contains explicit content. They will use machine learning algorithms for the analysis and create a system that keeps learning continuously, one that can change the lives of many adolescents rather than just one. Such a comprehensive methodology aims to eliminate the need for self-administered questionnaires, which are subject to respondent bias and are widely used worldwide to study practices such as cyberbullying and cyber victimization. A self-administered test asks respondents to manually choose the option that applies to them most. This introduces an unknown bias: a gap between how respondents think of themselves and how they actually are. A system that keeps learning can eliminate this bias, thereby providing a clear picture of the Twitter scenario. The authors will use a corpus obtained by scraping Twitter and refine their results. Once the authors have the right sample size and population, the next step is to ensure the data is pre-processed and ready for analysis.
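As an illustration of the pre-processing step mentioned above, the following minimal Python sketch shows one way a scraped tweet might be cleaned and tokenized before analysis. The cleaning rules (lowercasing, removing URLs and @mentions, keeping hashtag words) and the function name `preprocess_tweet` are assumptions made for illustration; the article does not specify its own pre-processing rules.

```python
import re
from typing import List

# Hypothetical cleaning patterns, chosen for illustration only;
# they are not taken from the article.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_WORD_RE = re.compile(r"[^a-z0-9\s]")


def preprocess_tweet(text: str) -> List[str]:
    """Lowercase a tweet, strip URLs and @mentions, keep hashtag words, tokenize."""
    text = text.lower()
    text = URL_RE.sub(" ", text)        # remove links before anything else
    text = MENTION_RE.sub(" ", text)    # remove user handles
    text = text.replace("#", " ")       # keep the hashtag word, drop the symbol
    text = NON_WORD_RE.sub(" ", text)   # drop punctuation, emoji, other symbols
    return text.split()                 # whitespace tokenization


if __name__ == "__main__":
    print(preprocess_tweet("@user You are SO dumb!! #loser https://t.co/abc123"))
```

The resulting token lists could then feed a lexical feature extractor; a real pipeline would likely add steps such as stop-word handling or spelling normalization, depending on the corpus.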
In this paper, they will also apply other techniques to numerical datasets, such as transformation, to obtain a balanced dataset that provides accurate results. Once this is complete, the next phase is to train a number of machine learning models and choose the one that provides the most accurate results. Extensive experimental evaluations on real-world multimodal social network datasets demonstrate that the authors' approach outperforms current cyberbullying identification models. They will concentrate on the data collection and feature engineering process, emphasizing feature selection algorithms, before employing a variety of machine learning algorithms to predict cyberbullying behaviors. Finally, problems and obstacles are identified, presenting new avenues for researchers to investigate. The authors will focus on deepening the role of ML in cyberbullying detection and prevention. Specifically, the following issues (Angelis & Perasso, 2020) are addressed: