Grey Wolf Shuffled Shepherd Optimization Algorithm-Based Hybrid Deep Learning Classifier for Big Data Classification

Chitrakant Banchhor, Srinivasu N.
Copyright: © 2022 | Pages: 20
DOI: 10.4018/IJSIR.302612

Abstract

In recent years, big data has played a vital role in information and knowledge analysis, prediction, and manipulation. Big data is well known for the organized extraction and analysis of large or complex databases, and it is widely useful in data management compared with conventional data processing approaches. Big data is growing so rapidly that traditional software tools face various issues in handling it; in particular, data imbalance in huge databases is a main limitation in this research area. In this paper, a Grey Wolf Shuffled Shepherd Optimization Algorithm (GWSSOA)-based Deep Recurrent Neural Network (DRNN) is devised to classify big data. In this technique, a hybrid classifier that combines a Holoentropy-driven Correlative Naive Bayes (HCNB) classifier with a DRNN classifier is introduced for classifying big data. In addition, the developed hybrid classification model utilizes the MapReduce structure to solve big data issues. Here, the DRNN classifier is trained using GWSSOA, which is devised by integrating the Shuffled Shepherd Optimization Algorithm (SSOA) and the Grey Wolf Optimizer (GWO). The developed GWSSOA-based DRNN model outperforms other big data classification techniques, achieving an accuracy of 0.966, a specificity of 0.964, a sensitivity of 0.870, and a computation time of 209837 ms.
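The abstract specifies that GWSSOA hybridizes SSOA with GWO to train the DRNN; the exact hybrid update rule is given only in the full text. As background for the GWO half, the following is a minimal Python/NumPy sketch of the standard Grey Wolf Optimizer position update, in which the three best wolves (alpha, beta, and delta) jointly steer the pack. All function and variable names are illustrative, and the SSOA integration that defines GWSSOA is deliberately not reproduced here.

import numpy as np

def gwo_step(wolves, fitness, a):
    """One position update of the standard Grey Wolf Optimizer.

    wolves  : (n, d) array of candidate solutions
    fitness : callable, lower is better
    a       : coefficient decreased linearly from 2 to 0 over iterations
    """
    n, d = wolves.shape
    # Rank the pack; the three best wolves lead as alpha, beta, delta.
    order = np.argsort([fitness(w) for w in wolves])
    alpha, beta, delta = wolves[order[0]], wolves[order[1]], wolves[order[2]]

    new_positions = np.empty_like(wolves)
    for i in range(n):
        estimates = []
        for leader in (alpha, beta, delta):
            r1, r2 = np.random.rand(d), np.random.rand(d)
            A = 2.0 * a * r1 - a                # balances exploration and exploitation
            C = 2.0 * r2                        # randomizes the leader's influence
            D = np.abs(C * leader - wolves[i])  # distance to the leader
            estimates.append(leader - A * D)
        # Each wolf moves to the mean of the three leader-guided estimates.
        new_positions[i] = np.mean(estimates, axis=0)
    return new_positions

# Example: minimize the sphere function in 5 dimensions.
rng = np.random.default_rng(0)
pack = rng.uniform(-10, 10, size=(20, 5))
sphere = lambda x: float(np.sum(x ** 2))
for t in range(100):
    a = 2.0 - 2.0 * t / 100                     # linear decay of a
    pack = gwo_step(pack, sphere, a)
print(min(sphere(w) for w in pack))

In a GWSSOA-style setup, a DRNN weight vector would play the role of a wolf position and the classification error would serve as the fitness function.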

1. Introduction

The term big data refers to massive amounts of data (Lozada et al., 2019), which can be structured or unstructured. A significant aspect is the organization of data processing that big data enables (Banchhor & Srinivasu, 2020; Tabesh et al., 2019). The most important advantages of big data are time savings in processing huge volumes of data and cost savings during analysis and prediction, along with greater effectiveness owing to advanced tool support (Fong et al., 2015; Dubey et al., 2020; Rashmi & Sarvagya, 2020). Moreover, data production has grown enormously owing to heavy internet use, resulting in the immense datasets termed big data. Big data is usually characterized by the 5 V's: volume (excessive data size), veracity (the consistency, quality, and precision of the data), velocity (the speed of data generation), variety (unstructured, semi-structured, or structured data), and value (the worth of the data) (Ramsingh & Bhuvaneswari, 2018; Sathyaraj et al., 2020; Jadhav & Gomathi, 2019; Kulkarni & Senthil Murugan, 2019). A central hypothesis of recent big data research is that having more data enables enhanced insights, and smart data technologies are very useful in this regard (Triguero et al., 2019; Prasanalakshmi et al., 2011). Moreover, complicated big data pre-processing techniques may not always be necessary, for instance when high levels of redundancy are identified (Maillo et al., 2020).

In big data classification, the target class is identified precisely by assigning items to groups. Approaches such as genetic programming, Genetic Algorithms (GA), Bayesian networks, and Decision Trees (DT) are used to classify big data (Arnaiz-González et al., 2017). Frameworks such as Apache Spark are employed for large-scale data processing (Dean & Ghemawat, 2004). In addition, instance-based learning, known as the lazy learning model, belongs to the supervised learning process (Aha, 1997; Duraisamy et al., 2019). Imbalanced dataset issues in big data are widely addressed by means of the Fuzzy Rule-Based Classification System (FRBCS) (López et al., 2014), in a variant termed Chi-FRBCS-BigDataCS (Mujeeb et al., 2020). Furthermore, several metrics mainly measure three properties: complexity (Garcia et al., 2018), defined as the difficulty of classifying unseen samples; redundancy (Maillo et al., 2020), which describes the availability of repeated instances; and density (Sugiyama et al., 2012), which indicates the number of instances relative to the problem domain (Muslea et al., 2000). These three metrics are widely employed in the automated machine learning domain (Hutter et al., 2019), in which features extracted from the database help identify better pipelines, that is, combinations of pre-processing and learning algorithms (Feurer et al., 2019). Additionally, these metrics have been introduced for solving big data issues (Lorena et al., 2019) and quality measurement issues in huge databases. Besides, various challenges arise during implementation: complexity metrics depend on the non-linearity of the classifier in sequential classification algorithms (Hoekstra & Duin, 1996; Maillo et al., 2020), and density metrics rely on the pruning of fully connected graphs (Garcia et al., 2015; Maillo et al., 2020).
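Since both the abstract and this discussion lean on MapReduce-style distribution, a minimal, framework-free Python sketch of that pattern may help: each mapper computes partial Naive Bayes class and feature counts over its own data split, and the reducer merges them into global sufficient statistics. The toy records and all names here are illustrative assumptions; the holoentropy-driven correlative refinements that define HCNB appear only in the full paper.

from collections import Counter, defaultdict
from functools import reduce

# Toy records: (feature_tuple, label). In a real deployment each "split"
# would be an HDFS block handled by a MapReduce worker.

def map_counts(split):
    """Map phase: per-split class and feature-given-class counts."""
    class_counts = Counter()
    feat_counts = defaultdict(Counter)
    for features, label in split:
        class_counts[label] += 1
        for j, v in enumerate(features):
            feat_counts[(j, label)][v] += 1
    return class_counts, feat_counts

def reduce_counts(acc, part):
    """Reduce phase: merge partial counts from two mappers."""
    class_acc, feat_acc = acc
    class_part, feat_part = part
    class_acc.update(class_part)
    for key, counter in feat_part.items():
        feat_acc[key].update(counter)
    return class_acc, feat_acc

splits = [
    [((1, 0), "spam"), ((0, 1), "ham")],
    [((1, 1), "spam"), ((0, 0), "ham")],
]
class_counts, feat_counts = reduce(reduce_counts, map(map_counts, splits))
print(class_counts)  # spam: 2, ham: 2

Because counts merge associatively, the reduce step can combine mapper outputs in any order, which is precisely what lets MapReduce shard the training data freely across nodes.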
