A Comparative Study of Meta-Heuristic and Conventional Search in Optimization of Multi-Dimensional Feature Selection

Khin Sandar Kyaw, Somchai Limsiroratana, Tharnpas Sattayaraksa
Copyright: © 2022 | Pages: 34
DOI: 10.4018/IJAMC.292517

Abstract

Algorithmic search is ineffective at addressing the problem of multi-dimensional feature selection for document categorization. This study proposes a meta-heuristic search approach for optimal feature selection. Elephant Optimization (EO) and Ant Colony Optimization (ACO) algorithms, coupled with Naïve Bayes (NB), Support Vector Machine (SVM), and J48 classifiers, were used to highlight the optimization capability of meta-heuristic search for the multi-dimensional feature selection problem in document categorization. In addition, the feature selection performance of the two meta-heuristic approaches (EO and ACO) was compared with that of the conventional Best First Search (BFS) and Greedy Stepwise (GS) algorithms on news document categorization. The comparative results showed that globally optimal feature subsets were attained using adaptive parameter tuning in the meta-heuristic feature selection optimization scheme. In addition, the number of selected features was reduced dramatically for document classification.

Introduction

Recently, document classification has become a key technology in the knowledge discovery process for applications such as business intelligence, medical intelligence, and social media intelligence models. The performance of document classification depends mainly on the quality of the feature subset selected from the feature vector. Feature selection has therefore become a major requirement for ensuring that the classification model receives relevant features (Kotsiantis, 2014). Selecting an optimal feature subset from high-dimensional data for an accurate classification model remains an open and computationally hard research problem. Furthermore, text feature selection can be regarded as an NP-hard problem (Abdollahi et al., 2019) because the number of feature combinations escalates exponentially for multi-dimensional data.
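To make the exponential growth concrete, the short sketch below counts the non-empty subsets of an n-feature vector, which is 2^n - 1. The function name and the sample vocabulary sizes are illustrative assumptions and are not taken from the study.

```python
from math import comb


def count_nonempty_subsets(n_features: int) -> int:
    """Number of non-empty feature subsets of an n-feature vector (2^n - 1)."""
    return 2 ** n_features - 1


# Cross-check against the sum of binomial coefficients C(n, k) for k = 1..n.
assert count_nonempty_subsets(10) == sum(comb(10, k) for k in range(1, 11))

for n in (10, 20, 50):  # hypothetical vocabulary sizes
    print(n, count_nonempty_subsets(n))
# 10 features -> 1023 subsets; 20 features -> 1048575 subsets
```

Even at 50 features the count exceeds 10^15, so enumerating every subset is not practical for text data, where vocabularies typically run to thousands of terms.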

At the operational level, there are four main types of feature selection method (Remeseiro & Bolon-Canedo, 2019): filter (Cherrington et al., 2019), wrapper (El Aboudi & Benhlima, 2016), embedded (Hameed et al., 2018), and hybrid (Solorio-Fernández et al., 2019). At the feature exploration level, there are two search approaches: algorithmic conventional search (Appel, 2014) and heuristic intelligent search (Sharma & Kaur, 2020). Although several feature selection schemes exist, many employ brute-force or exhaustive search (Xue et al., 2016), in which all feature combinations are considered; this is inadequate for high-dimensional feature selection problems. To overcome this limitation, meta-heuristic optimization algorithms (Beheshti & Shamsuddin, 2015) offer a solution for non-linear, high-dimensional, complex feature selection because they may find a globally optimal feature subset using randomization and heuristic search. All groups of meta-heuristic algorithms rely on decentralization and randomization in the search task, while the objective function is the major driver of the meta-heuristic search mechanism for a specific application problem. In contrast, conventional algorithmic search, such as best first search (BFS) (Clausen & Perregaard, 1999), greedy stepwise (GS) search, and ranker search (RS) (Drotár et al., 2019), uses exhaustive search in which only the best-scoring features are selected locally, and is therefore prone to bias toward features from classes with rich feature scores.
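The locally greedy behavior of conventional search can be illustrated with a minimal forward greedy stepwise sketch. This is a generic textbook formulation and not the implementation used in the study; the `toy_score` objective, its relevance weights, and its penalties are hypothetical stand-ins for a wrapper or filter evaluation.

```python
from typing import Callable, FrozenSet, List

# Hypothetical per-feature relevance; features 0 and 1 are assumed redundant.
RELEVANCE = [0.9, 0.8, 0.1, 0.05]


def toy_score(subset: FrozenSet[int]) -> float:
    """Toy objective: summed relevance, minus a redundancy and a size penalty."""
    score = sum(RELEVANCE[f] for f in sorted(subset))
    if {0, 1} <= subset:
        score -= 0.75              # penalty for selecting both redundant features
    return score - 0.08 * len(subset)  # penalty per selected feature


def greedy_stepwise(n_features: int,
                    score: Callable[[FrozenSet[int]], float]) -> List[int]:
    """Forward selection: repeatedly add the single best feature, never backtrack."""
    selected: List[int] = []
    best = score(frozenset())
    while len(selected) < n_features:
        candidates = [f for f in range(n_features) if f not in selected]
        gains = {f: score(frozenset(selected + [f])) for f in candidates}
        f_best = max(gains, key=gains.get)
        if gains[f_best] <= best:  # no single addition improves the score
            break
        selected.append(f_best)
        best = gains[f_best]
    return selected


print(greedy_stepwise(len(RELEVANCE), toy_score))
```

Because each step commits to the locally best addition and never revisits it, the result depends entirely on the order in which features look attractive, which is the limitation the paragraph above attributes to conventional search.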

In addition, as data science evolves, meta-heuristic intelligence has gained ground for studying the characteristics of complex data. Because of its adaptive search mechanism, the meta-heuristic approach is well suited to exploring optimal features by striking a balance between exploitation and exploration. Search results are then passed to the objective function of the feature selection scheme and to learning models to evaluate whether the selected feature subsets match or are at variance with the cost function of the specific problem. Meta-heuristic algorithms such as bat search (Yang, 2013), cuckoo search (Shehab et al., 2017), flower pollination search (Abdel-Basset & Shawky, 2019), and firefly search (Alomoush et al., 2018) mimic search behavior found in nature. In the nature-inspired search process, all agents search randomly in different directions, then share information and experience about the search and compare their results to obtain a better outcome. Furthermore, the search performed by individual organisms in nature is based on collaboration (decentralization), which helps to reduce work overload and enhance the output (solution).
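The agent behavior described above (random search, information sharing, comparison of results) can be sketched as a generic population-based search over binary feature masks. This is an illustrative assumption and is neither EO nor ACO as used in the study; the `toy_objective` function, the `share_prob` parameter, and all default values are hypothetical.

```python
import random
from typing import Callable, List, Tuple

Mask = List[bool]  # mask[i] is True when feature i is selected

RELEVANCE = [0.9, 0.8, 0.1, 0.05]  # hypothetical per-feature relevance


def toy_objective(mask: Mask) -> float:
    """Toy objective: summed relevance minus redundancy and size penalties."""
    score = sum(r for r, keep in zip(RELEVANCE, mask) if keep)
    if mask[0] and mask[1]:
        score -= 0.75                  # features 0 and 1 are assumed redundant
    return score - 0.08 * sum(mask)


def swarm_feature_search(n_features: int,
                         objective: Callable[[Mask], float],
                         n_agents: int = 8,
                         n_iterations: int = 50,
                         share_prob: float = 0.5,
                         seed: int = 0) -> Tuple[Mask, float]:
    rng = random.Random(seed)
    # Randomization: every agent starts from its own random feature subset.
    agents = [[rng.random() < 0.5 for _ in range(n_features)]
              for _ in range(n_agents)]
    scores = [objective(a) for a in agents]
    top = max(range(n_agents), key=scores.__getitem__)
    best_mask, best_score = list(agents[top]), scores[top]

    for _ in range(n_iterations):
        for i in range(n_agents):
            candidate = list(agents[i])
            j = rng.randrange(n_features)
            if rng.random() < share_prob:
                candidate[j] = best_mask[j]      # exploitation: use shared best
            else:
                candidate[j] = not candidate[j]  # exploration: random flip
            cand_score = objective(candidate)
            if cand_score >= scores[i]:          # agent keeps non-worse moves
                agents[i], scores[i] = candidate, cand_score
            if cand_score > best_score:          # share an improved result
                best_mask, best_score = list(candidate), cand_score
    return best_mask, best_score


mask, score = swarm_feature_search(len(RELEVANCE), toy_objective)
print(mask, score)
```

Here `share_prob` sets the balance between exploitation (copying from the shared best subset) and exploration (random flips); the best score found never decreases across iterations because it is only replaced by a strictly better candidate.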
