GABC: A Hybrid Approach for Feature Selection Using Artificial Bee Colony and Genetic Operators

Bindu M. G., Sabu M. K.
Copyright: © 2021 | Pages: 18
DOI: 10.4018/IJSIR.2021070104

Abstract

Feature selection is a complex pre-processing step in data mining that enhances classification accuracy by selecting the minimum number of relevant features. The artificial bee colony (ABC) algorithm is one of the successful swarm intelligence algorithms for feature selection, image processing, data analytics, protein structure prediction, etc. It simulates the honey foraging behavior of a bee swarm. However, it tends to converge slowly and to stagnate at local optima. Hybrid meta-heuristics can enhance the performance of existing swarm algorithms. This paper proposes a hybrid approach for the ABC algorithm that incorporates genetic operators into it. The mutation operator is used to explore the better-quality neighborhood, while the crossover operator is used to enhance the quality of solutions by introducing diversity into them. The performance of the proposed method is evaluated using UCI data sets and compared with existing swarm algorithms for feature selection. The effectiveness of the proposed method is evident from the results.

Introduction

Data analysis and relevant feature extraction have become tedious tasks for data scientists and researchers due to the rapid creation and sharing of data. To learn efficiently from the data, it has to be pre-processed well (Brezočnik, Fister, & Podgorelec, 2018). Inconsistent and irrelevant data can mislead the machine learning model. Feature selection is a pre-processing technique that removes redundant features and selects relevant ones. The aim of a feature selection algorithm is to search for an optimal subset of features. This significantly improves the classification accuracy and reduces the computational complexity of the learning model (Liu & Motoda, 1998). The search for an optimal subset is a challenging task and is thus an active area of research (Mirjalili, 2018).

There are three main methods for feature selection: filter, wrapper, and hybrid or embedded methods. Filter methods evaluate the relevance of features using statistical measures. They need very little computation time, since the selection of relevant features is independent of the machine learning algorithm (Dif, Belabbes, Elberrichi, & Belabbes, 2019). Wrapper methods, in contrast, involve classifiers to measure the performance of different subsets of features. They use this performance measure as the criterion for feature selection. Wrappers are computationally expensive, but they perform better than the filter approaches (Kohavi & John, 1998). Embedded methods combine the characteristics of wrappers and filters. In the embedded approach, the feature selection algorithm is integrated as a part of the learning algorithm.
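
The difference between filter and wrapper scoring can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy data set, the use of absolute Pearson correlation as the filter statistic, the nearest-centroid classifier inside the wrapper, and the function names `filter_scores`, `wrapper_score` and `make_toy_data`.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in xs))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ys))
    return 0.0 if sd_x == 0 or sd_y == 0 else cov / (sd_x * sd_y)


def filter_scores(X, y):
    """Filter: score each feature by |correlation with the label|; no classifier is used."""
    return [abs(pearson([row[j] for row in X], y)) for j in range(len(X[0]))]


def wrapper_score(X, y, subset, n_train):
    """Wrapper: fit a nearest-centroid classifier on `subset`, return held-out accuracy."""
    train = list(zip(X[:n_train], y[:n_train]))
    test = list(zip(X[n_train:], y[n_train:]))
    centroids = {}
    for label in {t for _, t in train}:
        rows = [r for r, t in train if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in subset]

    def predict(row):
        # Pick the class whose centroid is closest in the selected feature space.
        return min(
            centroids,
            key=lambda c: sum((row[j] - m) ** 2 for j, m in zip(subset, centroids[c])),
        )

    return sum(predict(r) == t for r, t in test) / len(test)


def make_toy_data(n=100, seed=0):
    """Hypothetical data: feature 0 tracks the label, feature 1 is pure noise."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[4.0 * label + rng.gauss(0, 0.5), rng.gauss(0, 1.0)] for label in y]
    return X, y


X, y = make_toy_data()
print("filter scores per feature:", filter_scores(X, y))
for subset in ([0], [1], [0, 1]):
    print("wrapper accuracy for", subset, "=", wrapper_score(X, y, subset, n_train=60))
```

The filter score is computed once per feature, whereas the wrapper trains and evaluates a classifier for every candidate subset, which is where its extra cost comes from.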

A sharp increase in the rate of data production has made it hard to try out every possible subset of features for selection. The search for an optimal subset of features is thus categorized as an NP-hard problem. Researchers have identified stochastic metaheuristic approaches as better solutions for addressing the challenges of the feature selection problem. Swarm intelligence is a category of computational intelligence that has proved effective in solving high-complexity tasks like finding the optimum subset of features (Chakraborty & Kar, 2017). Swarm intelligence algorithms are nature-inspired algorithms that mimic the social behavior of animals. Individuals belonging to the same group of animals that work together to achieve a common goal are called agents. A self-organized group of such agents is called a swarm. Swarm systems have high efficiency as they possess collective intelligence. Computational models have been developed for swarm systems like ant colonies, bee colonies, fish swarms, bats, etc. They are all successful at solving problems like optimization, feature selection, image processing, business planning, bioinformatics, etc. (Jović, Brkić, & Bogunović, 2015).
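
To see why trying every subset becomes impractical, note that n features admit 2^n − 1 non-empty subsets. The sketch below counts them and enumerates them for a small n, using a binary mask per subset. The mask encoding and the function names `count_nonempty_subsets` and `enumerate_masks` are illustrative assumptions, not the representation used in the paper.

```python
from itertools import product


def count_nonempty_subsets(n_features):
    """Number of non-empty feature subsets: 2^n - 1."""
    return 2 ** n_features - 1


def enumerate_masks(n_features):
    """Yield every non-empty subset as a tuple of 0/1 flags (1 = feature selected)."""
    for mask in product((0, 1), repeat=n_features):
        if any(mask):
            yield mask


# Exhaustive enumeration is feasible only for very small n.
for mask in enumerate_masks(3):
    print(mask)

# The count doubles with every added feature.
for n in (5, 10, 20, 50):
    print(n, "features:", count_nonempty_subsets(n), "candidate subsets")
```

Since a wrapper must train a classifier for each candidate, exhaustive search is out of reach beyond a few dozen features, which motivates the stochastic search described above.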
