Introduction
Data mining is the process of extracting meaningful knowledge and models from large volumes of data. The learning task becomes more difficult for high-dimensional datasets, and the generated model cannot handle these volumes rapidly and effectively (Singh et al. 2015). Moreover, datasets in their raw state contain noisy and redundant data, which harms the classification task and reduces performance. Hence, preprocessing is required to deal with these problems and to ensure the quality of the generated model.
Feature selection approaches can be categorized into four classes: wrappers (Mafarja et al. 2018), filters (Cai et al. 2018), embedded methods (Lu et al. 2019), and hybrid methods (Venkatesh et al. 2019). Wrapper methods rely on the learning algorithm in their optimization process to generate robust subsets; despite their efficiency, they suffer from high computational complexity. Filters, on the other hand, select feature subsets independently of the learning algorithm, which reduces their run time complexity but also their accuracy; examples include information gain (IG), minimum redundancy maximum relevance (mRMR), and correlation-based feature selection (CFS). Embedded methods incorporate the feature selection process into the learning task itself. Finally, hybrid methods combine filters and wrappers to take advantage of both.
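To make the filter idea concrete, the following sketch ranks discrete features by information gain on a hypothetical toy dataset, entirely independently of any learning algorithm. The feature names, data values, and helper functions here are illustrative, not from the article.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature, labels):
    """IG(feature) = H(labels) - H(labels | feature), for discrete values."""
    n = len(labels)
    conditional = 0.0
    for value in set(feature):
        subset = [l for f, l in zip(feature, labels) if f == value]
        conditional += (len(subset) / n) * entropy(subset)
    return entropy(labels) - conditional

# Hypothetical toy data: two discrete features, one binary class label.
labels = [1, 1, 0, 0, 1, 0]
f1 = ['a', 'a', 'b', 'b', 'a', 'b']   # perfectly predicts the label
f2 = ['x', 'y', 'x', 'y', 'x', 'y']   # uninformative

ranking = sorted([('f1', information_gain(f1, labels)),
                  ('f2', information_gain(f2, labels))],
                 key=lambda t: t[1], reverse=True)
print(ranking)  # f1 ranks first with IG = 1.0
```

A filter method of this kind would keep the top-ranked features and discard the rest, without ever training a classifier, which is exactly why it is fast but may miss feature interactions that a wrapper would detect.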
Feature selection methods start by generating subsets of features; these subsets are then evaluated to select the best one. The generation step is a combinatorial problem, since the number of possible combinations of features grows exponentially with the number of features, which can cause high computational complexity. To reduce this complexity, optimization methods such as heuristics and metaheuristics have been widely used (Voß et al. 2012).
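The exponential growth of the search space is easy to verify: a dataset with n features admits 2^n - 1 non-empty subsets, so exhaustive evaluation is already impractical at modest dimensionality. A minimal enumeration sketch (feature names are placeholders):

```python
from itertools import combinations

def all_subsets(features):
    """Yield every non-empty feature subset: 2**n - 1 of them in total."""
    for r in range(1, len(features) + 1):
        for combo in combinations(features, r):
            yield combo

features = [f"f{i}" for i in range(10)]
n_subsets = sum(1 for _ in all_subsets(features))
print(n_subsets)  # 2**10 - 1 = 1023 subsets for only 10 features
```

At 50 features the count exceeds 10^15, which is why heuristic and metaheuristic search replaces exhaustive enumeration.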
Metaheuristics are optimization methods that guide the search toward good solutions in a reasonable run time, although, unlike exact methods, they do not guarantee finding the global optimum. Several types of metaheuristics have been proposed in the literature: genetic algorithms (GA) take their inspiration from the theory of evolution; particle swarm optimization (PSO) (Eberhart et al. 1995) simulates the behavior of groups of animals; the multi-verse optimizer (MVO) (Mirjalili et al. 2016) is inspired by the multi-verse theory; the bat algorithm (Yang et al. 2010) simulates the behavior of bats in their communication; the firefly algorithm (FA) (Yang et al. 2009) mimics the way fireflies communicate via flashing lights; and the lion optimization algorithm (LOA) is inspired by the communication behavior of lions (Yazdani et al. 2016).
Bio-inspired computation draws on two main concepts: evolutionary algorithms (GA, DE, …) and swarm intelligence algorithms (BAT, PSO, …) (Fister et al. 2013). The first group mainly explores the search space through evolutionary operations: selection, crossover, and mutation. The second group relies on explicit cooperation and communication between individuals, using both neighborhood and memory concepts.
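The three evolutionary operations named above can be sketched for feature selection, where each individual is a bit mask over the features. This is a minimal illustration, not the article's method: the fitness function here is a hypothetical stand-in (agreement with a fixed target mask) for what would, in a real wrapper, be a classifier's score on the selected features.

```python
import random

random.seed(0)
N_FEATURES = 12
TARGET = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]  # hypothetical "ideal" mask

def fitness(mask):
    """Stand-in evaluation: reward agreement with TARGET.
    In practice, this would train and score a classifier on the
    features selected by `mask`."""
    return sum(m == t for m, t in zip(mask, TARGET))

def tournament(pop, k=3):
    """Selection: the best of k randomly drawn individuals."""
    return max(random.sample(pop, k), key=fitness)

def crossover(a, b):
    """One-point crossover between two parent masks."""
    point = random.randint(1, N_FEATURES - 1)
    return a[:point] + b[point:]

def mutate(mask, rate=0.05):
    """Bit-flip mutation: toggle each feature's inclusion with prob `rate`."""
    return [1 - m if random.random() < rate else m for m in mask]

pop = [[random.randint(0, 1) for _ in range(N_FEATURES)] for _ in range(30)]
for _ in range(40):  # generations
    pop = [mutate(crossover(tournament(pop), tournament(pop)))
           for _ in range(len(pop))]

best = max(pop, key=fitness)
print(best, fitness(best))
```

Swarm intelligence algorithms such as PSO replace crossover and mutation with position updates driven by each individual's memory and its neighborhood's best-known solution, which is the cooperation-and-communication contrast the paragraph above describes.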
Despite their advantages, metaheuristics have several drawbacks related to exploration, such as slow convergence on high-dimensional problems, and to exploitation, such as getting trapped in local optima and premature convergence. Moreover, it is difficult to find a single metaheuristic that ensures a suitable balance between exploitation and exploration for all machine learning applications. To address this problem, several enhanced metaheuristics have been proposed. For instance, Vieira et al. (2013) introduced local search and mutation into PSO to avoid premature convergence. In another investigation, Dif et al. (2017) introduced a refinement process into the MVO metaheuristic to improve exploitation. Other solutions hybridize evolutionary and swarm intelligence algorithms to take advantage of both exploration and exploitation and to create cooperation between these two concepts (Gandelli et al. 2007). However, there is no guarantee that a hybrid method outperforms its components, contrary to what the theoretical hypothesis suggests (Dif et al. 2018). To our knowledge, all previously proposed hybridizations have a static behavior, where the hybrid method proceeds in the same way for every treated task. Yet it is challenging to select components suitable for all optimization problems, because of the stochastic nature of metaheuristics.