Metaheuristic Ensemble Pruning via Greedy-Based Optimization Selection

Mergani Ahmed Eltahir Khairalla
Copyright © 2022 | Pages: 22
DOI: 10.4018/IJAMC.292501

Abstract

Ensemble selection is a crucial problem in ensemble learning (EL): it speeds up the predictive model, reduces storage requirements, and can further improve prediction accuracy. Diversity among individual predictors is widely recognized as a key factor in successful ensemble selection (ES), while the ultimate goal of ES is to improve the predictive accuracy and generalization of the ensemble. Motivated by these problems, we devise a novel hybrid layered-based greedy ensemble reduction (HLGER) architecture that deletes the predictors with the lowest accuracy and diversity, using an evaluation function based on diversity metrics. Experimental investigations are conducted on benchmark time series data sets, with the support vector regression algorithm used as the base learner to generate a homogeneous ensemble; HLGER uses locally weighted ensemble (LWE) strategies to produce the final ensemble prediction. The experimental results demonstrate that, in comparison with benchmark ensemble pruning techniques, HLGER achieves significantly superior generalization performance.

1. Introduction

Ensemble learning (EL) has been successfully used as a desirable learning scheme for many regression and classification problems because of its potential to greatly increase predictive accuracy (Idris, Khan, & Lee, 2013; Nadig, Potter, Hoogenboom, & Mcclendon, 2013; P. Sun & Lee, 2014). An ensemble refers to a group of base learners whose decisions are aggregated with the goal of achieving better performance than any of its constituent members (Caruana, Munson, & Niculescu-Mizil, 2006). Typically, an EL algorithm consists of two main stages: first, the production of multiple base classifiers for a specific task, and second, the combination of these classifiers to obtain a final predictive decision (Banfield, Hall, Bowyer, & Kegelmeyer, 2005; Li, Yu, & Zhou, 2012; Tsoumakas, Partalas, & Vlahavas, 2009).
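As a rough illustration of these two stages, the sketch below builds a homogeneous ensemble of support vector regressors via bootstrap resampling and combines their predictions by plain averaging. This is a minimal sketch under assumptions, not the paper's exact setup: the paper also uses SVR base learners, but combines members with a locally weighted ensemble rather than simple averaging.

```python
# Minimal sketch of the two-stage EL pipeline: (1) produce base learners,
# (2) combine their decisions. Assumes scikit-learn's SVR as base learner
# and plain averaging as the combiner (an illustrative simplification).
import numpy as np
from sklearn.svm import SVR

def build_ensemble(X, y, n_members=20, seed=0):
    """Stage 1: produce multiple base learners via bootstrap resampling.

    X, y are NumPy arrays.
    """
    rng = np.random.default_rng(seed)
    ensemble = []
    for _ in range(n_members):
        idx = rng.integers(0, len(X), size=len(X))  # bootstrap sample
        ensemble.append(SVR().fit(X[idx], y[idx]))
    return ensemble

def combine(ensemble, X):
    """Stage 2: combine the members' decisions (here: plain averaging)."""
    preds = np.stack([m.predict(X) for m in ensemble])
    return preds.mean(axis=0)
```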

However, combining all of the classifiers in an ensemble obviously adds considerable computational overhead (Tsoumakas et al., 2009; Polikar, 2009). Both theoretical and empirical studies have shown that, instead of using the whole ensemble, a subset of it can achieve equivalent or even better generalization performance (Lu, Wu, Zhu, & Bongard, 2010; Y. Zhang, Burer, & Street, 2006). Therefore, an additional intermediate stage that selects an appropriate sub-ensemble prior to combination has to be considered (Dai, 2013a; Dai & Liu, 2013). This stage is generally termed ensemble pruning (EP) (Lu et al., 2010), selective ensemble (C. X. Zhang & Zhang, 2011), ensemble selection (Maskouni, Hosseini, Abachi, Kangavari, & Zhou, 2018), or ensemble thinning (Banfield et al., 2005).

The problem of pruning an ensemble of classifiers has been proven to be NP-complete (Tamon & Xiang, 2000). Finding the best sub-ensemble through exhaustive search is therefore infeasible for original ensembles of large or even moderate size (Partalas, Tsoumakas, & Vlahavas, 2008).
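To make the cost of exhaustive search concrete (a worked count, not a figure from the paper): an initial pool of N predictors admits 2^N − 1 non-empty sub-ensembles, since summing the binomial coefficients C(N, k) over k = 1, …, N gives 2^N − 1. Even a moderate pool of N = 30 members therefore already yields more than 10^9 candidate sub-ensembles to evaluate.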

EP methods can be further categorized into two types, namely static pruning (SP) and dynamic pruning (DP) (Soto, García-Moratilla, Martínez-Muñoz, Hernández-Lobato, & Suárez, 2017). SP methods select a fixed subset of predictors from the initial pool and use it for all test patterns, whereas DP methods select the predictors separately for each test pattern. Among SP methods, the most commonly used are search-based (Jiang, Liu, Fu, & Wu, 2017), clustering-based (Fan, Tao, Zhou, & Han, 2017), optimization-based (Lessmann, Caserta, & Arango, 2011), and ordered-aggregation methods (L. Guo & Boukir, 2013), along with methods that combine these categories or employ more elaborate pruning schemes (C. X. Zhang & Zhang, 2011).
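The sketch below illustrates the SP idea with a simplified member of the ordered-aggregation family: rank members by individual validation error and keep the best k for all future test patterns. This is an assumption-laden simplification (full ordered aggregation typically re-orders members by their incremental contribution to the growing sub-ensemble, not by individual error); the helper names are illustrative, not from the paper.

```python
# Simplified static pruning via ordered aggregation: rank members by
# individual validation MSE and keep the best k. A sketch, not the
# paper's method; real ordered aggregation orders by incremental
# contribution to the sub-ensemble.
import numpy as np

def ordered_aggregation(ensemble, X_val, y_val, k):
    """Keep the k members with the lowest validation MSE."""
    errors = [np.mean((m.predict(X_val) - y_val) ** 2) for m in ensemble]
    ranked = np.argsort(errors)  # best (lowest error) first
    return [ensemble[i] for i in ranked[:k]]
```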

Greedy algorithms (Hernández-Lobato, Martínez-Muñoz, & Suárez, 2011; Partalas, Tsoumakas, & Vlahavas, 2012; Martínez-Muñoz, Hernández-Lobato, & Suárez, 2008), also known as hill-climbing algorithms, reduce the search space considerably and therefore seem to be a good choice. Various greedy ensemble selection (GES) algorithms have been proposed (Banfield et al., 2005; Martínez-Muñoz & Suárez, 2004; Abdesslem, Marwa, & Maroua Bencheikh, 2013; Dai, 2013b; Gevezes & Pitsoulis, 2015; Pérez-Gállego, Castaño, Quevedo, & Coz, 2018; Partalas, Tsoumakas, & Vlahavas, 2010).
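As a concrete example of the GES family, the following sketch implements greedy forward selection: start from an empty sub-ensemble and, at each step, add the member whose inclusion most reduces the validation error of the averaged prediction, stopping when no addition helps. This is a minimal sketch of a common hill-climbing variant, not the paper's HLGER, which additionally uses a layered architecture and diversity-based evaluation.

```python
# Minimal sketch of greedy forward selection (hill climbing over
# sub-ensembles). Not the paper's HLGER: it greedily minimizes the
# validation MSE of the averaged sub-ensemble prediction only.
import numpy as np

def greedy_forward_selection(ensemble, X_val, y_val):
    preds = [m.predict(X_val) for m in ensemble]  # cache member outputs
    chosen, remaining = [], list(range(len(ensemble)))
    best_err = np.inf
    while remaining:
        # Error of the averaged sub-ensemble after adding each candidate.
        trial_errs = [
            np.mean((np.mean([preds[j] for j in chosen + [i]], axis=0)
                     - y_val) ** 2)
            for i in remaining
        ]
        i_best = int(np.argmin(trial_errs))
        if trial_errs[i_best] >= best_err:
            break  # hill climbing stalls: no candidate improves the error
        best_err = trial_errs[i_best]
        chosen.append(remaining.pop(i_best))
    return [ensemble[i] for i in chosen]
```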
