Introduction
Data analysis and relevant feature extraction have become tedious tasks for data scientists and researchers because of the rapid creation and sharing of data. Data must be pre-processed well before a model can learn from it efficiently (Brezočnik, Fister, & Podgorelec, 2018). Inconsistent and irrelevant data can mislead a machine learning model. Feature selection is a pre-processing technique that removes redundant features and retains the relevant ones. The aim of a feature selection algorithm is to search for an optimal subset of features. A well-chosen subset can significantly improve classification accuracy and reduce the computational complexity of the learning model (Liu & Motoda, 1998). The search for an optimal subset is a challenging task and thus remains an active area of research (Mirjalili, 2018).
There are three main families of feature selection methods: filter, wrapper, and embedded (hybrid) methods. Filter methods evaluate the relevance of features using statistical measures. They require very little computation time because the selection of relevant features is independent of the machine learning algorithm (Dif, Belabbes, Elberrichi, & Belabbes, 2019). Wrapper methods, in contrast, use a classifier to measure the performance of different feature subsets and use this performance measure as the criterion for feature selection. Wrappers are computationally expensive, but they generally perform better than filter approaches (Kohavi & John, 1998). Embedded methods combine the characteristics of wrappers and filters: the feature selection algorithm is integrated into the learning algorithm itself.
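The difference between filter and wrapper selection can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and does not come from the cited works: the toy dataset, the use of absolute Pearson correlation as the statistical measure, the 1-nearest-neighbour classifier, and the function names `filter_rank`, `loo_accuracy`, and `wrapper_select`.

```python
from itertools import combinations
from statistics import mean

# Hypothetical toy dataset: rows are samples, columns are features f0..f2.
# f0 and f2 separate the two classes; f1 is unrelated to the label.
X = [
    [1.0, 5.0, 1.2],
    [1.2, 3.0, 0.9],
    [0.9, 4.0, 1.1],
    [3.0, 5.1, 2.8],
    [3.2, 2.9, 3.1],
    [2.9, 4.2, 3.0],
]
y = [0, 0, 0, 1, 1, 1]


def pearson_abs(xs, ys):
    """Absolute Pearson correlation between one feature column and the labels."""
    mx, my = mean(xs), mean(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = sum((a - mx) ** 2 for a in xs) ** 0.5
    sy = sum((b - my) ** 2 for b in ys) ** 0.5
    return abs(cov / (sx * sy)) if sx and sy else 0.0


def filter_rank(X, y):
    """Filter: score each feature on its own; no classifier is involved."""
    scores = [pearson_abs([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def loo_accuracy(X, y, subset):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on `subset`."""
    hits = 0
    for i in range(len(X)):
        nearest = min(
            (k for k in range(len(X)) if k != i),
            key=lambda k: sum((X[i][j] - X[k][j]) ** 2 for j in subset),
        )
        hits += y[nearest] == y[i]
    return hits / len(X)


def wrapper_select(X, y):
    """Wrapper: evaluate the classifier on every candidate subset."""
    n = len(X[0])
    candidates = [s for r in range(1, n + 1) for s in combinations(range(n), r)]
    # Prefer higher accuracy first, then fewer features.
    return max(candidates, key=lambda s: (loo_accuracy(X, y, s), -len(s)))


print("filter ranking (best first):", filter_rank(X, y))
print("wrapper choice:", wrapper_select(X, y))
```

The filter ranks features with one pass over the columns, while the wrapper calls the classifier once per candidate subset. That gap in cost is why wrappers are the more expensive option. The exhaustive enumeration in `wrapper_select` is only workable here because the toy dataset has three features.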
The sharp increase in the rate of data production has made it impractical to try every possible subset of features, since a dataset with n features has 2^n - 1 non-empty subsets. The search for an optimal subset of features is therefore categorized as an NP-hard problem. Researchers have found that stochastic metaheuristic approaches are well suited to the challenges of the feature selection problem. Swarm intelligence is a category of computational intelligence that has proved effective at high-complexity tasks such as finding the optimum subset of features (Chakraborty & Kar, 2017). Swarm intelligence algorithms are nature-inspired and mimic the social behavior of animals. Individuals belonging to the same group of animals that work together to achieve a common goal are called agents, and a self-organized group of such agents is called a swarm. Swarm systems are highly efficient because they possess collective intelligence. Computational models have been developed for swarm systems such as ant colonies, bee colonies, fish swarms, and bats. These models have been applied successfully to problems such as optimization, feature selection, image processing, business planning, and bioinformatics (Jović, Brkić, & Bogunović, 2015).
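To make the scale of the search space concrete, the short sketch below counts the candidate subsets an exhaustive search would have to evaluate. The function name `count_nonempty_subsets` and the feature counts are illustrative assumptions; the closed form 2^n - 1 is the standard count of non-empty subsets of n items.

```python
from itertools import combinations


def count_nonempty_subsets(n_features):
    """Count candidate feature subsets by enumeration (feasible only for small n)."""
    return sum(
        1
        for r in range(1, n_features + 1)
        for _ in combinations(range(n_features), r)
    )


# Enumeration agrees with the closed form 2**n - 1 for small n.
for n in (5, 10, 15):
    print(n, count_nonempty_subsets(n), 2 ** n - 1)

# For larger n only the closed form is practical to compute.
for n in (20, 50, 100):
    print(n, 2 ** n - 1)
```

Each additional feature doubles the number of candidates, which is why a stochastic search that samples the space is usually preferred over enumeration.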