Abstract
Feature selection plays an important role in data preprocessing. Its aim is to identify and remove redundant or irrelevant features, and the key issue is to achieve the lowest classification error rate with as few features as possible. This paper formulates feature selection as a multi-objective problem. To address it, the paper uses the multi-objective bacterial foraging optimization algorithm to select feature subsets and the k-nearest neighbor algorithm to evaluate them. A roulette wheel mechanism is further introduced to remove duplicated features. Four information exchange mechanisms are integrated into the bacteria-inspired algorithm to keep individuals from becoming trapped in local optima, and thereby to achieve better results on high-dimensional feature selection problems. Comparative experiments on six small datasets and ten high-dimensional datasets, against several conventional wrapper methods and several evolutionary algorithms, demonstrate the superiority of the proposed bacteria-inspired feature selection method.
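The two-objective formulation described in the abstract (few features, low classification error) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `knn_loo_error`, `objectives`, `dominates`, and `pareto_front` are hypothetical, the toy data are invented for illustration, leave-one-out evaluation is an assumed protocol, and exhaustive enumeration of subsets stands in for the bacterial foraging search, which the abstract does not specify in detail.

```python
from itertools import product


def knn_loo_error(X, y, mask, k=1):
    """Leave-one-out error rate of k-NN using only the features where mask is 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 1.0  # assumption: an empty subset is scored as the worst case
    errors = 0
    for i, xi in enumerate(X):
        # Squared Euclidean distance to every other sample on the selected columns.
        dists = sorted(
            (sum((xi[j] - xo[j]) ** 2 for j in cols), y[o])
            for o, xo in enumerate(X)
            if o != i
        )
        votes = [label for _, label in dists[:k]]
        predicted = max(set(votes), key=votes.count)
        errors += predicted != y[i]
    return errors / len(X)


def objectives(X, y, mask):
    """Two objectives to minimise: (number of selected features, error rate)."""
    return (sum(mask), knn_loo_error(X, y, mask))


def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(p <= q for p, q in zip(a, b)) and any(p < q for p, q in zip(a, b))


def pareto_front(scored):
    """Keep the (mask, objectives) pairs that no other pair dominates."""
    return [(m, f) for m, f in scored if not any(dominates(g, f) for _, g in scored)]


# Invented toy data: feature 0 separates the classes, feature 1 is misleading noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.1], [1.1, 0.9], [1.2, 9.1]]
y = [0, 0, 0, 1, 1, 1]

# Exhaustive search over non-empty subsets; a stand-in for the optimizer.
masks = [m for m in product((0, 1), repeat=2) if any(m)]
scored = [(m, objectives(X, y, m)) for m in masks]
front = pareto_front(scored)
```

On this toy data the non-dominated set contains only the subset that keeps feature 0, since adding the noisy feature raises both the feature count and the error. A population-based optimizer would replace the enumeration step when the number of features makes exhaustive search infeasible.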
Acknowledgements
This work is partially supported by the National Natural Science Foundation of China (Grant Nos. 71571120, 71271140, 71471158, 71001072, and 61472257), the Natural Science Foundation of Guangdong Province (2016A030310074, 2018A030310575), the Shenzhen Science and Technology Plan (CXZZ20140418182638764), the Research Foundation of Shenzhen University (85303/00000155), and the Research Cultivation Project from Shenzhen Institute of Information Technology (ZY201717).
Cite this article
Niu, B., Yi, W., Tan, L. et al. A multi-objective feature selection method based on bacterial foraging optimization. Nat Comput 20, 63–76 (2021). https://doi.org/10.1007/s11047-019-09754-6