Abstract
Telecom customer churn data are rarely made public because they involve users' personal privacy. In 2009, the French telecommunications company Orange provided a telecom customer churn data set, KDD Cup 09, for the Knowledge Discovery and Data Mining (KDD) competition. To address the high dimensionality of KDD Cup 09, a new feature reduction method is used to explore how different features influence the predictions of a classification model. This paper proposes a new K-local maximum margin feature extraction algorithm (KLMM). By studying diversified subspace partition rules, the corresponding potential field structure is constructed. According to the scalability of the data source across its dimensions, the intrinsic link between data attributes and classification results is revealed. The extracted features reduce the dimensionality of telecom churn prediction data. KLMM automatically selects the sigma factor to reflect the anisotropy of the features. A potential function is used to assess the attribute weights and identify the potentially important ones. Experiments and analysis show that the features extracted by KLMM make it more likely to find a classification hyperplane that separates data points of different classes.
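As a rough illustration of the local-margin idea only, the following Python sketch scores each feature by how much farther a sample lies from its k nearest other-class neighbours than from its k nearest same-class neighbours. It is a generic Relief-style heuristic, not the KLMM algorithm: the subspace partition rules, the potential field, and the sigma selection are not specified in this abstract and are not modelled here. The function name `local_margin_weights`, the L1 distance, and the default `k` are assumptions made for the example.

```python
import numpy as np

def local_margin_weights(X, y, k=3):
    """Per-feature average k-local margin (illustrative, Relief-style).

    Hypothetical helper, not taken from the paper. For every sample, the
    per-feature distance to its k nearest other-class neighbours minus the
    per-feature distance to its k nearest same-class neighbours is averaged.
    Larger values suggest a feature that separates the classes locally.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape
    weights = np.zeros(d)
    for i in range(n):
        diff = np.abs(X - X[i])        # per-feature distances, shape (n, d)
        dist = diff.sum(axis=1)        # L1 distance from sample i to all samples
        same = np.where(y == y[i])[0]
        same = same[same != i]         # exclude the sample itself
        other = np.where(y != y[i])[0]
        if len(same) == 0 or len(other) == 0:
            continue                   # no margin defined for this sample
        hits = same[np.argsort(dist[same])[:k]]
        misses = other[np.argsort(dist[other])[:k]]
        weights += diff[misses].mean(axis=0) - diff[hits].mean(axis=0)
    return weights / n

# Toy data: feature 0 separates the two classes, feature 1 does not.
X_toy = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1],
         [1.0, 0.8], [1.1, 0.2], [1.2, 0.5]]
y_toy = [0, 0, 0, 1, 1, 1]
w_toy = local_margin_weights(X_toy, y_toy, k=2)
```

On the toy data, the weight of the separating feature comes out larger than that of the noisy one, which is the kind of ranking a margin-based feature reduction step would use to keep or discard attributes.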
Acknowledgements
This work was supported by the National Natural Science Foundation of China (71271125, 61502260) and the Natural Science Foundation of Shandong Province, China (ZR2011FM028).
Cite this article
Zhao, L., Gao, Q., Dong, X. et al. K- local maximum margin feature extraction algorithm for churn prediction in telecom. Cluster Comput 20, 1401–1409 (2017). https://doi.org/10.1007/s10586-017-0843-2
DOI: https://doi.org/10.1007/s10586-017-0843-2