Abstract
Normal support vector machine (SVM) is not suitable for classification of large data sets because of high training complexity. Convex hull can simplify the SVM training. However, the classification accuracy becomes lower when there exist inseparable points. This paper introduces a novel method for SVM classification, called convex–concave hull SVM (CCH-SVM). After grid processing, the convex hull is used to find extreme points. Then, we use Jarvis march method to determine the concave (non-convex) hull for the inseparable points. Finally, the vertices of the convex–concave hull are applied for SVM training. The proposed CCH-SVM classifier has distinctive advantages on dealing with large data sets. We apply the proposed method on several benchmark problems. Experimental results demonstrate that our approach has good classification accuracy while the training is significantly faster than other SVM classifiers. Compared with the other convex hull SVM methods, the classification accuracy is higher.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Bennett KP, Bredensteiner EJ (2000a) Geometry in learning. In: Gorini C (eds), Geometry at work. Mathematical Association of America, pp 132–145
Bennett K.P., Bredensteiner E.J. (2000b) Duality and geometry in SVM classifiers. 17th International Conference on Machine Learning, San Francisco
Berg M, Cheong O, Kreveld M, Overmars M (2008) Computational geometry: algorithms and applications. Springer, Berlin
Cervantes J, Li X, Yu W, Li K (2008) Support vector machine classification for large data sets via minimum enclosing ball clustering. Neurocomputing 71:611–619
Chang C-C, Lin C-J (2001) LIBSVM: a library for support vector machines. http://www.csie.ntu.edu.tw/~cjlin/libsvm
Collobert R, Bengio S (2001) SVMTorch: support vector machines for large regression problems. J Mach Learn Res 1:143–160
Collobert R, Sinz F, Weston J, Bottou L (2006) Trading convexity for scalability. 23rd international conference on machine learning, Pittsburgh, pp 201–208
Crisp DJ, Burges CJC (2000) A geometric interpretation of υ-SVM classifiers. NIPS 12:244–250
Cristianini N, Shawe-Taylor J (2000) An introduction to support vector machines and other kernel-based learning methods. Cambridge University Press, Cambridge
Eddy W (1977) A new convex hull algorithm for planar sets. ACM Trans Math Softw 3(4):398–403
Franc V, Hlavac V (2003) An iterative algorithm learning the maximal margin classifier. Pattern Recogn Lett 36:1985–1996
Gilbert EG (1966) An iterative procedure for computing the minimum of a quadratic form on a convex set. SIAM J Control Optim 4(1):61–79
Graham RL (1972) An efficient algorithm for dutennining the convex hull of a finite pianar set. Inf Process Lett 1:132–133
Guo G, Zhang J-S (2007) Reducing examples to accelerate support vector regression. Pattern Recognit Lett 28:2173–2183
Ho TK, Kleinberg EM (1996) Checkerboard dataset. http://www.cs.wisc.edu/
Hsu C-W, Chang C-C, Lin C-J (2010) A practical guide to support vector classification. Bioinform Biol Insights 1(1):1–16
Jarvis RA (1973) On the identification of the convex hull of a finite set of points in the plane. Inf Process Lett 2:18–21
Jolliffe IT (2002) Principal component analysis, 2nd edn. Springer, Berlin
Kallay M (1984) The complexity of incremental convex hull algorithms. Inf Process Lett 19(4):197–212
Keerthi SS, Gilbert EG (2002) Convergence of a generalized SMO algorithm for SVM classifier design. Mach Learn 46:351–360
Keerthi SS, Shevade SK, Bhattacharyya C, Murthy KRK (2001) A fast iterative nearest point algorithm for support vector machine classifier design. IEEE Trans Neural Netw 11(12):124–137
Li Y (2011) Selecting training points for one-class support vector machines. Pattern Recognit Lett 32:1517–1522
Mavroforakis ME, Theodoridis S (2006) A geometric approach to support vector machine (SVM) classification. IEEE Trans Neural Netw 17(3):671–682
Mitchell BF, Dem’yanov VF, Malozemov VN (1971) Finding the point of a polyhedron closest to the origin. Vestinik Leningrad Gos Univ 13:38–45
Moreira A, Santos MY (2007) Concave hull: a K-nearest neighbours approach for the computation of the region occupied by a set of points. GRAPP (GM/R), pp 61–68
Pizzuti C, Talia D (2003) P-auto class: scalable parallel clustering for mining large data sets. IEEE Trans Knowl Data Eng 15(3):629–641
Platt J. (1998) Fast training of support vector machine using sequential minimal optimization, advances in kernel methods: support vector machine. MIT Press, Cambridge
Preparata FP, Hong SJ (1977) convex hulls of finite sets of points in two and three dimensions. Commun ACM 20(2):87–93
Schlesinger MI, Kalmykov VG, Suchorukov AA, Sravnitelnyj (1981) Comparative analysis of algorithms synthesising linear decision rule for analysis of complex hypotheses. Automatika 1:3–9
Tsang IW, Kwok JT, Cheung PM (2005) Core vector machines: fast SVM training on very large data sets. J Mach Learn Res 6:363–392
Vapnik V (1995) The nature of statistical learning theory. Springer, New York
Xia C, Hsu W, Lee ML, Ooi BC (2006) BORDER: efficient computation of boundary points. IEEE Trans Knowl Data Eng 18(3):289–303
Yu W, Li X (2008) On-line fuzzy modeling via clustering and support vector machines. Inf Sci 178:4264–4279
Yu H, Yang J, Han J (2003) Classifying large data sets using SVMs with hierarchical clusters. Proceedings of the 9th ACM SIGKDD 2003 Washington, DC
Yuille AL, Rangarajan A (2003) The concave-convex procedure. Neural Comput Appl 15(4):915–936
Author information
Authors and Affiliations
Corresponding author
Additional information
Communicated by V. Loia.
Rights and permissions
About this article
Cite this article
Chau, A.L., Li, X. & Yu, W. Large data sets classification using convex–concave hull and support vector machine. Soft Comput 17, 793–804 (2013). https://doi.org/10.1007/s00500-012-0954-x
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00500-012-0954-x