Abstract
We describe and explore a new perspective on the sample complexity of active learning. In many situations where it was generally believed that active learning does not help, we show that active learning does help in the limit, often with exponential improvements in sample complexity. This contrasts with the traditional analysis of active learning problems such as non-homogeneous linear separators or depth-limited decision trees, in which Ω(1/ε) lower bounds are common. Such lower bounds should be interpreted carefully; indeed, we prove that it is always possible to learn an ε-good classifier with a number of samples asymptotically smaller than this. These new insights arise from a subtle variation on the traditional definition of sample complexity, not previously recognized in the active learning literature.
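To give a concrete sense of the exponential gap the abstract refers to, consider the textbook example of learning one-dimensional threshold classifiers: a passive learner needs on the order of 1/ε labeled examples to reach error ε, while an active learner can binary-search a pool of unlabeled points and ask for only O(log(1/ε)) labels. The Python sketch below is purely illustrative and is not an algorithm from the paper; the threshold value, pool size, and function names are arbitrary choices made for the example.

```python
import random

# Illustrative toy: active learning of a 1-D threshold classifier.
# Labels along the sorted unlabeled pool are monotone (all 0s, then all 1s),
# so binary search locates the decision boundary with O(log(1/epsilon))
# label queries, versus the ~1/epsilon labels a passive learner would use.

def true_label(x, threshold=0.37):
    # Hypothetical target concept: h*(x) = 1 iff x >= threshold.
    return 1 if x >= threshold else 0

def active_learn_threshold(epsilon, oracle=true_label, seed=0):
    rng = random.Random(seed)
    # Unlabeled pool of ~2/epsilon points, cheap to obtain; only queried labels count.
    pool = sorted(rng.random() for _ in range(int(2 / epsilon)))
    lo, hi = 0, len(pool) - 1
    queries = 0
    # Binary search for the leftmost pool point labeled 1.
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if oracle(pool[mid]) == 1:
            hi = mid
        else:
            lo = mid + 1
    return pool[lo], queries

if __name__ == "__main__":
    for eps in (0.1, 0.01, 0.001):
        t_hat, q = active_learn_threshold(eps)
        print(f"epsilon={eps}: threshold estimate {t_hat:.4f} after {q} label queries")
```

Running it for ε = 0.1, 0.01, 0.001 shows the number of label queries growing only logarithmically, while the pool size (a proxy for the labels a passive learner would need) grows linearly in 1/ε.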
References
Antos, A., & Lugosi, G. (1998). Strong minimax lower bounds for learning. Machine Learning, 30, 31–56.
Balcan, M.-F., & Blum, A. (2006). A PAC-style model for learning from labeled and unlabeled data. In O. Chapelle, B. Schölkopf, & A. Zien (Eds.), Semi-supervised learning. Cambridge: MIT Press.
Balcan, M.-F., Beygelzimer, A., & Langford, J. (2006). Agnostic active learning. In Proceedings of the 23rd international conference on machine learning.
Balcan, M.-F., Broder, A., & Zhang, T. (2007). Margin based active learning. In Proceedings of the 20th annual conference on learning theory.
Balcan, M.-F., Hanneke, S., & Wortman, J. (2008). The true sample complexity of active learning. In Proceedings of the 21st annual conference on learning theory.
Blumer, A., Ehrenfeucht, A., Haussler, D., & Warmuth, M. K. (1989). Learnability and the Vapnik-Chervonenkis dimension. Journal of the ACM, 36(4), 929–965.
Castro, R., & Nowak, R. (2007). Minimax bounds for active learning. In Proceedings of the 20th annual conference on learning theory.
Cohn, D., Atlas, L., & Ladner, R. (1994). Improving generalization with active learning. Machine Learning, 15(2), 201–221.
Dasgupta, S. (2004). Analysis of a greedy active learning strategy. In Advances in neural information processing systems.
Dasgupta, S. (2005). Coarse sample complexity bounds for active learning. In Advances in neural information processing systems.
Dasgupta, S., Kalai, A., & Monteleoni, C. (2005). Analysis of perceptron-based active learning. In Proceedings of the 18th annual conference on learning theory.
Dasgupta, S., Hsu, D., & Monteleoni, C. (2007). A general agnostic active learning algorithm. In Advances in neural information processing systems.
Devroye, L., Györfi, L., & Lugosi, G. (1996). A probabilistic theory of pattern recognition. Berlin: Springer.
Freund, Y., Seung, H. S., Shamir, E., & Tishby, N. (1997). Selective sampling using the query by committee algorithm. Machine Learning, 28(2–3), 133–168.
Hanneke, S. (2007a). A bound on the label complexity of agnostic active learning. In Proceedings of the 24th international conference on machine learning.
Hanneke, S. (2007b). Teaching dimension and the complexity of active learning. In Proceedings of the 20th conference on learning theory.
Hanneke, S. (2009). Theoretical foundations of active learning. PhD thesis, Machine Learning Department, Carnegie Mellon University.
Haussler, D., Littlestone, N., & Warmuth, M. (1994). Predicting {0,1}-functions on randomly drawn points. Information and Computation, 115, 248–292.
Shawe-Taylor, J., Bartlett, P. L., Williamson, R. C., & Anthony, M. (1998). Structural risk minimization over data-dependent hierarchies. IEEE Transactions on Information Theory, 44(5), 1926–1940.
Vapnik, V. (1982). Estimation of dependencies based on empirical data. New York: Springer.
Vapnik, V. N. (1998). Statistical learning theory. New York: Wiley.
Additional information
Editors: Sham Kakade and Ping Li.
A preliminary version of this work appeared in the Proceedings of the 21st Conference on Learning Theory, 2008 (Balcan et al. 2008).
Most of this research was done while J.W. Vaughan was at the University of Pennsylvania.
Cite this article
Balcan, M.-F., Hanneke, S., & Vaughan, J. W. (2010). The true sample complexity of active learning. Machine Learning, 80, 111–139. https://doi.org/10.1007/s10994-010-5174-y