Abstract
The PAC-learning model is distribution-independent in the sense that the learner must reach a learning goal with a limited number of labeled random examples without any prior knowledge of the underlying domain distribution. In order to achieve this, one needs generalization error bounds that are valid uniformly for every domain distribution. These bounds are (almost) tight in the sense that there is a domain distribution which does not admit a generalization error being significantly smaller than the general bound. Note however that this leaves open the possibility to achieve the learning goal faster if the underlying distribution is “simple”. Informally speaking, we say a PAC-learner L is “smart” if, for a “vast majority” of domain distributions D, L does not require significantly more examples to reach the “learning goal” than the best learner whose strategy is specialized to D. In this paper, focusing on sample complexity and ignoring computational issues, we show that smart learners do exist. This implies (at least from an information-theoretical perspective) that full prior knowledge of the domain distribution (or access to a huge collection of unlabeled examples) does (for a vast majority of domain distributions) not significantly reduce the number of labeled examples required to achieve the learning goal.
This work was supported by the Deutsche Forschungsgemeinschaft Grant SI 498/8-1.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Ben-David, S., Lu, T., Pál, D.: Does unlabeled data provably help? Worst-case analysis of the sample complexity of semi-supervised learning. In: Proceedings of the 21st Annual Conference on Learning Theory, pp. 33–44 (2008)
Benedek, G.M., Itai, A.: Learnability with respect to fixed distributions. Theoretical Computer Science 86(2), 377–389 (1991)
Blumer, A., Ehrenfeucht, A., Haussler, D., Warmuth, M.K.: Learnability and the Vapnik-Chervonenkis dimension. Journal of the Association on Computing Machinery 36(4), 929–965 (1989)
Ehrenfeucht, A., Haussler, D., Kearns, M., Valiant, L.: A general lower bound on the number of examples needed for learning. Information and Computation 82(3), 247–261 (1989)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Simon, H.U. (2009). Smart PAC-Learners. In: Gavaldà, R., Lugosi, G., Zeugmann, T., Zilles, S. (eds) Algorithmic Learning Theory. ALT 2009. Lecture Notes in Computer Science(), vol 5809. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04414-4_29
Download citation
DOI: https://doi.org/10.1007/978-3-642-04414-4_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04413-7
Online ISBN: 978-3-642-04414-4
eBook Packages: Computer ScienceComputer Science (R0)