Abstract
In this paper we show that several known algorithms for sequential prediction problems (including the quasi-additive family of Grove et al. and Littlestone and Warmuth’s Weighted Majority), for playing iterated games (including Freund and Schapire’s Hedge and MW, as well as the Λ-strategies of Hart and Mas-Colell), and for boosting (including AdaBoost) are special cases of a general decision strategy based on the notion of potential. By analyzing this strategy we derive known performance bounds, as well as new bounds, as simple corollaries of a single general theorem. Besides offering a new and unified view on a large family of algorithms, we establish a connection between potential-based analysis in learning and their counterparts independently developed in game theory. By exploiting this connection, we show that certain learning problems are instances of more general game-theoretic problems. In particular, we describe a notion of generalized regret and show its applications in learning theory.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
D. Blackwell. An analog of the minimax theorem for vector payoffs. Pacific Journal of Mathematics, 6:1–8, 1956.
H.D. Block. The Perceptron: a model for brain functioning. Review of Modern Physics, 34:123–135, 1962.
N. Cesa-Bianchi. Analysis of two gradient-based algorithms for on-line regression. Journal of Computer and System Sciences, 59(3):392–411, 1999.
N. Cesa-Bianchi, Y. Freund, D.P. Helmbold, D. Haussler, R. Schapire, and M.K. Warmuth. How to use expert advice. Journal of the ACM, 44(3):427–485, 1997.
T.M. Cover and J.A. Thomas. Elements of Information Theory. John Wiley, New York, 1991.
D. Foster and R. Vohra. Calibrated learning and correlated equilibrium. Games and Economic Behaviour, 21:40–55, 1997.
D. Foster and R. Vohra. Regret in the on-line decision problem. Games and Economic Behavior, 29:7–36, 1999.
Y. Freund and R. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1):119–139, 1997.
Y. Freund and R. Schapire. Large margin classification using the Perceptron algorithm. Machine Learning, 22:277–296, 1999.
Y. Freund and R. Schapire. Adaptive game playing using multiplicative weights. Games and Economic Behavior, 29:79, 1999.
Y. Freund, R. Schapire, Y. Singer, and M. Warmuth. Using and combining predictors that specialize. In Proceedings of the 29th Annual ACM Symposium on the Theory of Computing, page 334–343. ACM Press, 1997.
D. Fudenberg and D. Levine. Conditional universal consistency. Games and Economic Behaviour, 29:7–35, 1999.
C. Gentile The robustness of the p-norm algorithms. Manuscript, 2001. An extended abstract (co-authored with N. Littlestone) appeared in the Proceedings of the 12th Annual Conference on Computational Learning Theory, pages 1–11. ACM Press, 1999.
C. Gentile and M. Warmuth. Linear hinge loss and average margin. In Advances in Neural Information Processing Systems 11, pages 225–231, MIT Press, 1998.
A.J. Grove, N. Littlestone, and D. Schuurmans. General convergence results for linear discriminant updates. In Proceedings of the 10th Annual Conference on Computational Learning Theory, pages 171–183. ACM Press, 1997.
J. Hannan. Approximation to Bayes risk in repeated play. Contributions to the theory of games, 3:97–139, 1957.
D. Haussler, J. Kivinen, and M.K. Warmuth. Sequential prediction of individual sequences under general loss functions. IEEE Transactions on Information Theory, 44:1906–1925, 1998.
S. Hart and A. Mas-Colell. A general class of adaptive strategies. Journal of Economic Theory, to appear, 2000.
S. Hart and A. Mas-Colell. A simple adaptive procedure leading to correlated equilibrium. Econometrica, 68:1127–1150, 2000.
J. Kivinen and M.K. Warmuth. Averaging expert predictions. In Proceedings of the Fourth European Conference on Computational Learning Theory, pages 153–167. Lecture Notes in Artificial Intelligence 1572. Springer, Berlin, 1999.
E. Lehrer. A wide range no-regret theorem. Unpublished manuscript, 2000.
N. Littlestone and M.K. Warmuth. The weighted majority algorithm. Information and Computation, 108:212–261, 1994.
N. Littlestone. Mistake bounds and linear-threshold learning algorithms. PhD Thesis, University of California, Santa Cruz, 1989. Technical Report UCSC-CRL-89-11.
A.B.J. Novikov. On convergence proofs on Perceptrons. In Proceedings of the Symposium of the Mathematical Theory of Automata, volume XII, pages 615–622, 1962.
F. Rosenblatt. Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. Spartan Books, Washington, DC, 1962.
R. Schapire. Drifting games. Machine Learning, 2001. To appear.
E. Seneta. Non-negative Matrices and Markov Chains. Springer, New York, 1981.
V.G. Vovk. Aggregating strategies. In Proceedings of the Third Annual Workshop on Computational Learning Theory, pages 372–383. Association of Computing Machinery, New York, 1990.
V.G. Vovk. A game of prediction with expert advice. In Journal of Computer and System Sciences, 56(2):153–173, 1998.
V.G. Vovk Competitive on-line statistics. International Statistical Review, 2001. To appear.
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cesa-Bianchi, N., Lugosi, G. (2001). Potential-Based Algorithms in Online Prediction and Game Theory. In: Helmbold, D., Williamson, B. (eds) Computational Learning Theory. COLT 2001. Lecture Notes in Computer Science(), vol 2111. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44581-1_4
Download citation
DOI: https://doi.org/10.1007/3-540-44581-1_4
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42343-0
Online ISBN: 978-3-540-44581-4
eBook Packages: Springer Book Archive