Abstract
In this paper we show that several known algorithms for sequential prediction problems (including Weighted Majority and the quasi-additive family of Grove, Littlestone, and Schuurmans), for playing iterated games (including Freund and Schapire's Hedge and MW, as well as the Λ-strategies of Hart and Mas-Colell), and for boosting (including AdaBoost) are special cases of a general decision strategy based on the notion of potential. By analyzing this strategy we derive known performance bounds, as well as new bounds, as simple corollaries of a single general theorem. Besides offering a new and unified view of a large family of algorithms, we establish a connection between potential-based analyses in learning and their counterparts independently developed in game theory. By exploiting this connection, we show that certain learning problems are instances of more general game-theoretic problems. In particular, we describe a notion of generalized regret and show its applications in learning theory.
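As a concrete illustration of the potential-based framework, the sketch below implements the exponentially weighted forecaster (Hedge) as the special case obtained from the exponential potential Φ(R) = (1/η) ln Σ_i exp(η R_i) applied to the cumulative regret vector R: the weight given to expert i is the i-th component of the (normalized) gradient of Φ. This is a minimal sketch, not the paper's code; the class name ExpPotentialForecaster, the learning-rate parameter eta, and the update interface are illustrative choices.

```python
import numpy as np

# Minimal sketch of a potential-based forecaster with the exponential
# potential Phi(R) = (1/eta) * log(sum_i exp(eta * R_i)), where R_i is the
# cumulative regret against expert i.  The weight of expert i is the i-th
# component of grad Phi(R), normalized to a distribution; since the learner's
# own cumulative loss cancels in the normalization, this reduces to the
# familiar Hedge / exponentially weighted average forecaster.

class ExpPotentialForecaster:
    def __init__(self, n_experts: int, eta: float = 0.5):
        self.eta = eta
        self.regret = np.zeros(n_experts)  # cumulative regret vector R

    def weights(self) -> np.ndarray:
        # Normalized gradient of the exponential potential.
        # Subtracting the max is a standard numerical-stability trick.
        g = np.exp(self.eta * (self.regret - self.regret.max()))
        return g / g.sum()

    def update(self, expert_losses: np.ndarray) -> None:
        # The learner suffers the weighted-average loss; component i of the
        # regret vector grows by (learner's loss - loss of expert i).
        p = self.weights()
        learner_loss = float(p @ expert_losses)
        self.regret += learner_loss - expert_losses


# Tiny usage example with two experts and [0, 1]-valued losses.
if __name__ == "__main__":
    f = ExpPotentialForecaster(n_experts=2, eta=1.0)
    for losses in ([0.0, 1.0], [0.0, 1.0], [1.0, 0.0]):
        f.update(np.array(losses))
    print(f.weights())  # more mass on the expert with smaller cumulative loss
```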
References
Auer, P., Cesa-Bianchi, N., & Gentile, C. (2002). Adaptive and self-confident on-line learning algorithms. Journal of Computer and System Sciences, 64:1, 48–75.
Blackwell, D. (1956). An analog of the minimax theorem for vector payoffs. Pacific Journal of Mathematics, 6, 1–8.
Block, H. (1962). The perceptron: A model for brain functioning. Reviews of Modern Physics, 34, 123–135.
Bregman, L. (1967). The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Computational Mathematics and Mathematical Physics, 7, 200–217.
Cesa-Bianchi, N. (1999). Analysis of two gradient-based algorithms for on-line regression. Journal of Computer and System Sciences, 59:3, 392–411.
Cesa-Bianchi, N., Freund, Y., Helmbold, D., Haussler, D., Schapire, R., & Warmuth, M. (1997). How to use expert advice. Journal of the ACM, 44:3, 427–485.
Cesa-Bianchi, N., Conconi, A., & Gentile, C. (2002). A second-order perceptron algorithm. In Proceedings of the 15th Annual Conference on Computational Learning Theory. Lecture Notes in Artificial Intelligence, Springer, 2002.
Cohen, W., & Singer, Y. (1999). Context-sensitive learning methods for text categorization. ACM Transactions on Information Systems, 17:2, 141–173.
Cover, T., & Thomas, J. (1991). Elements of Information Theory. John Wiley and Sons.
Foster, D., & Vohra, R. (1997). Calibrated learning and correlated equilibrium. Games and Economic Behavior, 21, 40–55.
Foster, D., & Vohra, R. (1999). Regret in the on-line decision problem. Games and Economic Behavior, 29:1/2, 7–35.
Freund, Y., & Schapire, R. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55:1, 119–139.
Freund, Y., & Schapire, R. (1999a). Adaptive game playing using multiplicative weights. Games and Economic Behavior, 29, 79–103.
Freund, Y., & Schapire, R. (1999b). Large margin classification using the perceptron algorithm. Machine Learning, 37:3, 277–296.
Freund, Y., Schapire, R., Singer, Y., & Warmuth, M. (1997). Using and combining predictors that specialize. In Proceedings of the 29th Annual ACM Symposium on the Theory of Computing (pp. 334–343). ACM Press.
Fudenberg, D., & Levine, D. (1995). Universal consistency and cautious fictitious play. Journal of Economic Dynamics and Control, 19, 1065–1090.
Fudenberg, D., & Levine, D. (1999). Conditional universal consistency. Games and Economic Behavior, 29, 7–35.
Gentile, C. (2001). The robustness of the p-norm algorithms. An extended abstract (co-authored with N. Littlestone) appeared in the Proceedings of the 12th Annual Conference on Computational Learning Theory (pp. 1–11). ACM Press, 1999.
Gentile, C., & Warmuth, M. (1999). Linear hinge loss and average margin. In Advances in Neural Information Processing Systems 10. MIT Press.
Grove, A., Littlestone, N., & Schuurmans, D. (1997). General convergence results for linear discriminant updates. In Proceedings of the 10th Annual Conference on Computational Learning Theory (pp. 171–183). ACM Press.
Grove, A., Littlestone, N., & Schuurmans, D. (2001). General convergence results for linear discriminant updates. Machine Learning, 43:3, 173–210.
Hannan, J. (1957). Approximation to Bayes risk in repeated play. Contributions to the Theory of Games, 3, 97–139.
Hart, S., & Mas-Colell, A. (2000). A simple adaptive procedure leading to correlated equilibrium. Econometrica, 68, 1127–1150.
Hart, S., & Mas-Colell, A. (2001). A general class of adaptive strategies. Journal of Economic Theory, 98:1, 26–54.
Haussler, D., Kivinen, J., & Warmuth, M. (1998). Sequential prediction of individual sequences under general loss functions. IEEE Transactions on Information Theory, 44, 1906–1925.
Kivinen, J., & Warmuth, M. (1999). Averaging expert predictions. In Proceedings of the Fourth European Conference on Computational Learning Theory (pp. 153–167). Lecture Notes in Artificial Intelligence, vol. 1572. Springer.
Kivinen, J., & Warmuth, M. (2001). Relative loss bounds for multidimensional regression problems. Machine Learning, 45:3, 301–329.
Lehrer, E. (2001). A wide range no-regret theorem. Games and Economic Behavior, in press.
Littlestone, N. (1989). Mistake bounds and logarithmic linear-threshold learning algorithms. Ph.D. Thesis, University of California at Santa Cruz.
Littlestone, N., & Warmuth, M. (1994). The weighted majority algorithm. Information and Computation, 108, 212–261.
Novikoff, A. (1962). On convergence proofs of perceptrons. In Proceedings of the Symposium on the Mathematical Theory of Automata, vol. XII. pp. 615–622.
Rosenblatt, F. (1962). Principles of Neurodynamics. Spartan Books.
Schapire, R. (2001). Drifting games. Machine Learning, 43:5, 265–291.
Seneta, E. (1981). Non-Negative Matrices and Markov Chains. Springer.
Vovk, V. (1990). Aggregating strategies. In Proceedings of the 3rd Annual Workshop on Computational Learning Theory (pp. 372–383).
Vovk, V. (1998). A game of prediction with expert advice. Journal of Computer and System Sciences, 56:2, 153–173.
Vovk, V. (2001). Competitive on-line statistics. International Statistical Review, 69, 213–248.
Warmuth, M., & Jagota, A. (1997). Continuous and discrete-time nonlinear gradient descent: Relative loss bounds and convergence. In Electronic Proceedings of the 5th International Symposium on Artificial Intelligence and Mathematics.
Cite this article
Cesa-Bianchi, N., Lugosi, G. Potential-Based Algorithms in On-Line Prediction and Game Theory. Machine Learning 51, 239–261 (2003). https://doi.org/10.1023/A:1022901500417