Hedging Under Uncertainty: Regret Minimization Meets Exponentially Fast Convergence

  • Conference paper
  • Algorithmic Game Theory (SAGT 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10504)

Abstract

This paper examines the problem of multi-agent learning in \(N\)-person non-cooperative games. For concreteness, we focus on the so-called “hedge” variant of the exponential weights (EW) algorithm, one of the most widely studied algorithmic schemes for regret minimization in online learning. In this multi-agent context, we show that (a) dominated strategies become extinct almost surely (a.s.); and (b) in generic games, pure Nash equilibria are attracting with high probability, even in the presence of uncertainty and noise of arbitrarily high variance. Moreover, if the algorithm’s step-size does not decay too fast, we show that these properties occur at a quasi-exponential rate – that is, much faster than the algorithm’s \(\mathcal{O}(1/\sqrt{T})\) worst-case regret guarantee would suggest.
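
As a concrete illustration of the scheme described above (a minimal sketch, not the paper’s own pseudocode), the following Python snippet runs the hedge / exponential weights update for two players in a bimatrix game with noisy payoff observations. The function name, the Gaussian noise model, and the step-size schedule \(\gamma_{t} = \gamma_{0}/t^{a}\) are assumptions made for this example.

    import numpy as np

    def hedge_2player(A, B, T=5000, gamma0=1.0, a=0.5, noise=1.0, seed=0):
        """Two players running the hedge (EW) update on a bimatrix game.

        A[i, j], B[i, j]: payoffs of players 1 and 2 when they play actions i, j.
        Each player observes a noisy payoff vector over its own actions
        (zero-mean Gaussian noise), scores it with step-size gamma_t = gamma0 / t**a,
        and mixes via the logit choice map.
        """
        rng = np.random.default_rng(seed)
        y1, y2 = np.zeros(A.shape[0]), np.zeros(B.shape[1])   # cumulative scores
        for t in range(1, T + 1):
            # logit choice map: x_i(t) = Lambda_i(y_i(t-1))
            x1 = np.exp(y1 - y1.max()); x1 /= x1.sum()
            x2 = np.exp(y2 - y2.max()); x2 /= x2.sum()
            i = rng.choice(len(x1), p=x1)   # actions sampled from x(t) at stage t
            j = rng.choice(len(x2), p=x2)
            v1 = A[:, j] + noise * rng.standard_normal(len(y1))   # noisy payoff vectors
            v2 = B[i, :] + noise * rng.standard_normal(len(y2))
            gamma = gamma0 / t**a           # slowly decaying step-size
            y1 += gamma * v1                # exponential weights score update
            y2 += gamma * v2
        return x1, x2

    # Toy example: each player's first action strictly dominates, so the returned
    # mixed strategies should concentrate on the dominant (pure Nash) profile,
    # illustrating the extinction of dominated strategies described above.
    A = np.array([[3.0, 2.0], [1.0, 0.0]])
    B = np.array([[3.0, 1.0], [2.0, 0.0]])
    print(hedge_2player(A, B))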

Notes

  1.

    As the second name suggests, this set contains the game’s set of correlated equilibria (CE) – and hence, the game’s set of Nash equilibria as well.
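
    Assuming the set referred to here is the game’s set of coarse correlated equilibria (CCE), and writing \(u_{i}\) for player \(i\)’s payoff function (a standard definition stated for reference, not quoted from the paper): a distribution \(\pi\) over action profiles is a CCE if, for every player \(i\) and every deviation \(\alpha_{i}'\),
    \[
    \sum_{\alpha} \pi(\alpha)\, u_{i}(\alpha) \;\geq\; \sum_{\alpha} \pi(\alpha)\, u_{i}(\alpha_{i}';\alpha_{-i}).
    \]
    A correlated equilibrium imposes the analogous inequality conditionally on each recommended action \(\alpha_{i}\); averaging those conditional inequalities over \(\alpha_{i}\) recovers the display above, which gives the stated inclusion CE \(\subseteq\) CCE (and hence NE \(\subseteq\) CCE).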

  2.

    In the above \((\alpha _{i};\alpha _{-i})\) is shorthand for \((\alpha _{1},\ldots ,\alpha _{i},\ldots ,\alpha _{N})\), used here to highlight the action of player \(i\) against that of all other players.

  3.

    Note here that (10a) is phrased in terms of the players’ mixed strategy profile \(x(t)\), not the action profile \(\alpha(t) = (\alpha_{i}(t);\alpha_{-i}(t))\), which is chosen based on \(x(t)\) at stage \(t\). To see that (H1) indeed implies (10a), simply recall that \(x_{i}(t) = \Lambda_{i}(y_{i}(t-1))\), so:

    [display equation omitted; it appears only as an image in the source]
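
    Assuming \(\Lambda_{i}\) denotes the standard logit choice map of the exponential weights scheme (an assumption made here to fill in the omitted display), the relation presumably reads
    \[
    x_{i\alpha}(t) \;=\; \Lambda_{i\alpha}(y_{i}(t-1)) \;=\; \frac{\exp(y_{i\alpha}(t-1))}{\sum_{\beta} \exp(y_{i\beta}(t-1))},
    \]
    so that \(x(t)\) is a deterministic function of the score vector \(y(t-1)\) and is therefore measurable with respect to the history of play up to stage \(t-1\); conditioning on that history thus pins down \(x(t)\) as well.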

  4.

    To the best of our knowledge, the closest result is the elimination of dominated strategies under exponential learning in continuous time [20]; for a survey of the relevant literature, see [25].

  5.

    Recall also the counterexample of [26] discussed in the introduction.

  6.

    This can be shown by an induction argument on the rounds of elimination of dominated strategies as in [18].

References

  1. Arora, S., Hazan, E., Kale, S.: The multiplicative weights update method: a meta-algorithm and applications. Theory Comput. 8(1), 121–164 (2012)

  2. Blum, A., Hajiaghayi, M.T., Ligett, K., Roth, A.: Regret minimization and the price of total anarchy. In: STOC 2008: Proceedings of the 40th Annual ACM Symposium on the Theory of Computing, pp. 373–382. ACM (2008)

  3. Blum, A., Mansour, Y.: Learning, regret minimization, and equilibria (Chap. 4). In: Nisan, N., Roughgarden, T., Tardos, E., Vazirani, V.V. (eds.) Algorithmic Game Theory. Cambridge University Press, Cambridge (2007)

  4. Bubeck, S., Cesa-Bianchi, N.: Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Found. Trends Mach. Learn. 5(1), 1–122 (2012)

  5. Cesa-Bianchi, N., Lugosi, G.: Prediction, Learning, and Games. Cambridge University Press, Cambridge (2006)

  6. Coucheney, P., Gaujal, B., Mertikopoulos, P.: Penalty-regulated dynamics and robust learning procedures in games. Math. Oper. Res. 40(3), 611–633 (2015)

  7. Foster, D., Vohra, R.V.: Calibrated learning and correlated equilibrium. Games Econ. Behav. 21(1), 40–55 (1997)

  8. Foster, D.J., Lykouris, T., Sridharan, K., Tardos, E.: Learning in games: robustness of fast convergence. In: Advances in Neural Information Processing Systems, pp. 4727–4735 (2016)

  9. Freund, Y., Schapire, R.E.: Adaptive game playing using multiplicative weights. Games Econ. Behav. 29, 79–103 (1999)

  10. Goldberg, P.W., Roth, A.: Bounds for the query complexity of approximate equilibria. ACM Trans. Econ. Comput. 4(4), 24:1–24:25 (2016)

  11. Hall, P., Heyde, C.C.: Martingale Limit Theory and Its Application. Probability and Mathematical Statistics. Academic Press, New York (1980)

  12. Hannan, J.: Approximation to Bayes risk in repeated play. In: Dresher, M., Tucker, A.W., Wolfe, P. (eds.) Contributions to the Theory of Games. Annals of Mathematics Studies, vol. 39, pp. 97–139. Princeton University Press, Princeton (1957)

  13. Hart, S., Mas-Colell, A.: A simple adaptive procedure leading to correlated equilibrium. Econometrica 68(5), 1127–1150 (2000)

  14. Kalai, A., Vempala, S.: Efficient algorithms for online decision problems. J. Comput. Syst. Sci. 71(3), 291–307 (2005)

  15. Kleinberg, R., Piliouras, G., Tardos, É.: Load balancing without regret in the bulletin board model. Distrib. Comput. 24(1), 21–29 (2011)

  16. Krichene, W., Drighès, B., Bayen, A.M.: Learning Nash equilibria in congestion games. arXiv preprint arXiv:1408.0017 (2014)

  17. Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Stat. 22(1), 79–86 (1951)

  18. Laraki, R., Mertikopoulos, P.: Higher order game dynamics. J. Econ. Theory 148(6), 2666–2695 (2013)

  19. Littlestone, N., Warmuth, M.K.: The weighted majority algorithm. Inf. Comput. 108(2), 212–261 (1994)

  20. Mertikopoulos, P., Moustakas, A.L.: The emergence of rational behavior in the presence of stochastic perturbations. Ann. Appl. Probab. 20(4), 1359–1388 (2010)

  21. Mertikopoulos, P., Sandholm, W.H.: Learning in games via reinforcement and regularization. Math. Oper. Res. 41(4), 1297–1324 (2016)

  22. Roughgarden, T.: Intrinsic robustness of the price of anarchy. J. ACM (JACM) 62(5), 32 (2015)

  23. Sandholm, W.H.: Population Games and Evolutionary Dynamics. Economic Learning and Social Evolution. MIT Press, Cambridge (2010)

  24. Syrgkanis, V., Agarwal, A., Luo, H., Schapire, R.E.: Fast convergence of regularized learning in games. In: Advances in Neural Information Processing Systems, pp. 2989–2997 (2015)

  25. Viossat, Y.: Evolutionary dynamics and dominated strategies. Econ. Theory Bull. 3(1), 91–113 (2015)

  26. Viossat, Y., Zapechelnyuk, A.: No-regret dynamics and fictitious play. J. Econ. Theory 148(2), 825–842 (2013)

  27. Vovk, V.G.: Aggregating strategies. In: COLT 1990: Proceedings of the 3rd Workshop on Computational Learning Theory, pp. 371–383 (1990)

  28. Weibull, J.W.: Evolutionary Game Theory. MIT Press, Cambridge (1995)

Acknowledgment

This work was partially supported by the French National Research Agency (ANR) under grant no. ANR–16–CE33–0004–01 (ORACLESS) and the Huawei HIRP FLAGSHIP project ULTRON.

Author information

Correspondence to Panayotis Mertikopoulos.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Cohen, J., Héliou, A., Mertikopoulos, P. (2017). Hedging Under Uncertainty: Regret Minimization Meets Exponentially Fast Convergence. In: Bilò, V., Flammini, M. (eds.) Algorithmic Game Theory. SAGT 2017. Lecture Notes in Computer Science, vol. 10504. Springer, Cham. https://doi.org/10.1007/978-3-319-66700-3_20

  • DOI: https://doi.org/10.1007/978-3-319-66700-3_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-66699-0

  • Online ISBN: 978-3-319-66700-3
