Abstract
This paper examines the problem of multi-agent learning in \(N\)-person non-cooperative games. For concreteness, we focus on the so-called “hedge” variant of the (EW) algorithm, one of the most widely studied algorithmic schemes for regret minimization in online learning. In this multi-agent context, we show that (a) dominated strategies become extinct (a.s.); and (b) in generic games, pure Nash equilibria are attracting with high probability, even in the presence of uncertainty and noise of arbitrarily high variance. Moreover, if the algorithm’s step-size does not decay too fast, we show that these properties occur at a quasi-exponential rate – that is, much faster than the algorithm’s \({{\mathrm{\mathcal O}}}(1/\sqrt{T})\) worst-case regret guarantee would suggest.
Similar content being viewed by others
Notes
- 1.
As the second name suggests, this set contains the game’s set of correlated equilibria (CE) – and hence, the game’s set of Nash equilibria as well.
- 2.
In the above \((\alpha _{i};\alpha _{-i})\) is shorthand for \((\alpha _{1},\ldots ,\alpha _{i},\ldots ,\alpha _{N})\), used here to highlight the action of player \(i\) against that of all other players.
- 3.
Note here that (10a) is phrased in terms of the players’ mixed strategy profile \(x(t)\), not the action profile \(\alpha (t) = (\alpha _{i}(t);\alpha _{-i}(t))\) which is chosen based on \(x(t)\) at stage \(t\). To see that (H1) indeed implies (10a), simply recall that \(x_{i}(t) = {{\mathrm{\Lambda }}}_{i}(y_{i}(t-1))\) so
.
- 4.
- 5.
Recall also the counterexample of [26] discussed in the introduction.
- 6.
This can be shown by an induction argument on the rounds of elimination of dominated strategies as in [18].
References
Arora, S., Hazan, E., Kale, S.: The multiplicative weights update method: a meta-algorithm and applications. Theory Comput. 8(1), 121–164 (2012)
Blum, A., Hajiaghayi, M.T., Ligett, K., Roth, A.: Regret minimization and the price of total anarchy. In: STOC 2008: Proceedings of the 40th Annual ACM Symposium on the Theory of Computing, pp. 373–382. ACM (2008)
Blum, A., Mansour, Y.: Learning, regret minimization, and equilibria (Chap. 4). In: Nisan, N., Roughgarden, T., Tardos, E., Vazirani, V.V. (eds.) Algorithmic Game Theory. Cambridge University Press, Cambridge (2007)
Bubeck, S., Cesa-Bianchi, N.: Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Found. Trends Mach. Learn. 5(1), 1–122 (2012)
Cesa-Bianchi, N., Lugosi, G.: Prediction, Learning, and Games. Cambridge University Press, Cambridge (2006)
Coucheney, P., Gaujal, B., Mertikopoulos, P.: Penalty-regulated dynamics and robust learning procedures in games. Math. Oper. Res. 40(3), 611–633 (2015)
Foster, D., Vohra, R.V.: Calibrated learning and correlated equilibrium. Games Econ. Behav. 21(1), 40–55 (1997)
Foster, D.J., Lykouris, T., Sridharan, K., Tardos, E.: Learning in games: robustness of fast convergence. In: Advances in Neural Information Processing Systems, pp. 4727–4735 (2016)
Freund, Y., Schapire, R.E.: Adaptive game playing using multiplicative weights. Games Econ. Behav. 29, 79–103 (1999)
Goldberg, P.W., Roth, A.: Bounds for the query complexity of approximate equilibria. ACM Trans. Econ. Comput. 4(4), 24:1–24:25 (2016)
Hall, P., Heyde, C.C.: Martingale Limit Theory and Its Application. Probability and Mathematical Statistics. Academic Press, New York (1980)
Hannan, J.: Approximation to Bayes risk in repeated play. In: Dresher, M., Tucker, A.W., Wolfe, P. (eds.) Contributions to the Theory of Games. Annals of Mathematics Studies, vol. 39, pp. 97–139. Princeton University Press, Princeton (1957)
Hart, S., Mas-Colell, A.: A simple adaptive procedure leading to correlated equilibrium. Econometrica 68(5), 1127–1150 (2000)
Kalai, A., Vempala, S.: Efficient algorithms for online decision problems. J. Comput. Syst. Sci. 71(3), 291–307 (2005)
Kleinberg, R., Piliouras, G., Tardos, É.: Load balancing without regret in the bulletin board model. Distrib. Comput. 24(1), 21–29 (2011)
Krichene, W., Drighès, B., Bayen, A.M.: Learning Nash equilibria in congestion games. arXiv preprint arXiv:1408.0017 (2014)
Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Stat. 22(1), 79–86 (1951)
Laraki, R., Mertikopoulos, P.: Higher order game dynamics. J. Econ. Theory 148(6), 2666–2695 (2013)
Littlestone, N., Warmuth, M.K.: The weighted majority algorithm. Inf. Comput. 108(2), 212–261 (1994)
Mertikopoulos, P., Moustakas, A.L.: The emergence of rational behavior in the presence of stochastic perturbations. Ann. Appl. Probab. 20(4), 1359–1388 (2010)
Mertikopoulos, P., Sandholm, W.H.: Learning in games via reinforcement and regularization. Math. Oper. Res. 41(4), 1297–1324 (2016)
Roughgarden, T.: Intrinsic robustness of the price of anarchy. J. ACM (JACM) 62(5), 32 (2015)
Sandholm, W.H.: Population Games and Evolutionary Dynamics. Economic Learning and Social Evolution. MIT Press, Cambridge (2010)
Syrgkanis, V., Agarwal, A., Luo, H., Schapire, R.E.: Fast convergence of regularized learning in games. In: Advances in Neural Information Processing Systems, pp. 2989–2997 (2015)
Viossat, Y.: Evolutionary dynamics and dominated strategies. Econ. Theory Bull. 3(1), 91–113 (2015)
Viossat, Y., Zapechelnyuk, A.: No-regret dynamics and fictitious play. J. Econ. Theory 148(2), 825–842 (2013)
Vovk, V.G.: Aggregating strategies. In: COLT 1990: Proceedings of the 3rd Workshop on Computational Learning Theory, pp. 371–383 (1990)
Weibull, J.W.: Evolutionary Game Theory. MIT Press, Cambridge (1995)
Acknowledgment
This work was partially supported from the French National Research Agency (ANR) under grant no. ANR–16–CE33–0004–01 (ORACLESS) and the Huawei HIRP FLAGSHIP project ULTRON.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Cohen, J., Héliou, A., Mertikopoulos, P. (2017). Hedging Under Uncertainty: Regret Minimization Meets Exponentially Fast Convergence. In: Bilò, V., Flammini, M. (eds) Algorithmic Game Theory. SAGT 2017. Lecture Notes in Computer Science(), vol 10504. Springer, Cham. https://doi.org/10.1007/978-3-319-66700-3_20
Download citation
DOI: https://doi.org/10.1007/978-3-319-66700-3_20
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66699-0
Online ISBN: 978-3-319-66700-3
eBook Packages: Computer ScienceComputer Science (R0)