Hedging Under Uncertainty: Regret Minimization Meets Exponentially Fast Convergence

  • Conference paper
  • Algorithmic Game Theory (SAGT 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10504)

Abstract

This paper examines the problem of multi-agent learning in \(N\)-person non-cooperative games. For concreteness, we focus on the so-called “hedge” variant of the exponential weights (EW) algorithm, one of the most widely studied algorithmic schemes for regret minimization in online learning. In this multi-agent context, we show that (a) dominated strategies become extinct almost surely (a.s.); and (b) in generic games, pure Nash equilibria are attracting with high probability, even in the presence of uncertainty and noise of arbitrarily high variance. Moreover, if the algorithm’s step-size does not decay too fast, we show that these properties occur at a quasi-exponential rate – that is, much faster than the algorithm’s \(\mathcal{O}(1/\sqrt{T})\) worst-case regret guarantee would suggest.
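
As a concrete illustration of the scheme described above (a minimal sketch, not the paper’s own pseudocode), the following Python snippet runs the hedge / exponential weights update for two players in a bimatrix game with noisy payoff observations. The function name, the Gaussian noise model, and the step-size schedule \(\gamma_{t} = \gamma_{0}/t^{a}\) are assumptions made for this example.

    import numpy as np

    def hedge_2player(A, B, T=5000, gamma0=1.0, a=0.5, noise=1.0, seed=0):
        """Two players running the hedge (EW) update on a bimatrix game.

        A[i, j], B[i, j]: payoffs of players 1 and 2 when they play actions i, j.
        Each player observes a noisy payoff vector over its own actions
        (zero-mean Gaussian noise), scores it with step-size gamma_t = gamma0 / t**a,
        and mixes via the logit choice map.
        """
        rng = np.random.default_rng(seed)
        y1, y2 = np.zeros(A.shape[0]), np.zeros(B.shape[1])   # cumulative scores
        for t in range(1, T + 1):
            # logit choice map: x_i(t) = Lambda_i(y_i(t-1))
            x1 = np.exp(y1 - y1.max()); x1 /= x1.sum()
            x2 = np.exp(y2 - y2.max()); x2 /= x2.sum()
            i = rng.choice(len(x1), p=x1)   # actions sampled from x(t) at stage t
            j = rng.choice(len(x2), p=x2)
            v1 = A[:, j] + noise * rng.standard_normal(len(y1))   # noisy payoff vectors
            v2 = B[i, :] + noise * rng.standard_normal(len(y2))
            gamma = gamma0 / t**a           # slowly decaying step-size
            y1 += gamma * v1                # exponential weights score update
            y2 += gamma * v2
        return x1, x2

    # Toy example: each player's first action strictly dominates, so the returned
    # mixed strategies should concentrate on the dominant (pure Nash) profile,
    # illustrating the extinction of dominated strategies described above.
    A = np.array([[3.0, 2.0], [1.0, 0.0]])
    B = np.array([[3.0, 1.0], [2.0, 0.0]])
    print(hedge_2player(A, B))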

Notes

  1.

    As the second name suggests, this set contains the game’s set of correlated equilibria (CE) – and hence, the game’s set of Nash equilibria as well.
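
    Assuming the set referred to here is the game’s set of coarse correlated equilibria (CCE), and writing \(u_{i}\) for player \(i\)’s payoff function (a standard definition stated for reference, not quoted from the paper): a distribution \(\pi\) over action profiles is a CCE if, for every player \(i\) and every deviation \(\alpha_{i}'\),
    \[
    \sum_{\alpha} \pi(\alpha)\, u_{i}(\alpha) \;\geq\; \sum_{\alpha} \pi(\alpha)\, u_{i}(\alpha_{i}';\alpha_{-i}).
    \]
    A correlated equilibrium imposes the analogous inequality conditionally on each recommended action \(\alpha_{i}\); averaging those conditional inequalities over \(\alpha_{i}\) recovers the display above, which gives the stated inclusion CE \(\subseteq\) CCE (and hence NE \(\subseteq\) CCE).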

  2.

    In the above \((\alpha _{i};\alpha _{-i})\) is shorthand for \((\alpha _{1},\ldots ,\alpha _{i},\ldots ,\alpha _{N})\), used here to highlight the action of player \(i\) against that of all other players.

  3.

    Note here that (10a) is phrased in terms of the players’ mixed strategy profile \(x(t)\), not the action profile \(\alpha(t) = (\alpha_{i}(t);\alpha_{-i}(t))\), which is chosen based on \(x(t)\) at stage \(t\). To see that (H1) indeed implies (10a), simply recall that \(x_{i}(t) = \Lambda_{i}(y_{i}(t-1))\), so:

    [display equation omitted; it appears only as an image in the source]
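
    Assuming \(\Lambda_{i}\) denotes the standard logit choice map of the exponential weights scheme (an assumption made here to fill in the omitted display), the relation presumably reads
    \[
    x_{i\alpha}(t) \;=\; \Lambda_{i\alpha}(y_{i}(t-1)) \;=\; \frac{\exp(y_{i\alpha}(t-1))}{\sum_{\beta} \exp(y_{i\beta}(t-1))},
    \]
    so that \(x(t)\) is a deterministic function of the score vector \(y(t-1)\) and is therefore measurable with respect to the history of play up to stage \(t-1\); conditioning on that history thus pins down \(x(t)\) as well.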

  4.

    To the best of our knowledge, the closest result is the elimination of dominated strategies under exponential learning in continuous time [20]; for a survey of the relevant literature, see [25].

  5.

    Recall also the counterexample of [26] discussed in the introduction.

  6.

    This can be shown by an induction argument on the rounds of elimination of dominated strategies as in [18].

References

  1. Arora, S., Hazan, E., Kale, S.: The multiplicative weights update method: a meta-algorithm and applications. Theory Comput. 8(1), 121–164 (2012)

  2. Blum, A., Hajiaghayi, M.T., Ligett, K., Roth, A.: Regret minimization and the price of total anarchy. In: STOC 2008: Proceedings of the 40th Annual ACM Symposium on the Theory of Computing, pp. 373–382. ACM (2008)

  3. Blum, A., Mansour, Y.: Learning, regret minimization, and equilibria (Chap. 4). In: Nisan, N., Roughgarden, T., Tardos, E., Vazirani, V.V. (eds.) Algorithmic Game Theory. Cambridge University Press, Cambridge (2007)

  4. Bubeck, S., Cesa-Bianchi, N.: Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Found. Trends Mach. Learn. 5(1), 1–122 (2012)

  5. Cesa-Bianchi, N., Lugosi, G.: Prediction, Learning, and Games. Cambridge University Press, Cambridge (2006)

  6. Coucheney, P., Gaujal, B., Mertikopoulos, P.: Penalty-regulated dynamics and robust learning procedures in games. Math. Oper. Res. 40(3), 611–633 (2015)

  7. Foster, D., Vohra, R.V.: Calibrated learning and correlated equilibrium. Games Econ. Behav. 21(1), 40–55 (1997)

  8. Foster, D.J., Lykouris, T., Sridharan, K., Tardos, E.: Learning in games: robustness of fast convergence. In: Advances in Neural Information Processing Systems, pp. 4727–4735 (2016)

  9. Freund, Y., Schapire, R.E.: Adaptive game playing using multiplicative weights. Games Econ. Behav. 29, 79–103 (1999)

  10. Goldberg, P.W., Roth, A.: Bounds for the query complexity of approximate equilibria. ACM Trans. Econ. Comput. 4(4), 24:1–24:25 (2016)

  11. Hall, P., Heyde, C.C.: Martingale Limit Theory and Its Application. Probability and Mathematical Statistics. Academic Press, New York (1980)

  12. Hannan, J.: Approximation to Bayes risk in repeated play. In: Dresher, M., Tucker, A.W., Wolfe, P. (eds.) Contributions to the Theory of Games. Annals of Mathematics Studies, vol. 39, pp. 97–139. Princeton University Press, Princeton (1957)

  13. Hart, S., Mas-Colell, A.: A simple adaptive procedure leading to correlated equilibrium. Econometrica 68(5), 1127–1150 (2000)

  14. Kalai, A., Vempala, S.: Efficient algorithms for online decision problems. J. Comput. Syst. Sci. 71(3), 291–307 (2005)

  15. Kleinberg, R., Piliouras, G., Tardos, É.: Load balancing without regret in the bulletin board model. Distrib. Comput. 24(1), 21–29 (2011)

  16. Krichene, W., Drighès, B., Bayen, A.M.: Learning Nash equilibria in congestion games. arXiv preprint arXiv:1408.0017 (2014)

  17. Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Stat. 22(1), 79–86 (1951)

  18. Laraki, R., Mertikopoulos, P.: Higher order game dynamics. J. Econ. Theory 148(6), 2666–2695 (2013)

  19. Littlestone, N., Warmuth, M.K.: The weighted majority algorithm. Inf. Comput. 108(2), 212–261 (1994)

  20. Mertikopoulos, P., Moustakas, A.L.: The emergence of rational behavior in the presence of stochastic perturbations. Ann. Appl. Probab. 20(4), 1359–1388 (2010)

  21. Mertikopoulos, P., Sandholm, W.H.: Learning in games via reinforcement and regularization. Math. Oper. Res. 41(4), 1297–1324 (2016)

  22. Roughgarden, T.: Intrinsic robustness of the price of anarchy. J. ACM (JACM) 62(5), 32 (2015)

  23. Sandholm, W.H.: Population Games and Evolutionary Dynamics. Economic Learning and Social Evolution. MIT Press, Cambridge (2010)

  24. Syrgkanis, V., Agarwal, A., Luo, H., Schapire, R.E.: Fast convergence of regularized learning in games. In: Advances in Neural Information Processing Systems, pp. 2989–2997 (2015)

  25. Viossat, Y.: Evolutionary dynamics and dominated strategies. Econ. Theory Bull. 3(1), 91–113 (2015)

  26. Viossat, Y., Zapechelnyuk, A.: No-regret dynamics and fictitious play. J. Econ. Theory 148(2), 825–842 (2013)

  27. Vovk, V.G.: Aggregating strategies. In: COLT 1990: Proceedings of the 3rd Workshop on Computational Learning Theory, pp. 371–383 (1990)

  28. Weibull, J.W.: Evolutionary Game Theory. MIT Press, Cambridge (1995)

Acknowledgment

This work was partially supported by the French National Research Agency (ANR) under grant no. ANR–16–CE33–0004–01 (ORACLESS) and the Huawei HIRP FLAGSHIP project ULTRON.

Author information

Correspondence to Panayotis Mertikopoulos.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Cohen, J., Héliou, A., Mertikopoulos, P. (2017). Hedging Under Uncertainty: Regret Minimization Meets Exponentially Fast Convergence. In: Bilò, V., Flammini, M. (eds.) Algorithmic Game Theory. SAGT 2017. Lecture Notes in Computer Science, vol. 10504. Springer, Cham. https://doi.org/10.1007/978-3-319-66700-3_20

  • DOI: https://doi.org/10.1007/978-3-319-66700-3_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-66699-0

  • Online ISBN: 978-3-319-66700-3
