The Bayesian Pursuit Algorithm: A New Family of Estimator Learning Automata

Zhang, Xuan; Granmo, Ole-Christoffer; Oommen, B. John

doi:10.1007/978-3-642-21827-9_53

Xuan Zhang²²,
Ole-Christoffer Granmo²² &
B. John Oommen^23,22

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 6704))

Included in the following conference series:

International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems

1709 Accesses
11 Citations

Abstract

The fastest Learning Automata (LA) algorithms currently available come from the family of estimator algorithms. The Pursuit algorithm (PST), a pioneering scheme in the estimator family, obtains its superior learning speed by using Maximum Likelihood (ML) estimates to pursue the action currently perceived as being optimal. Recently, a Bayesian LA (BLA) was introduced, and empirical results that demonstrated its advantages over established top performers, including the PST scheme, were reported. The BLA scheme is inherently Bayesian in nature, but it succeeds in avoiding the computational intractability by merely relying on updating the hyper-parameters of sibling conjugate priors, and on random sampling from the resulting posteriors.

In this paper, we integrate the foundational learning principles motivating the design of the BLA, with the principles of the PST. By doing this, we have succeeded in obtaining a completely novel, and rather pioneering, approach to solving LA-like problems, namely, by designing the Bayesian Pursuit algorithm (BPST). As in the BLA, the estimates are truly Bayesian (as opposed to ML) in nature. However, the action selection probability vector of the PST is used for its exploration purposes. Also, unlike the ML estimate, which is usually a single value, the use of a posterior distribution permits us to choose any one of a spectrum of values in the posterior, as the appropriate estimate. Thus, in this paper, we have chosen a 95% percentile value of the posterior (instead of the mean) to pursue the most promising actions. Further, as advocated in [7], the pursuit has been done using both the Linear Reward-Penalty and Reward-Inaction philosophies, leading to the corresponding BPST_RP and BPST_RI schemes respectively.

It turns out that the BPST is superior to the PST, with the BPST_RI being even more robust than the BPST_RP. Moreover, by controlling the learning speed of the BPST, the BPST schemes perform either better or comparable to the BLA. We thus believe that the BPST constitutes a new avenue of research, in which the performance benefits of the PST and the BLA are mutually augmented, opening up for improved performance in a number of applications, currently being tested.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 84.99; Price excludes VAT (USA)

Softcover Book: USD 109.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

A Comprehensive Survey of Estimator Learning Automata and Their Recent Convergence Results

The Hierarchical Continuous Pursuit Learning Automation for Large Numbers of Actions

A formal proof of the 𝜖-optimality of discretized pursuit algorithms

Article 07 May 2015

References

Thompson, W.R.: On the Likelihood that One Unknown Probability Exceeds Another in view of the Evidence of Two Samples. Biometrika 25, 285–294 (1933)
Article MATH Google Scholar
Thathachar, M.A.L., Sastry, P.S.: Estimator Algorithms for Learning Automata. In: Proc. of the Platinum Jubilee Conference on Systems and Signal Processing, Bangalore, India (December 1986)
Google Scholar
Granmo, O.-C.: Solving Two-Armed Bernoulli Bandit Problems Using a Bayesian Learning Automaton. International Journal of Intelligent Computing and Cybernetics (IJICC) 3(2), 207–234 (2010)
Article MathSciNet MATH Google Scholar
Norheim, T., Brådland, T., Granmo, O.-C., Oommen, B.J.: A Generic Solution to Multi-Armed Bernoulli Bandit Problems Based on Random Sampling from Sibling Conjugate Priors. In: Proc. of International Conference on Agents and Artificial Intelligence, Valencia, Spain, pp. 36–44 (January 2010)
Google Scholar
Narendra, K.S., Thathachar, M.A.L.: Learning Automata: An Introduction. Prentice-Hall, Englewood Cliffs (1989)
MATH Google Scholar
Thathachar, M.A.L., Sastry, P.S.: Networks of Learning Automata: Techniques for Online Stochastic Optimization. Kluwer Academic Publishers, Dordrecht (2004)
Book Google Scholar
Oommen, B.J., Agache, M.: Continuous and Discretized Pursuit Learning Schemes: Various Algorithms and Their Comparison. IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics 31(3), 277–287 (2001)
Article Google Scholar
Jamei, S.M.: Proposing a New Energy Efficient Routing Protocol for Wireless Sensor Networks. International Journal of Computer Science and Network Security 10(2), 241–245 (2010)
Google Scholar
Navid, A.H.F.: Scalable and Energy-Aware Learning Automata-Based Routing Protocols for Wireless Sensor Networks. In: Proc. of IEEE International Conference on Sensor Technologies and Applications, Venice/Mestre, Italy, pp. 570–576 (July 2010)
Google Scholar
Navid, A.H.F., Javadi, H.H.S.: ICLEAR: Energy aware Routing Protocol for WSN using Irregular Cellular Learning Automata. In: Proc. of IEEE Symposium on Industrial Electronics and Applications, Kuala Lumpur, pp. 463–468 (October 2009)
Google Scholar
Song, Y., Zhang, C., Fang, Y.: Stochastic Traffic Engineering in Multihop Cognitive Wireless Mesh Networks. IEEE Transaction on Mobile Computing 9(3), 305–316 (2010)
Article Google Scholar
Gao, F., Zhang, R., Liang, Y.-C., Wang, X.: Design of Learning-Based MIMO Cognitive Radio Systems. IEEE Transaction on Vehicular Technology 59(4), 1707–1720 (2010)
Article Google Scholar
Mamaghani, A.S., Mahi, M., Meybodi, M.R.: A Learning Automaton Based Approach for Data Fragments Allocation in Distributed Database Systems. In: Proc. of IEEE International Conference on Computer and Information Technology, Bradford, pp. 8–12 (September 2010)
Google Scholar
Mitchell, T.M.: Machine Learning. McGraw-Hill, New York (1997)
MATH Google Scholar

Download references

Author information

Authors and Affiliations

Dept. of ICT, University of Agder, Grimstad, Norway
Xuan Zhang, Ole-Christoffer Granmo & B. John Oommen
School of Computer Science, Carleton University, Ottawa, Canada
B. John Oommen

Authors

Xuan Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Ole-Christoffer Granmo
View author publications
You can also search for this author in PubMed Google Scholar
B. John Oommen
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

School of Computer and Inforamtion Science, Center for Science and Technology, Syracuse University, 13244-4100, Syracuse, NY, USA
Kishan G. Mehrotra & Chilukuri K. Mohan &
Department of Electrical Engineering and Computer Science, Syracuse University, 13244, Syracuse, NY, USA
Jae C. Oh & Pramod K. Varshney &
Department of Computer Science, Texas State University San Marcos, 601 University Drive, 78666-4616, San Marcos, TX, USA
Moonis Ali

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Zhang, X., Granmo, OC., Oommen, B.J. (2011). The Bayesian Pursuit Algorithm: A New Family of Estimator Learning Automata. In: Mehrotra, K.G., Mohan, C.K., Oh, J.C., Varshney, P.K., Ali, M. (eds) Modern Approaches in Applied Intelligence. IEA/AIE 2011. Lecture Notes in Computer Science(), vol 6704. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21827-9_53

Download citation

DOI: https://doi.org/10.1007/978-3-642-21827-9_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21826-2
Online ISBN: 978-3-642-21827-9
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

The Bayesian Pursuit Algorithm: A New Family of Estimator Learning Automata

Abstract

Access this chapter

Subscribe and save

Buy Now

Preview

Similar content being viewed by others

A Comprehensive Survey of Estimator Learning Automata and Their Recent Convergence Results

The Hierarchical Continuous Pursuit Learning Automation for Large Numbers of Actions

A formal proof of the 𝜖-optimality of discretized pursuit algorithms

References

Author information

Authors and Affiliations

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Subscribe and save

Buy Now

Navigation

The Bayesian Pursuit Algorithm: A New Family of Estimator Learning Automata

Abstract

Access this chapter

Subscribe and save

Buy Now

Preview

Similar content being viewed by others

A Comprehensive Survey of Estimator Learning Automata and Their Recent Convergence Results

The Hierarchical Continuous Pursuit Learning Automation for Large Numbers of Actions

A formal proof of the 𝜖-optimality of discretized pursuit algorithms

References

Author information

Authors and Affiliations

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation