Link to original content:
https://www.pubpeer.com/search?q=title:(Online Target Q-learning with Reverse Experience Replay: Efficiently finding the Optimal Policy for Linear MDPs.)