DCL Seminar: Donghwan Lee - Primal-Dual Algorithm for Distributed Reinforcement Learning: Distributed GTD2

Event Type
Seminar/Symposium
Sponsor
Decision and Control Laboratory, Coordinated Science Laboratory
Location
CSL Auditorium, Room B02
Date
April 25, 2018 3:00 PM
Speaker
Donghwan Lee, Ph.D., University of Illinois at Urbana-Champaign
Cost
Registration
Contact
Linda Stimson
Email
ls9@illinois.edu
Phone
217-333-9449

Decision and Control Lecture Series

Coordinated Science Laboratory

 

“Primal-Dual Algorithm for Distributed
Reinforcement Learning: Distributed GTD2”


Donghwan Lee, Ph.D.

Postdoctoral Associate

University of Illinois at Urbana-Champaign

 

Wednesday, April 25, 2018

3:00 p.m. to 4:00 p.m.

CSL Auditorium (B02)


Abstract:

The goal of this paper is to study a distributed version of the gradient temporal-difference (GTD) learning algorithm for multi-agent Markov decision processes (MDPs). Temporal-difference (TD) learning is a reinforcement learning (RL) algorithm that learns an infinite-horizon discounted cost function (or value function) for a given fixed policy without knowledge of the model. In the distributed RL case, each agent receives a local reward through local processing. Information exchange over a sparse communication network allows the agents to learn the global value function corresponding to a global reward, which is the sum of the local rewards. In this paper, the problem is converted into a constrained convex optimization problem with a consensus constraint. We then propose a primal-dual distributed GTD algorithm and prove that it almost surely converges to a set of stationary points of the optimization problem.
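
For attendees unfamiliar with GTD2, the following is a minimal single-agent sketch of the standard GTD2 update with linear function approximation. It is not the distributed primal-dual algorithm presented in the talk. The function names (`gtd2_step`, `run_gtd2`), the three-state chain, the step sizes, and the one-hot features are all assumptions chosen for illustration.

```python
import numpy as np


def gtd2_step(theta, w, phi, reward, phi_next, gamma, alpha, beta):
    """One standard (single-agent) GTD2 update for v(s) ~ theta @ phi(s)."""
    # Temporal-difference error for the observed transition.
    delta = reward + gamma * (theta @ phi_next) - theta @ phi
    # Primary weights move along (phi - gamma * phi_next), scaled by phi @ w.
    theta_new = theta + alpha * (phi - gamma * phi_next) * (phi @ w)
    # Auxiliary weights track the projection of the TD error onto the features.
    w_new = w + beta * (delta - phi @ w) * phi
    return theta_new, w_new


def run_gtd2(P, r, Phi, gamma=0.9, alpha=0.05, beta=0.1, steps=5000, seed=0):
    """Run GTD2 along one sampled trajectory of a Markov chain under a fixed policy.

    P: transition matrix, r: per-state reward, Phi: one feature row per state.
    All default values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_states, n_features = Phi.shape
    theta = np.zeros(n_features)
    w = np.zeros(n_features)
    s = 0
    for _ in range(steps):
        s_next = rng.choice(n_states, p=P[s])
        theta, w = gtd2_step(theta, w, Phi[s], r[s], Phi[s_next],
                             gamma, alpha, beta)
        s = s_next
    return theta


# Hypothetical three-state chain with one-hot (tabular) features.
P = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5]])
r = np.array([1.0, 0.0, 2.0])
Phi = np.eye(3)

theta = run_gtd2(P, r, Phi)
# Exact value function for comparison: solves v = r + gamma * P v.
v_exact = np.linalg.solve(np.eye(3) - 0.9 * P, r)
```

In the multi-agent setting described in the abstract, each agent would hold its own copy of the weights and observe only its local reward, with the consensus constraint tying the copies together. That coupling is not shown here.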

Bio:

Donghwan Lee is a postdoctoral researcher hosted by Prof. Naira Hovakimyan in the Department of Mechanical Science and Engineering at the University of Illinois at Urbana-Champaign. He received his Ph.D. in Electrical and Computer Engineering from Purdue University in 2017. His research interests lie broadly in the areas of optimization and control theory. His most recent research interests are reinforcement learning and its control applications with human interactions.