Puyuan Peng

PhD Student in Computer Science

Posts by Tag

RL 12
Notes 12

RL

Deep RL 12 Reinforcement Learning and Control as Probabilistic Inference

less than 1 minute read

Please check out Professor Sergey Levine’s excellent tutorial: Levine 18’

Deep RL 11 Model-Based Policy Learning

9 minute read

In this section, we study how to learn policies that utilize the known (learned) dynamics. Why do we need to learn a policy? What’s wrong with MPC in the previous...

Deep RL 10 Model-based Reinforcement Learning

12 minute read

The previous lecture was mainly about how to plan actions when the dynamics are known. In this lecture, we study how to learn the dynamics. We will also in...

Deep RL 9 Model-based Planning

12 minute read

Let’s recall the reinforcement learning goal: we want to maximize the expected reward (or the expected discounted reward in the infinite-horizon case).

Deep RL 8 Advanced Policy Gradient

9 minute read

At the end of the previous lecture, we talked about the issues with Q-learning; one of them is that it’s not directly optimizing the expected return and it can t...

Deep RL 7 Q-learning

10 minute read

In this section, we extend the online Q-iteration algorithm from the previous lecture by identifying its potential issues and introducing solutions. The improve...

Deep RL 6 Value Function Methods

9 minute read

Previously, we studied policy gradient methods, which propose a parametric policy and optimize it to achieve a better expected reward. Then we introduced actor-...

Deep RL 5 Actor Critic

11 minute read

Actor-critic algorithms build on the policy gradient framework that we discussed in the previous lecture, but also augment it with learning value functions an...

Deep RL 4 Policy Gradient

11 minute read

In this lecture, we will study the classic policy gradient methods, which include the REINFORCE algorithm, the off-policy policy gradient method, and several co...

Deep RL 3 Intro to RL

13 minute read

This is an introduction to reinforcement learning, including core concepts, the general goal, the general framework, introduction and comparison of different...

Deep RL 2 Imitation Learning

7 minute read

The framework of imitation learning tackles reinforcement learning as a supervised learning problem.

Deep RL 1 Introduction

less than 1 minute read

These are my notes for CS285 Deep Reinforcement Learning at UC Berkeley.

Notes

Deep RL 12 Reinforcement Learning and Control as Probabilistic Inference

less than 1 minute read

Please check out Professor Sergey Levine’s excellent tutorial: Levine 18’

Deep RL 11 Model-Based Policy Learning

9 minute read

In this section, we study how to learn policies that utilize the known (learned) dynamics. Why do we need to learn a policy? What’s wrong with MPC in the previous...

Deep RL 10 Model-based Reinforcement Learning

12 minute read

The previous lecture was mainly about how to plan actions when the dynamics are known. In this lecture, we study how to learn the dynamics. We will also in...

Deep RL 9 Model-based Planning

12 minute read

Let’s recall the reinforcement learning goal: we want to maximize the expected reward (or the expected discounted reward in the infinite-horizon case).

Deep RL 8 Advanced Policy Gradient

9 minute read

At the end of the previous lecture, we talked about the issues with Q-learning; one of them is that it’s not directly optimizing the expected return and it can t...

Deep RL 7 Q-learning

10 minute read

In this section, we extend the online Q-iteration algorithm from the previous lecture by identifying its potential issues and introducing solutions. The improve...

Deep RL 6 Value Function Methods

9 minute read

Previously, we studied policy gradient methods, which propose a parametric policy and optimize it to achieve a better expected reward. Then we introduced actor-...

Deep RL 5 Actor Critic

11 minute read

Actor-critic algorithms build on the policy gradient framework that we discussed in the previous lecture, but also augment it with learning value functions an...

Deep RL 4 Policy Gradient

11 minute read

In this lecture, we will study the classic policy gradient methods, which include the REINFORCE algorithm, the off-policy policy gradient method, and several co...

Deep RL 3 Intro to RL

13 minute read

This is an introduction to reinforcement learning, including core concepts, the general goal, the general framework, introduction and comparison of different...

Deep RL 2 Imitation Learning

7 minute read

The framework of imitation learning tackles reinforcement learning as a supervised learning problem.

Deep RL 1 Introduction

less than 1 minute read

These are my notes for CS285 Deep Reinforcement Learning at UC Berkeley.