These are my notes for CS285, Deep Reinforcement Learning, at UC Berkeley.
In the course intro, there are a few questions that interest me:
- Deep learning can handle unstructured environments and complex sensory input, and can adaptively select features for the task at hand.
What does it mean for learning in RL to be not end-to-end?
Not end-to-end in RL means that the recognition part (understanding what is happening) and the control part (deciding what action to take) are built separately, rather than learned together as one system.
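To make the distinction concrete for myself, here is a minimal sketch on a toy 1-D "camera" with one bright pixel marking an object. It is not from the course. All names (`perceive`, `control`, `modular_policy`, `end_to_end_policy`, `ACTIONS`) are hypothetical, and the weights are hand-set only for illustration. In an actual end-to-end system they would be learned from reward.

```python
# Toy illustration only: every name below is hypothetical, not from CS285.
ACTIONS = ["left", "stay", "right"]

def perceive(pixels):
    """Recognition stage: reduce raw pixels to a hand-designed state (object index)."""
    return max(range(len(pixels)), key=lambda i: pixels[i])

def control(object_index, target_index):
    """Control stage: pick an action using only the hand-designed state."""
    if object_index < target_index:
        return "right"
    if object_index > target_index:
        return "left"
    return "stay"

def modular_policy(pixels, target_index):
    """Not end-to-end: two separately designed stages glued together."""
    return control(perceive(pixels), target_index)

def end_to_end_policy(pixels, weights):
    """End-to-end: a single parameterized map from raw pixels to an action.

    weights[a][i] is the score that pixel i contributes to action a.
    Here the weights are hand-set; the assumption is that they would be learned.
    """
    scores = [sum(w * p for w, p in zip(weights[a], pixels))
              for a in range(len(ACTIONS))]
    best = max(range(len(ACTIONS)), key=lambda a: scores[a])
    return ACTIONS[best]

# Hand-set weights that move the object toward pixel 2 of a 5-pixel image.
WEIGHTS = [
    [0, 0, 0, 1, 1],  # "left":  object is to the right of the target
    [0, 0, 1, 0, 0],  # "stay":  object is on the target
    [1, 1, 0, 0, 0],  # "right": object is to the left of the target
]

if __name__ == "__main__":
    pixels = [0, 0, 0, 0, 1]
    print(modular_policy(pixels, target_index=2))
    print(end_to_end_policy(pixels, WEIGHTS))
```

In the modular version, the intermediate state (the object index) is chosen by a person. In the end-to-end version there is no such hand-picked intermediate, so whatever features matter have to be discovered by the learning process itself.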
What can deep learning & reinforcement learning do well now?
- Acquire a high degree of proficiency in domains governed by simple, known rules, e.g. Atari and Go.
- Learn simple skills from raw sensory inputs, given enough experience (example needed).
- Learn by imitating enough human-provided expert behavior (example needed).
What are the challenges of deep RL?
- Efficiency: deep RL is slow to learn.
- Transfer learning: how can past experience be reused?
- Not clear what the reward function should be.
- Not clear what the role of prediction should be.
It’s hard to truly understand all the points above if you don’t have much experience in deep RL, but it’s helpful to keep them in mind throughout the course.