Knowledge Background
Solid mathematical background
- Probability and Statistics
  - random variables
  - Bayes’ theorem
  - chain rule of probability
  - expected values
  - standard deviations
  - importance sampling
- Multivariate Calculus
  - gradients
  - Taylor series expansions
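Of the probability topics above, importance sampling is the one that shows up most directly in RL objectives (e.g. off-policy corrections). A minimal NumPy sketch of the idea; the two Gaussian distributions here are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Goal: estimate E_p[f(x)] for p = N(0, 1), using samples drawn from a
# different proposal distribution q = N(1, 1).
def f(x):
    return x ** 2  # E_p[x^2] = 1 for a standard normal

x = rng.normal(loc=1.0, scale=1.0, size=100_000)  # samples from q

# Importance weights w = p(x) / q(x); the Gaussian normalizing
# constants cancel because both have the same scale.
log_p = -0.5 * x ** 2
log_q = -0.5 * (x - 1.0) ** 2
w = np.exp(log_p - log_q)

estimate = np.mean(w * f(x))  # unbiased importance-sampling estimate of E_p[f]
print(estimate)               # close to 1.0
```

Reweighting samples from `q` as if they came from `p` is exactly the trick off-policy RL methods use to evaluate one policy with data collected by another.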
Deep Learning
- Standard architectures
- Common regularizers
- Normalization
- Optimizers
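To make these terms concrete without pulling in a framework, here is a toy NumPy forward pass through a two-layer network with layer normalization and an L2 regularization penalty; all shapes and names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def layer_norm(h, eps=1e-5):
    """Normalize each row (example) to zero mean and unit variance."""
    mu = h.mean(axis=-1, keepdims=True)
    var = h.var(axis=-1, keepdims=True)
    return (h - mu) / np.sqrt(var + eps)

# A tiny two-layer MLP: 4 inputs -> 8 hidden units -> 2 outputs.
W1 = rng.normal(scale=0.1, size=(4, 8))
W2 = rng.normal(scale=0.1, size=(8, 2))

x = rng.normal(size=(3, 4))          # batch of 3 examples
h = np.tanh(layer_norm(x @ W1))      # hidden layer with normalization
logits = h @ W2                      # network outputs

# A common regularizer: an L2 penalty added to the training loss,
# which an optimizer (SGD, Adam, ...) would then minimize.
l2_penalty = 1e-4 * (np.sum(W1 ** 2) + np.sum(W2 ** 2))
print(logits.shape, l2_penalty)
```

In practice a library such as PyTorch or TensorFlow supplies these pieces; the point is only to connect the vocabulary to concrete operations.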
Deep Learning Library
- Familiarity with at least one framework (e.g. PyTorch or TensorFlow)
Reinforcement Learning Fundamentals
- States
- Actions
- Trajectories
- Policy
- Reward
- Value Function
- Action-Value Function
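All of these objects can be made concrete in a few lines. The toy chain environment below is illustrative (its dynamics and reward are made up for the example): a policy maps states to actions, a rollout produces a trajectory of (state, action, reward) tuples, and the discounted return from the start state is a Monte Carlo sample of the value function there.

```python
import numpy as np

rng = np.random.default_rng(0)
gamma = 0.9  # discount factor

def policy(state):
    """A stochastic policy: choose one of two actions uniformly."""
    return rng.choice([0, 1])

def step(state, action):
    """Toy dynamics: action 1 moves right, action 0 moves left on a
    5-state chain; reward 1.0 for reaching the rightmost state."""
    next_state = max(0, min(4, state + (1 if action == 1 else -1)))
    reward = 1.0 if next_state == 4 else 0.0
    return next_state, reward

# Roll out one trajectory: a sequence of (state, action, reward).
state, trajectory = 0, []
for _ in range(20):
    action = policy(state)
    next_state, reward = step(state, action)
    trajectory.append((state, action, reward))
    state = next_state

# Discounted return from the start state -- one Monte Carlo sample
# of V(s0) under this policy; Q(s0, a0) conditions on the first action too.
rewards = [r for (_, _, r) in trajectory]
ret = sum(gamma ** t * r for t, r in enumerate(rewards))
print(ret)
```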
Learn by Doing
- Write your own implementations
- Implement the simplest algorithms first
- Focus on understanding
- Look for implementation details and tricks in papers
- Don’t overfit to paper details
- Iterate fast in simple environments (e.g. CartPole-v0)
- Measure everything (mean/std/min/max for cumulative rewards, episode lengths, value function estimates, loss for objectives)
- Scale experiments up once the algorithm works
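"Measure everything" can be as simple as aggregating per-episode statistics every iteration. A minimal sketch; the episode returns and lengths below are fake data standing in for real rollout results:

```python
import numpy as np

# Pretend these are cumulative rewards and lengths from a batch of episodes.
episode_returns = np.array([12.0, 35.0, 8.0, 41.0, 22.0])
episode_lengths = np.array([50, 120, 37, 160, 90])

def summarize(name, x):
    """Log mean/std/min/max for one diagnostic quantity."""
    print(f"{name}: mean={x.mean():.2f} std={x.std():.2f} "
          f"min={x.min():.2f} max={x.max():.2f}")

summarize("return", episode_returns)
summarize("ep_len", episode_lengths)
# The same summary applies to value-function estimates and objective losses.
```

Watching the full spread (not just the mean) is what catches a policy that occasionally collapses even while its average return looks healthy.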
Developing a Research Project
- Start by exploring the literature to become aware of topics in the field
- Approaches to idea-generation
- Improving on an existing approach
- Focusing on unsolved benchmarks
- Creating a new problem setting
Doing Rigorous Research in RL
- Set up fair comparisons (all else equal)
- Remove stochasticity as a confounder
- Run high-integrity experiments
- Check each claim separately
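One concrete piece of "all else equal" is to run every variant over the same set of random seeds and compare aggregate statistics rather than single runs. A minimal sketch; `train` here is a hypothetical stand-in for a full training run:

```python
import numpy as np

def train(algo, seed):
    """Hypothetical stand-in for a training run that returns a final score.
    In a real experiment this would seed the environment, the network
    initialization, and the data pipeline, then train to completion."""
    rng = np.random.default_rng(seed)
    base = {"variant_a": 100.0, "variant_b": 105.0}[algo]
    return base + rng.normal(scale=10.0)  # fake noisy outcome

seeds = range(10)  # identical seeds for both variants: stochasticity is shared
results = {
    algo: np.array([train(algo, s) for s in seeds])
    for algo in ("variant_a", "variant_b")
}
for algo, scores in results.items():
    print(f"{algo}: {scores.mean():.1f} +/- {scores.std():.1f} "
          f"over {len(scores)} seeds")
```

With seed-level noise shared across variants, a difference in the aggregate scores supports a claim about the algorithms rather than about lucky runs, and each claimed improvement can be checked against its own comparison.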