The Relationship Between DL and DRL
Applications
- DL: no action
  - Classification
  - Regression
- DRL: discrete or continuous actions
  - Robotic perception and control
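
The difference between discrete and continuous actions can be sketched in a few lines. This is only an illustration under assumptions: the helper names `discrete_action` and `continuous_action`, the Q-values, and the action bounds are all hypothetical and do not come from any particular library.

```python
def discrete_action(q_values):
    """Return the index of the largest Q-value (a greedy, DQN-style choice)."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


def continuous_action(raw_output, low, high):
    """Clip each component of a real-valued policy output to [low, high]."""
    return [min(max(x, low), high) for x in raw_output]


# Made-up Q-values for three discrete actions.
a_disc = discrete_action([0.1, 0.7, 0.2])

# Made-up two-dimensional policy output, bounded to [-1, 1] (assumed limits).
a_cont = continuous_action([1.5, -0.3], low=-1.0, high=1.0)

print(a_disc, a_cont)
```

A classifier or regressor stops at the prediction, whereas a DRL agent feeds the chosen action back into the environment, which is why the action type shapes the choice of algorithm.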
Implementation
Process | DL | DRL |
---|---|---|
Input | Dataset (states \( x \) and labels \( y \)) | Trajectories \( \{(s,a,r,s',d), \cdots\} \) |
NN | CNN (\( w,b \)) | NN (\( \phi,\theta \)) |
Output | Estimated label \( \hat{y} \) | Q-function value \( Q_\phi \), policy \( \pi_\theta \) |
Loss Function | Cross-entropy, MSE | Rewards, MSBE (mean squared Bellman error, \( (Q_\phi - Q^*)^2 \)) |
Optimizer | SGD, Adam | DQN, DDPG, TRPO, PPO |
Evaluation | Accuracy computed on the test dataset | Gym, MuJoCo, TensorBoard |
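
To make the loss-function row concrete, here is a minimal sketch comparing an MSE loss on labeled data with an MSBE loss on transitions \( (s,a,r,s',d) \). It assumes a tabular Q-function stored in a plain dict, uses the same table for the Bellman target (no separate target network), and all names and numbers (`mse_loss`, `msbe_loss`, `q_table`, `batch`, the discount of 0.5) are hypothetical choices for illustration.

```python
def mse_loss(predictions, labels):
    """Supervised DL loss: mean squared error between predictions and labels."""
    return sum((p - y) ** 2 for p, y in zip(predictions, labels)) / len(labels)


def msbe_loss(q, transitions, gamma):
    """DRL loss: mean squared Bellman error over transitions (s, a, r, s', d)."""
    total = 0.0
    for s, a, r, s_next, d in transitions:
        # Bellman target: reward plus discounted best next value,
        # with the bootstrap term dropped when the episode is done (d = 1).
        target = r + gamma * (1.0 - d) * max(q[s_next])
        total += (q[s][a] - target) ** 2
    return total / len(transitions)


# Toy Q-table with 2 states and 2 actions (made-up values).
q_table = {0: [0.0, 1.0], 1: [2.0, 0.0]}

# Two made-up transitions: (state, action, reward, next_state, done).
batch = [(0, 1, 1.0, 1, 0.0), (1, 0, 2.0, 0, 1.0)]

dl_loss = mse_loss([0.5, 1.5], [1.0, 1.0])
drl_loss = msbe_loss(q_table, batch, gamma=0.5)
print(dl_loss, drl_loss)
```

In the DL case the target is a fixed label from the dataset. In the DRL case the target is built from the reward and the current Q estimate, so it moves as \( Q_\phi \) is updated.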