My install environment:

  • Ubuntu 14.04
  • Anaconda 3
  • Python 3.6
  • TensorFlow-gpu 1.13.1
  • Nvidia Driver 430.26
  • Cuda 10.0
  • Cudnn 7.6

MuJoCo (Multi-Joint dynamics with Contact)

MuJoCo 2.0

  1. Download: https://www.roboti.us/index.html.
  2. Get License: https://www.roboti.us/license.html.
    Get the computer ID:
    
    chmod +x getid_linux
    ./getid_linux
    
  3. Unzip the download into ~/.mujoco:
    mkdir ~/.mujoco
    cp mujoco200_linux.zip ~/.mujoco
    cd ~/.mujoco
    unzip mujoco200_linux.zip
    mv mujoco200_linux mujoco200
    
  4. Copy mjkey.txt to ~/.mujoco and ~/.mujoco/mujoco200/bin:
    cp mjkey.txt ~/.mujoco
    cp mjkey.txt ~/.mujoco/mujoco200/bin
    
  5. Add the environment variables to ~/.bashrc (replace /home/csy with your own home directory):
    gedit ~/.bashrc
    
    export LD_LIBRARY_PATH=/home/csy/.mujoco/mujoco200/bin${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
    export MUJOCO_KEY_PATH=/home/csy/.mujoco${MUJOCO_KEY_PATH}
    
  6. Test MuJoCo:
    cd ~/.mujoco/mujoco200/bin
    ./simulate ../model/humanoid.xml
    

mujoco-py

  1. Create a virtual environment using Anaconda and activate it:
    conda create -n mujoco-gym python=3.6
    conda activate mujoco-gym
    
  2. Install mujoco-py:
    cd drl/mujoco-gym
    git clone https://github.com/openai/mujoco-py.git
    cd mujoco-py
    pip install -e .
    
  3. If an error occurs, e.g. no file named 'patchelf', install it:
    Download: https://nixos.org/releases/patchelf/patchelf-0.9/patchelf-0.9.tar.gz
    tar -xzf patchelf-0.9.tar.gz
    cd patchelf-0.9
    ./configure
    make
    sudo make install
    
    patchelf --version    # verify the installation
    
  4. Test mujoco-py:
    $ python
    import mujoco_py
    import os
    mj_path, _ = mujoco_py.utils.discover_mujoco()
    xml_path = os.path.join(mj_path, 'model', 'humanoid.xml')
    model = mujoco_py.load_model_from_path(xml_path)
    sim = mujoco_py.MjSim(model)
    
    print(sim.data.qpos)
    # [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
    
    sim.step()
    print(sim.data.qpos)
    # [-2.09531783e-19  2.72130735e-05  6.14480786e-22 -3.45474715e-06
    #   7.42993721e-06 -1.40711141e-04 -3.04253586e-04 -2.07559344e-04
    #   8.50646247e-05 -3.45474715e-06  7.42993721e-06 -1.40711141e-04
    #  -3.04253586e-04 -2.07559344e-04 -8.50646247e-05  1.11317030e-04
    #  -7.03465386e-05 -2.22862221e-05 -1.11317030e-04  7.03465386e-05
    #  -2.22862221e-05]
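    sim.step() advances the simulation by one physics timestep, which is why qpos is no longer all zeros afterwards. As a rough illustration of the idea (a toy 1-D semi-implicit Euler sketch, not MuJoCo's actual solver; MuJoCo's default timestep is 0.002 s):
    
    ```python
    # Toy sketch of what one physics "step" does conceptually:
    # semi-implicit Euler integration for a 1-D point mass under gravity.
    # This is NOT MuJoCo's solver, just an illustration of sim.step().

    def step(pos, vel, acc=-9.81, dt=0.002):
        """Advance the state by one timestep of size dt."""
        vel = vel + acc * dt      # update velocity from acceleration
        pos = pos + vel * dt      # update position from the NEW velocity
        return pos, vel

    pos, vel = 0.0, 0.0           # state starts at rest, like qpos above
    pos, vel = step(pos, vel)     # after one step the state is nonzero
    print(pos, vel)
    ```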
    

Gym

Installation

  1. Install Gym:
    pip install gym[all]
    
  2. Test Gym:
    import gym
    env = gym.make('Humanoid-v2')
    
    from gym import envs
    print(envs.registry.all())    # print the available environments
    
    print(env.action_space)
    print(env.observation_space)
    print(env.observation_space.high)
    print(env.observation_space.low)
    
    for i_episode in range(200):
        observation = env.reset()
        for t in range(100):
            env.render()
            print(observation)
            action = env.action_space.sample()    # take a random action
            observation, reward, done, info = env.step(action)
            if done:
                print("Episode finished after {} timesteps".format(t+1))
                break
    env.close()
    

If an error like the following occurs:

ERROR: GLEW initalization error: Missing GL version

Install the following package:

sudo apt-get install libglew-dev

Then, add the following path to ~/.bashrc:

export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libGLEW.so

Usage of Gym

Gym is a toolkit of environments for developing and comparing RL algorithms.

Every environment has the following functions:

  • step(action): returns four values.
    • observation: observation of the environment after the action, e.g. pixel data or joint angles.
    • reward: amount of reward achieved by the previous action.
    • done: if True, the episode has terminated and reset() should be called.
    • info: diagnostic information useful for debugging.
  • reset(): returns the initial observation.
  • render(mode='human'): renders the environment.
  • close(): closes the environment and any rendering windows.

We can also create new environments for Gym; see the Gym documentation on creating custom environments.
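As a sketch of what a custom environment must provide, here is a toy "guess the coin flip" environment. It is written as a plain Python class for illustration only; a real Gym environment subclasses gym.Env, defines action_space / observation_space as gym.spaces objects, and is registered via gym.envs.registration.register (the class and episode logic here are made up for the example):

```python
import random

class CoinFlipEnv:
    """Toy environment following the Gym interface: guess a coin flip.
    Illustration only; a real env subclasses gym.Env and defines
    action_space / observation_space as gym.spaces objects."""

    def __init__(self, max_steps=10):
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        """Start a new episode and return the initial observation."""
        self.t = 0
        self.coin = random.randint(0, 1)
        return self.coin

    def step(self, action):
        """Return (observation, reward, done, info), like Gym's step()."""
        reward = 1.0 if action == self.coin else 0.0
        self.t += 1
        done = self.t >= self.max_steps
        self.coin = random.randint(0, 1)   # next observation
        return self.coin, reward, done, {}

    def render(self, mode='human'):
        print("step {}, coin={}".format(self.t, self.coin))

    def close(self):
        pass

# Same interaction loop as the Gym test above, with a random policy:
env = CoinFlipEnv()
obs = env.reset()
total, done = 0.0, False
while not done:
    obs, reward, done, info = env.step(random.randint(0, 1))
    total += reward
env.close()
print("episode return:", total)
```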

Baselines

  1. Install the required packages:
    sudo apt-get update && sudo apt-get install cmake libopenmpi-dev python3-dev zlib1g-dev
    
  2. Download baselines:
    git clone https://github.com/openai/baselines.git
    cd baselines
    
  3. Install TensorFlow:
    nvidia-smi   # make sure the Nvidia driver has been installed
    cat /usr/local/cuda/version.txt    # check the version of cuda
    cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2    # check the version of cudnn
    
    pip install tensorflow-gpu==1.13.1 -i https://pypi.doubanio.com/simple/    # a PyPI mirror can speed up the download
    
  4. Install baselines:
    pip install -e .
    
  5. Test the installation:
    pip install pytest    # pytest is used to run the 'test*' files in baselines
    pytest    # some tests may fail until extra packages such as matplotlib and pandas are installed
    
  6. Training models:
    python -m baselines.run --alg=<name of the algorithm> --env=<environment_id> [additional arguments]
    
    python -m baselines.run --alg=ppo2 --env=Humanoid-v2 --network=mlp --num_timesteps=2e7    # training PPO with MuJoCo Humanoid
    

Packages in the mujoco-gym virtual environment

After everything above is installed, the environment.yml of the environment is as follows:

name: mujoco-gym
channels:
  - defaults
dependencies:
  - ca-certificates=2019.5.15=0
  - certifi=2019.3.9=py36_0
  - libedit=3.1.20181209=hc058e9b_0
  - libffi=3.2.1=hd88cf55_4
  - libgcc-ng=9.1.0=hdf63c60_0
  - libstdcxx-ng=9.1.0=hdf63c60_0
  - ncurses=6.1=he6710b0_1
  - openssl=1.1.1c=h7b6447c_1
  - pip=19.1.1=py36_0
  - python=3.6.8=h0371630_0
  - readline=7.0=h7b6447c_5
  - setuptools=41.0.1=py36_0
  - sqlite=3.28.0=h7b6447c_0
  - tk=8.6.8=hbc83047_0
  - wheel=0.33.4=py36_0
  - xz=5.2.4=h14c3975_4
  - zlib=1.2.11=h7b6447c_3
  - pip:
    - absl-py==0.7.1
    - alabaster==0.7.12
    - astor==0.8.0
    - atari-py==0.1.15
    - atomicwrites==1.3.0
    - attrs==19.1.0
    - babel==2.7.0
    - backcall==0.1.0
    - bleach==1.5.0
    - cffi==1.12.3
    - chardet==3.0.4
    - click==7.0
    - cloudpickle==1.2.1
    - cycler==0.10.0
    - cython==0.29.10
    - decorator==4.4.0
    - docutils==0.14
    - enum34==1.1.6
    - filelock==3.0.12
    - future==0.17.1
    - gast==0.2.2
    - glfw==1.8.1
    - grpcio==1.20.1
    - gym==0.12.5
    - h5py==2.9.0
    - html5lib==0.9999999
    - idna==2.8
    - imagehash==4.0
    - imageio==2.5.0
    - imagesize==1.1.0
    - importlib-metadata==0.18
    - ipdb==0.12
    - ipython==7.5.0
    - ipython-genutils==0.2.0
    - jedi==0.13.3
    - jinja2==2.10.1
    - joblib==0.13.2
    - keras-applications==1.0.7
    - keras-preprocessing==1.0.9
    - kiwisolver==1.1.0
    - lockfile==0.12.2
    - markdown==3.1.1
    - markupsafe==1.1.1
    - matplotlib==3.1.0
    - mock==3.0.5
    - more-itertools==7.0.0
    - mpi4py==3.0.2
    - numpy==1.16.4
    - numpydoc==0.9.1
    - opencv-python==4.1.0.25
    - packaging==19.0
    - pandas==0.24.2
    - parso==0.4.0
    - pexpect==4.7.0
    - pickleshare==0.7.5
    - pillow==6.0.0
    - pluggy==0.12.0
    - prompt-toolkit==2.0.9
    - protobuf==3.7.1
    - ptyprocess==0.6.0
    - py==1.8.0
    - pycparser==2.19
    - pyglet==1.3.2
    - pygments==2.4.2
    - pyparsing==2.4.0
    - pytest==4.6.3
    - pytest-instafail==0.3.0
    - python-dateutil==2.8.0
    - pytz==2019.1
    - pywavelets==1.0.3
    - requests==2.22.0
    - scipy==1.3.0
    - six==1.12.0
    - snowballstemmer==1.2.1
    - sphinx==2.1.2
    - sphinx-rtd-theme==0.4.3
    - sphinxcontrib-applehelp==1.0.1
    - sphinxcontrib-devhelp==1.0.1
    - sphinxcontrib-htmlhelp==1.0.2
    - sphinxcontrib-jsmath==1.0.1
    - sphinxcontrib-qthelp==1.0.2
    - sphinxcontrib-serializinghtml==1.1.3
    - tensorboard==1.13.1
    - tensorflow-estimator==1.13.0
    - tensorflow-gpu==1.13.1
    - tensorflow-tensorboard==0.4.0
    - termcolor==1.1.0
    - tqdm==4.32.2
    - traitlets==4.3.2
    - urllib3==1.25.3
    - wcwidth==0.1.7
    - werkzeug==0.15.4
    - zipp==0.5.1
prefix: /home/csy/anaconda3/envs/mujoco-gym
