πŸš€ PPO Agent playing LunarLander-v3

The GitHub repository contains a trained Proximal Policy Optimization (PPO) agent for the Box2D task LunarLander-v3 from Gymnasium.
The model is implemented and trained using the Stable-Baselines3 library.


πŸ“Š Performance

  • Environment: LunarLander-v3
  • Algorithm: PPO
  • Mean Reward: 289.24 Β± 12.88
  • Training Steps: 2.5M

πŸ§‘β€πŸ’» Training

You can run the training pipeline locally or in Colab.

Run in Colab

Click below to open the training notebook:
πŸ‘‰ Open Notebook in Colab

Run Locally

# Clone the repository
git clone https://github.com/AminVilan/RL-PPO-LunarLander-v3.git
cd RL-PPO-LunarLander-v3

# Open the notebook
jupyter notebook src/ppo_lunarlander_training.ipynb
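
Before opening the notebook, make sure the dependencies are available. The exact pinned versions live in the repository; as a minimal assumption, the code needs Gymnasium with Box2D support, Stable-Baselines3, and huggingface_sb3:

# Install a minimal set of dependencies (check the repo for the authoritative list)
pip install "gymnasium[box2d]" stable-baselines3 huggingface_sb3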

Using the Trained Model

The trained model is available on the Hugging Face Hub. You can load and run it directly:

import gymnasium as gym
from stable_baselines3 import PPO
from huggingface_sb3 import load_from_hub

# Download the checkpoint from the Hugging Face Hub and load it into a PPO model
repo_id = "AminVilan/ppo-LunarLander-v3"
filename = "v01-ppo-LunarLanderV3.zip"
checkpoint = load_from_hub(repo_id, filename)
model = PPO.load(checkpoint)

# Create environment
env = gym.make("LunarLander-v3", render_mode="human")

obs, info = env.reset()
terminated, truncated = False, False

# Run one episode; render_mode="human" already displays each step
while not (terminated or truncated):
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)

env.close()
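
To save a recording of the rollout instead of watching it live, Gymnasium's RecordVideo wrapper can write the episode to disk (it needs the moviepy package). This is a minimal sketch; the videos/ output folder is an arbitrary choice.

from gymnasium.wrappers import RecordVideo

# Record every episode to ./videos; RecordVideo needs render_mode="rgb_array"
env = gym.make("LunarLander-v3", render_mode="rgb_array")
env = RecordVideo(env, video_folder="videos", episode_trigger=lambda ep: True)

obs, info = env.reset()
terminated, truncated = False, False
while not (terminated or truncated):
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)

env.close()  # closing the env finalizes the video file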

πŸ™Œ If you find this useful, please give it a ⭐ on GitHub πŸ€—

