
Help with RL using stable baselines 3 [Python]

ViniciusSilvestre

I'm trying to use SB3 for an RL project. The "issue" I'm facing is the following:

I have two GPUs in my system right now, an RTX 3070 and an RTX 2070 Super. The 3070 is running two instances of the env at the same time, while the 2070 is running one instance (a limitation of the way I'm doing a grid search for the optimal rewards).

During the episodes everything runs fine, but when the 2070's model reaches the rollout stage, all of its processing moves over to the 3070 and the 2070 sits at 0% usage. This happens even though I specified which device the model should run on.

The model is trying to learn how to play a game I developed. The way things are now it's still faster than running on just the 3070, because VRAM limits that card to two instances of the model in its 8 GB frame buffer, but I can't help feeling I'm leaving performance behind.

Any ideas what could be happening, and if so, how to fix it? I tried specifying the device more explicitly with:


import torch
from stable_baselines3 import PPO

model = PPO('MultiInputPolicy', env, verbose=1, tensorboard_log=log_path, device='cuda:1')
with torch.cuda.device('cuda:1'):
    model.learn(total_timesteps=80000, callback=callback)

but it didn't help.

Attachments: A1100a.txt, Env.py, PlagueGame.py, Run Model 1.py, Run Model 2.py, Run Model 3.py


One dirty hack you could do is to use Docker and specify which GPU the container is allowed to use.
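
If a container feels like overkill: with the NVIDIA Container Toolkit the Docker route is roughly docker run --gpus '"device=1"' <your image>. You can get the same per-process isolation in plain Python by hiding the other GPU before torch gets imported. A rough sketch (env, log_path and callback being whatever you already build in your scripts):

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "1"  # this process only sees the 2070, which now shows up as cuda:0

from stable_baselines3 import PPO

model = PPO('MultiInputPolicy', env, verbose=1, tensorboard_log=log_path, device='cuda:0')
model.learn(total_timesteps=80000, callback=callback)

With the other card hidden, nothing in that process can fall back onto the 3070 during the rollout/update phase.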

FX6300 @ 4.2GHz | Gigabyte GA-78LMT-USB3 R2 | Hyper 212x | 3x 8GB + 1x 4GB @ 1600MHz | Gigabyte 2060 Super | Corsair CX650M | LG 43UK6520PSA
ASUS X550LN | i5 4210u | 12GB
Lenovo N23 Yoga

