
Problem Statement:
While attempting to train a Convolutional Neural Network (CNN) model using TensorFlow/Keras, the model runs on the CPU instead of the GPU, causing high CPU usage and prolonged training time. This issue persists despite having a compatible GPU available.

Details:

  1. Environment Setup:

    • Framework: TensorFlow
    • GPU Hardware: GeForce RTX 4050 (mobile)
    • CPU Hardware: Intel i5-13420H
    • Operating System: Windows 11
    • Python Version: Tried both 3.10.5 and 3.12
    • Additional Libraries: Installed and set up CUDA, cuDNN
  2. Observed Behavior:

    • Upon running model.fit(x=x_train, y=y_train, batch_size=128, epochs=100, validation_data=(x_test, y_test)), the system utilizes the CPU at 100%, leading to slower training times.
    • Despite a compatible GPU being present, training does not use it: Task Manager and nvidia-smi show no GPU activity during training.
  3. Expected Behavior:

    • The model training should utilize the GPU to accelerate computations and reduce CPU usage.
    • With GPU-enabled training, the training process should be faster, and the CPU load should remain minimal.
  4. Attempted Solutions:

    • Reinstalled TensorFlow with GPU support (tensorflow-gpu).
    • Checked CUDA and cuDNN installation to ensure compatibility with the TensorFlow version.
    • Created a separate Anaconda environment for Python 3.10.5 and repeated the same steps.
    • Tried installing WSL (in case the setup wasn't compatible with Windows and needed Linux instead).
  5. Technical Constraints:

    • Ensuring TensorFlow, CUDA, and cuDNN versions are compatible.
    • Checking for any conflicting CPU-only dependencies that may override GPU settings.
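
Not part of the original post, but one way to narrow this down is to ask TensorFlow directly what it can see before calling model.fit. The sketch below is only an illustration: the helper name tensorflow_gpu_report is made up, and it assumes a TensorFlow 2.x install. An empty "gpus" list means Keras will quietly fall back to the CPU. As far as I know, TensorFlow 2.10 was the last release with GPU support on native Windows, so on newer versions an empty list would be expected outside WSL2.

```python
def tensorflow_gpu_report():
    """Return a small dict describing whether TensorFlow can see a GPU.

    Hypothetical helper for illustration; assumes TensorFlow 2.x if installed.
    """
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is missing (or its native libraries failed to load).
        return {"installed": False}

    gpus = tf.config.list_physical_devices("GPU")
    return {
        "installed": True,
        "version": tf.__version__,
        # False here means the installed wheel is a CPU-only build.
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # Empty list: TensorFlow found no usable GPU and will train on the CPU.
        "gpus": [gpu.name for gpu in gpus],
    }


if __name__ == "__main__":
    print(tensorflow_gpu_report())
```

If "built_with_cuda" is False, the CUDA/cuDNN install is not the problem; the TensorFlow package itself has no GPU support.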

trainmodel.ipynb


import torch
for i in range(torch.cuda.device_count()):
   print(torch.cuda.get_device_properties(i).name)

Punch these lines in and run them, and see what pops up. Basically it tells you which devices torch can choose from.

 

Fix the indentations yourself tho, copied it using a phone.

 

If it throws an exception, then perhaps your torch install isn't working.


Thanks @Tridefender, I ran it but it didn't print anything, so I ran:

import torch
print(torch.__version__)
print(torch.cuda.is_available())

and the output is:
2.2.1+cpu
1
False

So I get that something is wrong, but I don't know the solution 🙂
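
For anyone reading along, the version string above already hints at the cause. Official PyTorch wheels usually carry a build tag after the "+" ("cpu", or something like "cu121" for CUDA builds), so "2.2.1+cpu" points to a CPU-only install, which would also explain why the device loop printed nothing. Below is a rough sketch of reading that tag; build_flavor and describe are made-up helper names, and the tag convention is an assumption that may not hold for every wheel.

```python
def build_flavor(version):
    """Return the build tag after '+', e.g. 'cpu' or 'cu121', or None if absent."""
    _, sep, tag = version.partition("+")
    return tag if sep else None


def describe(version):
    """Give a one-line, best-effort reading of a torch version string."""
    tag = build_flavor(version)
    if tag == "cpu":
        return "CPU-only build: torch.cuda.is_available() will be False"
    if tag and tag.startswith("cu"):
        return "CUDA build (" + tag + ")"
    # Some wheels carry no tag at all, so the string alone is inconclusive.
    return "no build tag; check torch.version.cuda instead"


print(describe("2.2.1+cpu"))
```

With a CPU-only wheel, no amount of CUDA/cuDNN setup will make the GPU show up; the package itself has to be replaced with a CUDA build.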


Are you sure the environment has CUDA? Because it says False.
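
A quick way to check which environment is actually running, and which packages it has, is sketched below. It uses only the standard library; installed_versions is a hypothetical helper name, and the package names in the default tuple are just the ones mentioned in this thread.

```python
import sys
from importlib import metadata


def installed_versions(names=("torch", "tensorflow", "tensorflow-gpu")):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


# The interpreter path shows which (conda) environment the notebook really uses.
print("interpreter:", sys.executable)
for name, version in installed_versions().items():
    print(name, "->", version or "not installed")
```

Run it inside the same notebook kernel as the training code; if the interpreter path is not the environment where CUDA was set up, that mismatch is the first thing to fix.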

