
Since I had already set up a Linux box for my 30c/60t server, I decided I might as well throw one of my lower-power GPUs in there and see what kind of PPD bump I'd get from going with Linux. After much trial and error I got all the drivers installed, OpenCL installed, and the slot to start folding. Yay! Then I noticed that the clocks were stuck at 1189 MHz instead of the max 1721 MHz that this little Quadro P2000 can do.

 

Long story short, I got everything configured with "nvidia-settings" and "nvidia-smi", but after folding for 20 minutes or so the card drops its clocks to 607 MHz (P8) and stays there. The following are the settings I configured to get it working at full potential. To be honest, I can't remember whether the problem was already happening when I left the card to do its own thing at 1189 MHz.

 

# Sets PowerMizer to "Prefer Maximum Performance" (aiming for P0)
nvidia-settings -a '[gpu:0]/GPUPowerMizerMode=1'

# Enables manual fan control
nvidia-settings -a '[gpu:0]/GPUFanControlState=1'

# Sets fan speed to 100%
nvidia-settings -a '[fan:0]/GPUTargetFanSpeed=100'

# Enables persistence mode (set explicitly in case it is not already on)
nvidia-smi -pm 1

# Sets "Application Clocks" to the highest (memory 3504 MHz, core 1721 MHz) - raises the clock above the standard 1189 MHz for CUDA workloads
nvidia-smi -ac 3504,1721
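
Before applying application clocks, it may be worth confirming that the driver actually lists the requested memory/core pair as supported. The following is only a sketch under assumptions: the function names `is_supported_pair` and `apply_app_clocks` are made up for illustration, GPU index 0 is assumed, and it relies on `nvidia-smi --query-supported-clocks=mem,gr` printing one "MEM, GR" pair per line when `nounits` is used.

```shell
# Reads "MEM, GR" pairs (one per line) on stdin and succeeds only if the
# requested pair is among them. Hypothetical helper, not part of nvidia-smi.
is_supported_pair() {
    grep -qx "$1, $2"
}

# Applies application clocks on GPU 0 only when the pair is listed as supported.
# Usage: apply_app_clocks 3504 1721   (memory MHz, core MHz; usually needs root)
apply_app_clocks() {
    if nvidia-smi -i 0 --query-supported-clocks=mem,gr \
            --format=csv,noheader,nounits | is_supported_pair "$1" "$2"; then
        nvidia-smi -i 0 -ac "$1,$2"
    else
        echo "clock pair $1,$2 is not in the supported list" >&2
        return 1
    fi
}
```

Calling `apply_app_clocks 3504 1721` would then either set the clocks or report that the pair is not offered by the driver, which helps rule out a silently rejected setting.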

 

I know these

I'm running Debian 10.1 with NVIDIA driver 418.74, the system does have a monitor attached, and I have tried lowering the clocks by 100 MHz. I know this card can handle the 1721 MHz clock, though, as it does so just fine when folding in the Windows 10 machine it came out of. I think I am in over my head and at the end of my Google-fu. :)

 

If anyone has any ideas I'd appreciate them.

Link to comment
https://linustechtips.com/topic/1119530-folding-under-linux-drops-gpu-clocks/

Yeah ... NVIDIA drivers are a pain. At least setting the application clocks works for the Teslas and Quadros. It doesn't work for GeForce Pascal/Turing cards, so on those you have to use nvidia-settings, which is an X application. While it does have a CLI variant, it expects a working X display and has a fit if it can't find one.

 

What I do in Ubuntu, which should also work for Teslas/Quadros under Debian, is:

 

Use nvidia-xconfig to generate an /etc/X11/xorg.conf file, then use ps -ef to find the X authority file (typically the path after -auth on the X server's command line), which can then be passed on the command line to nvidia-settings to set the GPU clock offset to the desired value.

 

ending up with:

DISPLAY=:0 XAUTHORITY=/run/user/121/gdm/Xauthority nvidia-settings -a [gpu:1]/GPUGraphicsClockOffsetAllPerformanceLevels=75

for one of my systems
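
As a rough sketch of the steps above, the X authority path can be pulled out of the `ps -ef` output and fed to nvidia-settings. The helper names `xauth_from_ps` and `set_clock_offset` are invented for this example, and it assumes the X server process is named Xorg, was started with an `-auth <path>` argument, and is serving display :0; adjust for your own setup.

```shell
# Prints the argument that follows "-auth" on the first matching line of
# ps output read from stdin. Hypothetical helper for illustration.
xauth_from_ps() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "-auth") { print $(i + 1); exit } }'
}

# Sets a graphics clock offset over SSH.
# Usage: set_clock_offset GPU_INDEX OFFSET_MHZ   e.g. set_clock_offset 1 75
set_clock_offset() {
    # '[X]org' keeps grep from matching its own process entry.
    auth=$(ps -ef | grep '[X]org' | xauth_from_ps)
    if [ -z "$auth" ]; then
        echo "no X authority file found; is X running?" >&2
        return 1
    fi
    DISPLAY=:0 XAUTHORITY="$auth" nvidia-settings \
        -a "[gpu:$1]/GPUGraphicsClockOffsetAllPerformanceLevels=$2"
}
```

Quoting the attribute string also stops the shell from treating the square brackets as a glob pattern.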

FaH BOINC HfM

Bifrost - 6 GPU Folding Rig  Linux Folding HOWTO Folding Remote Access Folding GPU Profiling ToU Scheduling UPS

Systems:

desktop: Lian-Li O11 Air Mini; Asus ProArt x670e WiFi; Ryzen 9 7950x; EVGA 240 CLC; 2 x 48GB DDR5-6000; 2 x Samsung 980 Pro 500GB PCIe3 NVMe; 2 x 8TB NAS; AMD FirePro W4100; Corsair SFF750

nas1: Fractal Node 804; SuperMicro X10sl7-f; Xeon e3-1231v3; 4 x 8GB DDR3-1666 ECC; 2 x 250GB Samsung EVO Pro SSD; 7 x 4TB Seagate NAS; Corsair HX650i

nas2: Synology DS-123j; 2 x 6TB WD Red Plus NAS

nas3: Synology DS-224+; 2 x 12TB Seagate NAS

dcn01: Fractal Meshify S2; Gigabyte Aorus ax570 Master; Ryzen 9 5900x; Noctua NH-D15; 4 x 16GB DDR4-3200; 512GB NVMe; 2 x Zotac AMP 4070ti; Corsair RM750Mx

dcn02: Fractal Meshify S2; Gigabyte ax570 Pro WiFi; Ryzen 9 3950x; Noctua NH-D15; 2 x 16GB DDR4-3200; 128GB NVMe; 2 x Zotac AMP 4070ti; Corsair RM750x

dcn03: Fractal Pop Silent XL; Gigabyte Aorus z370 Gaming 5; i9-9900k; BeQuiet! PureRock 2 Black; 2 x 8GB DDR4-2400; 128GB SATA m.2; MSI 4070 Ti Super Gaming X; MSI 4070 Ti Super Ventus 2; Corsair RM750e

dcn04: Fractal Meshify C; Gigabyte Aorus ax570 Master; Ryzen 9 5950x; BeQuiet! PureRock 2; 2 x 16GB DDR4-3200; 250GB NVMe; Asus RTX 4070 Dual OC; MSI 4070 Ti Super Ventus 2; Corsair TX650m

dcn05: Fractal Define S; Gigabyte Aorus b450m; Ryzen 7 2700; AMD Wraith; 2 x 8GB DDR 4-3200; 128GB SATA NVMe; Gigabyte Gaming RTX 4080 Super; Corsair TX750m

dcn06: Fractal Focus G Mini; Gigabyte Aorus b450m; Ryzen 7 2700; AMD Wraith; 2 x 8GB DDR 4-3200; 128GB SSD; Gigabyte Gaming RTX 4080 Super; Corsair CX650m


That'll help me do more of the configuration through SSH, which is what I was attempting to do. I didn't know enough about the X authority file, or even where to look or what it all meant. I'll check that out when I get back to work on Monday. As for the clock offset, would that be like overclocking the card? And would that affect which power state it is in?

 

For whatever reason, the issue I seem to be having now is that after about 20 minutes of folding the card drops from the maximum performance state (P0) all the way down to one of the lower power states (P8). When I queried the power states on this card with nvidia-smi, they ranged from P0 down to P12 as the lowest, so the state it's getting dropped into is really low.
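
One way to narrow down when and why the drop happens would be to log the performance state alongside clocks, temperature, power draw, and the throttle-reason bitmask. This is only a sketch: `flag_drop` and `watch_gpu` are names made up here, GPU index 0 and a 5-second interval are assumptions, and the field names are the ones listed by `nvidia-smi --help-query-gpu`.

```shell
# Fields to log for each sample; pstate is deliberately the second field.
QUERY="timestamp,pstate,clocks.sm,clocks.mem,temperature.gpu,power.draw,clocks_throttle_reasons.active"

# Prefixes any CSV line whose second field (pstate) is not P0 with "DROP: ".
flag_drop() {
    awk -F', ' '{ if ($2 != "P0") print "DROP: " $0; else print $0 }'
}

# Polls GPU 0 every 5 seconds until interrupted with Ctrl-C.
watch_gpu() {
    nvidia-smi -i 0 --query-gpu="$QUERY" --format=csv,noheader -l 5 | flag_drop
}
```

Running `watch_gpu` in a second terminal while folding should show what the temperature, power draw, and throttle-reason value look like around the moment the state leaves P0.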

 

It might be my own lack of knowledge when installing the necessary drivers while trying to get the FAH slot to start folding. After the folding event is over, I think I might wipe the machine and start over. Do you have a specific list of things you install when setting up a Linux folding rig? I tried to follow one of the tutorials on here for the most part, but because I was using Debian I think some extra steps are needed.


Yes, the nvidia-settings command is what is needed to overclock on Pascal/Turing, and AFAIK it should work for Tesla too. In the GeForce world you also have to set bits 2 and 3 of Coolbits to enable fan control and clock offsets, which does not appear to be necessary on the Quadros and Teslas.
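
For anyone unsure what "bits 2 and 3" means as a number, here is a small sketch that turns bit positions into the Coolbits value (bit 2 is 4, bit 3 is 8, so together 12). The function name `coolbits_value` is made up for this example; the commented-out line assumes nvidia-xconfig's `--cool-bits` option is used to write the value into xorg.conf.

```shell
# Combines the given bit positions into a single Coolbits integer.
# Usage: coolbits_value 2 3
coolbits_value() {
    v=0
    for bit in "$@"; do
        v=$(( v | (1 << bit) ))
    done
    echo "$v"
}

# Example (needs root, rewrites /etc/X11/xorg.conf, X restart required):
# nvidia-xconfig --cool-bits="$(coolbits_value 2 3)"
```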

 

Ubuntu is Debian-based, so the syntax should be almost identical, but the location of some files might be different.

 

The NVIDIA developer forums are where you can likely find more info on setting the performance states and making them stick. If you can't find the information there, you will likely get a quick answer to a post, as you're using a pro card.

 

Like you, I generally set Prefer Maximum Performance, a manual fan speed, and an overclock, and I haven't seen that behaviour where it drops down, but again it might be a feature of the pro cards. I've seen some posts where people had to set an application profile to get the P-state to stick.
