Consequences of undervolting (other than crashing and artifacts for GPUs)

Hello,

I am quite familiar with undervolting in the context of gaming: I regularly undervolt my CPU with ThrottleStop and my GPU with MSI Afterburner. In fact, due to the horrible design of my ASUS ROG GL702VSK (i7-7700HQ, Nvidia GTX 1070), the computer crashes from power shortage under intense, sustained load unless I undervolt the GPU.

 

Before you say thermals: I do know what I am talking about. The crashes are due to power load, not temperature. The PSU can only supply 230 W by design. On my AW15 or Lenovo W530/T430, the battery drained whenever the load exceeded what the PSU could supply, but on this laptop the battery does not drain under excessive power draw; the laptop just crashes.

 

Anyway, undervolting is necessary for my gaming laptop, but I also do it on my work and travel laptops simply because they run cooler and the battery lasts longer.

 

Now, the fun part. I've been programming in C for a little over a decade, and I sometimes draft my code on my laptop while the actual workhorse is my desktop. I suspect that undervolting is causing some funky undefined behaviour in code that compiles and works perfectly fine on a dedicated desktop (with the same compiler and package versions).

 

Out of curiosity, I compared runs: the code yields different results with and without ThrottleStop (and the undervolt) engaged. I don't see any definite hits on the topic from a web search. Similarly, there are undervolting configurations that work perfectly fine for gaming but cripple CUDA programming (very oddly timed race conditions).

 

Has anyone else experienced reproducible consequences of undervolting other than crashes, freezes, or on-screen artifacts?

For the love of God, if you do not have the attention span to read a full post, take a rest instead of wasting both of our time replying to irrelevant context within the post.


The problem here is that your PSU isn't adequate (if your observations are correct), and undervolting isn't guaranteed to work on every system or component.

I.e., if you're getting artifacts you'll likely need more voltage; if your PSU cannot provide that, you need a better PSU or less power-hungry (voltage-hungry) components.

Hope this helps? I'm not entirely sure what you're asking.

The direction tells you... the direction

-Scott Manley, 2021

 

Softwares used:

Corsair Link (Anime Edition) 

MSI Afterburner 

OpenRGB

Lively Wallpaper 

OBS Studio

Shutter Encoder

Avidemux

FSResizer

Audacity 

VLC

WMP

GIMP

HWiNFO64

Paint

3D Paint

GitHub Desktop 

Superposition 

Prime95

Aida64

GPUZ

CPUZ

Generic Logviewer

 

 

 


21 minutes ago, SJLPHI said:

Has anyone else experienced reproducible consequences of undervolting other than crashes, freezes, or on-screen artifacts?

I haven't, but mostly because I could never be arsed to undervolt any of my current systems.

Anyhow, it just shows that your undervolt isn't stable. Even if it's not causing your system to crash, it is causing it to spew wrong results. Try increasing the voltage a bit until you stop noticing such errors.

FX6300 @ 4.2GHz | Gigabyte GA-78LMT-USB3 R2 | Hyper 212x | 3x 8GB + 1x 4GB @ 1600MHz | Gigabyte 2060 Super | Corsair CX650M | LG 43UK6520PSA
ASUS X550LN | i5 4210u | 12GB
Lenovo N23 Yoga


8 minutes ago, Mark Kaine said:

The problem here is that your PSU isn't adequate (if your observations are correct), and undervolting isn't guaranteed to work on every system or component.

I.e., if you're getting artifacts you'll likely need more voltage; if your PSU cannot provide that, you need a better PSU or less power-hungry (voltage-hungry) components.

Hope this helps? I'm not entirely sure what you're asking.

You are stuck on my gaming computer. I don't program on it, so it's irrelevant to the question; I only mentioned it to show that I undervolt regularly.


4 minutes ago, igormp said:

I haven't, but mostly because I could never be arsed to undervolt any of my current systems.

Anyhow, it just shows that your undervolt isn't stable. Even if it's not causing your system to crash, it is causing it to spew wrong results. Try increasing the voltage a bit until you stop noticing such errors.

Yes, that's exactly what I figured, but I want a good way to find out whether my current configuration is in a non-crashing but unstable undervolt state.

The consequences are quite nasty, especially because they aren't obvious.


Is the program just slowing down like crazy? That is a phenomenon known as clock stretching, and it can happen depending on how the undervolt is done. Clock speeds remain high, but performance tanks out of nowhere. Ryzen CPUs do it a ton if you set a negative voltage offset, and Intel chips can do the same (albeit not as commonly).

If it's giving different results though, that just sounds like it's unstable, which I've seen happen a couple of times with CPU overclocks.

Edited by RONOTHAN##

1 minute ago, RONOTHAN## said:

Is the program just slowing down like crazy? That is a phenomenon known as clock stretching, and it can happen depending on how the undervolt is done. Clock speeds remain high, but performance tanks out of nowhere. Ryzen CPUs do it a ton if you set a negative voltage offset, and Intel chips can do the same (albeit not as commonly).

In some cases, yes, I think it's clock stretching.

It depends. With CUDA, for example, the race makes the operation stupidly slow because some of the cores fail to return a result in time; the whole GPU sits at a high clock speed but almost nothing happens.

On the CPU, I've seen this behaviour only 2 or 3 times: usage is at 100% and it runs at ~4 GHz on turbo, but it yields a stupidly low number of results.


30 minutes ago, SJLPHI said:

In fact, due to the horrible design of my ASUS ROG GL702VSK (i7-7700HQ, Nvidia GTX 1070), the computer crashes from power shortage under intense, sustained load unless I undervolt the GPU.

I think your issue lies somewhere else... On an adjacent model with the same specs (https://www.notebookcheck.net/Asus-Strix-GL702VS-7700HQ-FHD-GTX-1070-Xotic-PC-Edition-Notebook-Review.207150.0.html - A GL702VS instead of GL702VSK): "Power consumption is to be expected with this level of hardware. The GL702VS idles at about 23 Watts and caps out at 188 Watts. The included 230 Watt power supply provides about 20% more power than the device pulls under load, giving the notebook a sufficient amount of headroom."

Do you have something like a Kill-A-Watt (measures wattage/current pulled from the wall) that can confirm it's trying to pull more wattage than the PSU can supply just before it cuts off? 
 

Outside of that, as everyone else said the next likely culprit is just normal instability. You can have an unstable OC or undervolt without the system consistently BSODing. 

Intel HEDT and Server platform enthusiasts: Intel HEDT Xeon/i7 Megathread 

 

Main PC 

CPU: i9 7980XE @4.5GHz/1.22v/-2 AVX offset 

Cooler: EKWB Supremacy Block - custom loop w/360mm +280mm rads 

Motherboard: EVGA X299 Dark 

RAM:4x8GB HyperX Predator DDR4 @3200Mhz CL16 

GPU: Nvidia FE 2060 Super/Corsair HydroX 2070 FE block 

Storage:  1TB MP34 + 1TB 970 Evo + 500GB Atom30 + 250GB 960 Evo 

Optical Drives: LG WH14NS40 

PSU: EVGA 1600W T2 

Case & Fans: Corsair 750D Airflow - 3x Noctua iPPC NF-F12 + 4x Noctua iPPC NF-A14 PWM 

OS: Windows 11

 

Display: LG 27UK650-W (4K 60Hz IPS panel)

Mouse: EVGA X17

Keyboard: Corsair K55 RGB

 

Mobile/Work Devices: 2020 M1 MacBook Air (work computer) - iPhone 13 Pro Max - Apple Watch S3

 

Other Misc Devices: iPod Video (Gen 5.5E, 128GB SD card swap, running Rockbox), Nintendo Switch


2 minutes ago, Zando_ said:

I think your issue lies somewhere else... On an adjacent model with the same specs (https://www.notebookcheck.net/Asus-Strix-GL702VS-7700HQ-FHD-GTX-1070-Xotic-PC-Edition-Notebook-Review.207150.0.html - A GL702VS instead of GL702VSK): "Power consumption is to be expected with this level of hardware. The GL702VS idles at about 23 Watts and caps out at 188 Watts. The included 230 Watt power supply provides about 20% more power than the device pulls under load, giving the notebook a sufficient amount of headroom."

Do you have something like a Kill-A-Watt (measures wattage/current pulled from the wall) that can confirm it's trying to pull more wattage than the PSU can supply just before it cuts off? 
 

Outside of that, as everyone else said the next likely culprit is just normal instability. You can have an unstable OC or undervolt without the system consistently BSODing. 

Let me say it again: this thread has nothing to do with my ASUS ROG. Please read the full post.


1 minute ago, SJLPHI said:

Let me say it again: this thread has nothing to do with my ASUS ROG. Please read the full post.

Ohhh... It's very confusing to name the laptop you don't have issues with, but then not name the one you are having issues with (as far as I can tell, the two other named laptops are past ones? Or are they current, and the work laptops you mention later?). 

If I am reading it correctly now: yeah, unstable undervolt. Turn voltage up until the laptop behaves normally again, or just disable the undervolt when compiling/rendering anything. 


18 minutes ago, SJLPHI said:

Yes, that's exactly what I figured, but I want a good way to find out whether my current configuration is in a non-crashing but unstable undervolt state.

The consequences are quite nasty, especially because they aren't obvious.

Seems like you already found a good benchmark for you. Keep bumping up the voltage until you stop getting such undefined behaviours.


1 minute ago, igormp said:

Seems like you already found a good benchmark for you. Keep bumping up the voltage until you stop getting such undefined behaviours.

Well... it's not a reliable benchmark.

 

So, there's a set of programs I kind of use as a benchmark:

1. Parallel random number generation; if any result is null = unstable.

2. Same with a fixed seed, repeated 10 times; calculate stats on each run, and if they differ = unstable.

3. Memory alloc/dealloc back and forth, looking for a runaway "RAMshark" memory leak.
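
A minimal sketch of test 2 above, written in Python rather than the language of the original tests. The function names `run_once` and `find_mismatches` and the sample sizes are assumptions for illustration, not the original code. It draws floats from a fixed seed, reduces them to a mean and variance, and compares the results bit for bit across repeats; on a healthy machine the same seed should give identical bits every time.

```python
import math
import random
from functools import partial


def run_once(seed, count=100_000):
    """Return the bit-exact (mean, variance) of `count` floats from a fixed seed."""
    rng = random.Random(seed)
    samples = [rng.random() for _ in range(count)]
    mean = math.fsum(samples) / count
    var = math.fsum((x - mean) ** 2 for x in samples) / count
    # float.hex() keeps every bit, so even a single flipped bit shows up.
    return mean.hex(), var.hex()


def find_mismatches(seeds, repeats=10, count=100_000, map_fn=map):
    """Run every seed `repeats` times; return the seeds whose stats ever differ."""
    seeds = list(seeds)
    job = partial(run_once, count=count)
    reference = list(map_fn(job, seeds))  # first pass is the baseline
    bad = set()
    for _ in range(repeats - 1):
        for seed, ref, got in zip(seeds, reference, map_fn(job, seeds)):
            if got != ref:
                bad.add(seed)
    return sorted(bad)


if __name__ == "__main__":
    print("mismatching seeds:", find_mismatches(range(8)))
```

By default this runs on one core through the built-in `map`. To load every core, pass the `map` method of a `concurrent.futures.ProcessPoolExecutor` as `map_fn`. An empty result only says that no mismatch was seen in these runs, not that the undervolt is stable.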

 

What worries me very much is that a configuration will pass all three tests 10 times and then fail on the 11th; then I crank up 10 mV and hope for the best...

The worst part is that I don't have a good benchmark, because it may even work 100 times and then screw up on the 101st...
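
One way to put numbers on "passes 10 times, fails on the 11th": if runs are treated as independent with a fixed per-run failure probability (an assumption that real hardware may not honour), then n clean runs in a row only bound that probability from above at 1 - 0.05^(1/n) with 95% confidence, which is roughly 3/n. A small sketch, with hypothetical helper names:

```python
import math


def max_failure_rate(passes, confidence=0.95):
    """Upper bound on the per-run failure probability after `passes` clean runs."""
    if passes <= 0:
        return 1.0
    return 1.0 - (1.0 - confidence) ** (1.0 / passes)


def runs_needed(target_rate, confidence=0.95):
    """Clean runs needed before the bound drops to `target_rate`."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - target_rate))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, "clean runs -> failure rate could still be up to",
              round(max_failure_rate(n), 4))
    print("runs needed to bound the rate at 1%:", runs_needed(0.01))
```

Under that assumption, 10 clean runs still allow a failure rate of about 26% per run, and 100 clean runs about 3%, so a failure on the 11th or 101st run is not surprising.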


15 minutes ago, Zando_ said:

Ohhh... It's very confusing to name the laptop you don't have issues with, but then not name the one you are having issues with (as far as I can tell, the two other named laptops are past ones? Or are they current, and the work laptops you mention later?). 

If I am reading it correctly now: yeah, unstable undervolt. Turn voltage up until the laptop behaves normally again, or just disable the undervolt when compiling/rendering anything. 

The ASUS is my dedicated entertainment laptop; it's strictly for gaming. As for my work laptops (a handful of Lenovos), I don't currently have an issue, but I do see the undefined behaviour during an "unstable undervolt" without any crash or freeze. I want to know what else to expect from unstable undervolting and, better yet, a way to benchmark it, because even in these unstable undervolt states the machine passes stress tests every time.

Right now, I just draft my code on the laptops and send it to my non-undervolted work desktop to do the job properly. If I notice that something isn't right on the undervolted laptop, I crank up 10 mV and hope for the best, but is that enough? I don't know!


1 minute ago, SJLPHI said:

I don't currently have an issue, but I do see the undefined behaviour during an "unstable undervolt" without any crash or freeze.

If your laptop is being weird, and the only change you made was undervolt, then yeah you have an issue with the undervolt. 

2 minutes ago, SJLPHI said:

I crank up 10 mV and hope for the best, but is that enough?

Probably. Keep doing that till you stop seeing strange behavior. Or disable it completely, make sure it isn't behaving strangely at stock settings (that would denote an issue somewhere else), and then if it's fine start to slowly push it back down until it starts being strange again. 

Personally I would just leave it stock, I don't like having to troubleshoot machines I want to use for work. 


1 minute ago, Zando_ said:

If your laptop is being weird, and the only change you made was undervolt, then yeah you have an issue with the undervolt. 

Probably. Keep doing that till you stop seeing strange behavior. Or disable it completely, make sure it isn't behaving strangely at stock settings (that would denote an issue somewhere else), and then if it's fine start to slowly push it back down until it starts being strange again. 

Personally I would just leave it stock, I don't like having to troubleshoot machines I want to use for work. 

Yeah, all of my critical machines are left stock just for safety.

 

The worst part is that the strange behaviour is virtually undetectable unless I do very specific things with the machine and there's a chunk of unknown unknowns I don't want to find out the hard way.

 

There's very little documentation on the undefined behaviour caused by undervolting or on how to test for it, and I was hoping to find people who have some good ideas on what to expect and how to test it, because I do really want to benefit from undervolting on all of my machines.


1 minute ago, SJLPHI said:

The worst part is that the strange behaviour is virtually undetectable unless I do very specific things with the machine and there's a chunk of unknown unknowns I don't want to find out the hard way.

Yes. It's the same reason manually tuning RAM can be torture, it can cause all sorts of really weird issues without actually crashing the machine. 

2 minutes ago, SJLPHI said:

I was hoping to find people who have some good ideas on what to expect and how to test it, because I do really want to benefit from undervolting on all of my machines.

As said. Revert to stock -> verify it's fine. If so, go back to your undervolt -> raise voltage slightly. If it's weird again then raise voltage further. Keep doing that till it stops being weird (or you're back to stock voltage).
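
The loop described above can be sketched roughly as follows. `is_stable` is a hypothetical callback: in practice it means applying the offset by hand in whatever undervolting tool is in use and running the chosen tests, since nothing here sets a voltage. The function name and the 10 mV default step are likewise assumptions.

```python
def find_stable_offset(is_stable, start_mv, step_mv=10):
    """Walk a negative voltage offset back toward stock until the tests pass.

    `is_stable(offset_mv)` is a hypothetical callback that returns True when
    the machine behaves normally at that offset (0 means stock voltage).
    Returns the first passing offset, or None if even stock misbehaves.
    """
    if not is_stable(0):
        return None  # weird at stock: the issue is somewhere else
    offset = start_mv
    while offset < 0:
        if is_stable(offset):
            return offset
        offset = min(0, offset + step_mv)  # raise voltage slightly and retry
    return 0  # ended up back at stock voltage


if __name__ == "__main__":
    # Pretend anything deeper than -85 mV misbehaves.
    print(find_stable_offset(lambda mv: mv >= -85, start_mv=-120))
```

The result is only as trustworthy as the tests behind `is_stable`; a test that passes by luck will stop the walk too early.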

 

Prime95 Small FFTs is the best AVX burst load for testing initial stability; if it can pass 15 minutes of that, it's probably good. If you can leave it running the Blend preset overnight (or at least a good few hours) and it passes with no errors, then it's even more likely to be stable. Worth noting that both of these will also slam the CPU to very high temps, usually ones you can't replicate with an actual real-world load (trying to open a massive sample asset in Unreal Engine 4 is the only task I have ever seen push my CPU as hard as Prime95 does).


1 hour ago, SJLPHI said:

Well... it's not a reliable benchmark.

 

So, there's a set of programs I kind of use as a benchmark:

1. Parallel random number generation; if any result is null = unstable.

2. Same with a fixed seed, repeated 10 times; calculate stats on each run, and if they differ = unstable.

3. Memory alloc/dealloc back and forth, looking for a runaway "RAMshark" memory leak.

What worries me very much is that a configuration will pass all three tests 10 times and then fail on the 11th; then I crank up 10 mV and hope for the best...

The worst part is that I don't have a good benchmark, because it may even work 100 times and then screw up on the 101st...

That's the problem with running stuff out of spec: it's not guaranteed to work. You could try Prime95 or some other benchmarks too if you want, or just go back to stock and slowly keep decreasing the voltage (let's say, 10 mV every week or so) until you notice errors again.


1 minute ago, igormp said:

That's the problem with running stuff out of spec: it's not guaranteed to work. You could try Prime95 or some other benchmarks too if you want, or just go back to stock and slowly keep decreasing the voltage (let's say, 10 mV every week or so) until you notice errors again.

Yeah, I was hoping you guys had some standard tricks. In these states, the computer handles standard stress tests fine, like the prime number generator and the built-in benchmarking tools in XTU and ThrottleStop, but at some point something like opening the Start menu becomes staggeringly slow, or one of my programs misbehaves... Every time I see something like that, I crank up 10 mV just to be safe. Especially on the Lenovos, anything that lowers the temperature is a big deal for me.


  • 3 weeks later...

I just noticed that on my primary laptop (mobile Xeon CPU with ECC RAM), while booted into Windows 10, File Explorer crashed and things got laggy until a reboot. In addition, some files I cut and pasted were deleted but never pasted, and I had to recreate them. I am also noticing very brief moments where the laptop's built-in 4K display flickers with vertical lines; if I am not mistaken, it resembles GPU artifacts. These problems go away after a reboot. I've been using the laptop to write technical documents of up to 100 MB in LaTeX for about 12 hours a day.

Just in case, I am going to crank up another +10 mV, and I am hoping that this is due to undervolting.

The weirdest part is that the load isn't even that high. I have run higher loads for longer periods on the current setting.

Would you suspect anything else if you were me? Do you think it's worth looking into my SSDs and the monitor?

The Windows 10 environment I am using when I see these problems boots from a WD Blue SN550 2TB, and the data I am working with is on a WD Red SA500 2.5" 2TB SSD; the botched file transfer went from the SN550 to the SA500.

After I finish the current workload, I can check whether I see the problem on the Linux side; better yet, I am considering just disabling the undervolt altogether.
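
Until the cause is found, one possible guard against the "deleted but never pasted" failure is to copy first, compare checksums, and delete the source only on a match. A minimal sketch, assuming SHA-256 is good enough for the purpose; `sha256_of` and `verified_move` are hypothetical helper names, not part of any tool mentioned in this thread.

```python
import hashlib
import os
import shutil


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verified_move(src, dst):
    """Copy src to dst, compare SHA-256 digests, and delete src only on a match."""
    before = sha256_of(src)
    shutil.copy2(src, dst)  # copies file contents and metadata
    if sha256_of(dst) != before:
        raise OSError(f"checksum mismatch copying {src} to {dst}; source kept")
    os.remove(src)
    return before
```

Note that reading the destination straight after the copy may be served from the OS cache rather than the SSD, so a match does not prove the data reached the drive intact, and an unstable CPU can miscompute the hash itself. It does at least keep the source file around whenever something visibly goes wrong.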
