
BSOD - Nvidia GPU Sporadically Causing Instant PC Freeze

JasmeowTheCat

Hi All,

 

Hope you are well. It's time to reach out to the experts to see if we can solve this issue once and for all. I have had this system for about 3 years now, with one light hardware upgrade, but sadly I am now getting BSODs periodically. It's hard to explain briefly what has been happening, so please read on.

 

This is going to be quite an extensive post, to provide as much detail as possible and hopefully find the solution quickly. I am self-employed, so these crashes are impacting my business.

 

In simple terms, I know it's either:

 

The GPU

  • Either with the drivers from Nvidia, the 536.99 version previously installed or the 537.42 version installed currently.
  • The actual hardware failing itself - Due to age, due to the dust clear out and possibly nudged or shorted it.
  • I have NOT run a GPU test.

The RAM

  • The new RAM modules I have installed tripping a Windows timeout, causing a fault where the system halts.
  • Shorting something due to not being grounded while I was dusting out the system.
  • The BIOS update on my motherboard prior to the memory arriving. I did this to prepare for the new RAM, as I needed the BIOS update to get the new settings for XMP II instead of XMP I or Auto.
  • Tying into the above: The wrong BIOS configuration for the RAM itself. Perhaps I've set something that Windows does not like, or that stops it communicating with the GPU effectively.
  • Not seated properly - BUT not really possible as Windows detects all 32GB with no issues.
  • I have NOT run a memory test.

 


SYSTEM INFORMATION

  • OS: Windows 10 - Version 22H2 (OS Build 19045.3448)
  • Bit: x64
  • What OS Previously: No OS previously, Windows 10.
  • License: Retail/Volume
  • Age: 3 Years
  • Age Of OS: Original Install Date: 02/08/2020, 15:35:34 - No reinstalls done.
  • CPU model: Intel(R) Core(TM) i5-9400F CPU @ 2.90GHz, 2904 MHz, 6 Core(s), 6 Logical Processor(s)
  • Video Card model: NVIDIA GeForce RTX 2070 SUPER
  • MotherBoard: ROG STRIX Z390-F GAMING
  • Power Supply: Cannot Obtain Easily
  • System Manufacturer: Custom
  • Exact model number: N/A
  • L/D: Desktop
  • Screens: 3 - Left and middle in normal orientation, right one vertical. AOC 21.5" monitors.

 


STORY OF EVENTS

15th September - RAM ordered from Amazon. Corsair VENGEANCE RGB PRO DDR4 RAM 32GB (2x16GB) 3600MHz CL18.
 

16th September - RAM arrived and installed into desktop after dusting it inside and out after being all built up over 3 years. Must admit it looks good as new!

 

18th September - First "freeze": my mouse couldn't be moved, no windows could be interacted with, and there was no sound output. I was watching a YouTube video when the sound suddenly distorted and slowed down, then played a glitchy noise for about 3 seconds (the computery glitch sound, not the static white noise we know from TVs, kind of like this video: https://www.youtube.com/watch?v=_KlB2kFD3uU), then went silent. The monitors were still on and displaying normal colour at this stage, but with no motion or interaction. It's as if a "picture" of the GPU's last frame was stored.

After about 3-4 minutes, my vertical monitor turned into a horror film, see image: https://jasmeow.pics/DzG67R.png (Was talking to my best friend at the time, ignore the GIF in the chat lmfao) but my middle and left screens remained fine. I presume the GPU was slowly getting round to rendering a frame to each monitor after a LONG while, but I didn't wait long enough to find out.

 

Once that appeared on my monitor, about a minute later, I held the power button (thinking it would be long enough for a BSOD to dump or do what Windows needed to do) and got the system off. Turning it back on was all fine. I had recently updated the Nvidia driver from version 536.99 to version 537.42, so I assumed that was the cause, rolled back to 536.99 and went from there. I didn't think it was a RAM fault or something shorting at the time.

 

20th September - Happened again: the mouse froze and all screens locked up while still displaying a picture, so I knew something was up. I waited about 5 minutes and Discord turned into the same horror movie again lol, so I held the power button again and got the system off. After getting back online, I downloaded a safe tool (DDU, Display Driver Uninstaller) to uninstall all drivers, then reinstalled the latest 537.42 driver to start from scratch. I made sure Windows was updated, ran SFC (which found and successfully repaired some files), ran DISM to check health (no issues to report) and looked into Event Viewer.

 

22nd September - Just happened again but not a huge deal, I just shut off the system by holding the power button after about 2-3 minutes and came here to write up a post. I tried Ctrl Alt Del and a bunch of other things, but nothing on screen changed. It was completely "frozen" again, mouse included, while I was watching another YT video.

 

Note: I know I stated I was watching YT on two of the three BSODs, but I also had Minecraft and Discord open with Hardware Acceleration turned on on both occasions, so I presume all that "build up" caused it to commit die.

 


18th AND 20th September, both DPC Watchdogs. I presume this is due to the driver/GPU not responding in time and Windows halting the process.

EjY1Q5.png

zXpoUB.png

 

9vToXU.png

 


22nd September. Luckily this is when I managed to get the BSOD code (I know I can look in Event Viewer ha!) but I saw it happen on my left screen.
My middle monitor went a shade of green for about 2 seconds, all screens went blank for about 3 seconds, then the left screen turned back on and showed the BSOD.

CJQk2O.png

5jrqFL.png

 


24th September, just now. Ideally I don't wish to count this as a memory issue even though it states "MEMORY_MANAGEMENT". The RAM is brand new!

KOCnhc.png

W5n6Fy.png

 


I did find something interesting: NVLD starts complaining shortly before the BSODs, about 3 minutes before on one occasion and about 10 minutes before on both of the others.

eSgwh9.png

 

1Kremg.png
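One way to check whether the driver warnings really cluster before the crashes is to compare timestamps copied out of Event Viewer. This is only a minimal Python sketch: the function name `warnings_before`, the 15-minute window and every timestamp below are made up for illustration and are not taken from the screenshots or dumps.

```python
from datetime import datetime, timedelta

def warnings_before(crash_time, warning_times, window_minutes=15):
    """Return how long before the crash each warning inside the window occurred."""
    window = timedelta(minutes=window_minutes)
    return sorted(
        crash_time - w
        for w in warning_times
        if timedelta(0) <= crash_time - w <= window
    )

# Hypothetical timestamps for illustration only.
crash = datetime(2023, 9, 22, 14, 30)
warnings = [
    datetime(2023, 9, 22, 14, 20),  # 10 minutes before the crash
    datetime(2023, 9, 22, 14, 27),  # 3 minutes before the crash
    datetime(2023, 9, 22, 9, 0),    # hours earlier, outside the window
]

leads = warnings_before(crash, warnings)
for lead in leads:
    print(f"warning {lead.total_seconds() / 60:.0f} min before crash")
```

If warnings only ever show up in the minutes before a crash and never on quiet days, that would support the idea that they are related.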

 

Amazon Order of 32GB RAM:

z36ong.png

 

I cannot seem to get the RPM info, as it has been sitting on this screen for about 30 minutes now with no way of saving, so the 60-second collection timer is nonsense. Sorry I couldn't get this for you.

XlCe8n.png

 

MY NEXT STEPS
My next steps are the following...

Do I either:

  • A) Put the old RAM back in and set the BIOS back to what the configuration was and see if it sporadically happens again to prove it's not a RAM fault? With these BSOD's happening every 2 days, it's safe to say that if I do try this I will wait a week and see if it happens again as that will give me enough time to trial and error.
  • B) If it happens again, roll back the BIOS version (I have ZERO idea what it was previously unless it's stored on the flash it has) and see how it goes and see if it happens again? This option might not be viable.
  • C) Move the GPU to the 2nd PCI slot on the motherboard and see if it still does the same thing, basically "a little move" to see if it occurs again?
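As a rough sanity check on the "wait a week" idea in option A, here is a small Python sketch of how likely a crash-free week would be by pure luck. It assumes crashes arrive independently at a steady average of one every 2 days (a Poisson process), which is a simplification; the function names are made up for this example.

```python
import math

def chance_of_quiet_period(days_waited, mean_days_between_crashes):
    """Probability of zero crashes by luck alone, assuming a Poisson process."""
    expected_crashes = days_waited / mean_days_between_crashes
    return math.exp(-expected_crashes)

def days_needed(mean_days_between_crashes, max_false_clear=0.05):
    """Whole days to wait so a crash-free run is unlikely to be luck."""
    return math.ceil(-math.log(max_false_clear) * mean_days_between_crashes)

print(f"{chance_of_quiet_period(7, 2):.1%}")  # chance a clean week is just luck
print(days_needed(2))                         # days for under 5% chance of luck
```

Under that assumption a clean week would only happen by chance about 3% of the time, so a week per change is a reasonable trial length.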

 

The CPU has no integrated graphics, so I'm kinda screwed without a GPU. Everything's updated, so I am not sure what I should do with updates and drivers.

 

Any tips welcome. Thank you for reading. I have uploaded and included my DMP files from Windows. https://jasmeow.pics/DOQLtV.zip

 

I read through them on WinDbg and they gave me the same BugCheck codes.


Unfortunately it happened again. I have made no modifications yet, as I want to see what you guys suggest before I make any major changes to my system again.

 

I managed to catch the blue screen again: everything froze, but I could hear audio for about 20 seconds before all screens went blank, then the left monitor showed the BSOD.

 

Bugcheck: VIDEO TDR FAILURE

 

Ni0LGl.png

 

MemoryDMP File - https://drive.google.com/file/d/1jBmjUnZDeUmoGuGuvNH3HYtBvqD2sVR0/view?usp=drive_link


So it occurred again just now, about an hour and 30 minutes after the previous post, so I have updated the BIOS from 1802, which is the latest it could obtain from the internet, to 2004, which is the latest on the website for the ROG Strix Z390-F Gaming motherboard. I will keep you posted if it happens again.

 

The BSOD for this one was again DPC WATCHDOG VIOLATION.

3bFlmL.png
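To keep track of which stop codes have come up so far, here is a small Python tally. The hex values are the documented Windows bug check codes for these three names. The crash list is my reading of the posts above: the 22nd September crash is left out because its code is not named in the text, and "later" marks the two crashes reported without a date.

```python
from collections import Counter

# Documented Windows bug check codes for the names seen in this thread.
BUGCHECK_CODES = {
    "DPC_WATCHDOG_VIOLATION": 0x133,
    "VIDEO_TDR_FAILURE": 0x116,
    "MEMORY_MANAGEMENT": 0x1A,
}

# Crashes as described in the posts; "later" entries had no date given.
CRASHES = [
    ("18 Sep", "DPC_WATCHDOG_VIOLATION"),
    ("20 Sep", "DPC_WATCHDOG_VIOLATION"),
    ("24 Sep", "MEMORY_MANAGEMENT"),
    ("later", "VIDEO_TDR_FAILURE"),
    ("later", "DPC_WATCHDOG_VIOLATION"),
]

def tally(crashes):
    """Count how often each bug check name appears."""
    return Counter(name for _, name in crashes)

def format_code(name):
    """Render a name with its stop code in the usual 8-digit hex form."""
    return f"{name} (0x{BUGCHECK_CODES[name]:08X})"

for name, count in tally(CRASHES).most_common():
    print(f"{count} x {format_code(name)}")
```

Seeing one code dominate the list can help narrow down where to look first, although a mix of codes like this one is also common with unstable hardware.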


All the dump files point to the Nvidia driver. DPC_Watchdog_Violation is often drivers, but the Video_TDR_Failure is often a faulty GPU. If you haven't tried DDU I would do so. I have seen some instances of this happening with overheating so check the GPU core and memory temps, but there are no sensors for the VRMs. Opening it and cleaning it (Replacing the thermal paste, and pads if it has pads) could fix the issue. 
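For checking the temperatures, one option is the `nvidia-smi` tool that ships with the Nvidia driver. Below is a minimal Python sketch, assuming `nvidia-smi` is on the PATH and that the `temperature.gpu` and `temperature.memory` query fields are available on this driver; the memory field may come back as N/A on GeForce cards. The helper names `parse_temps` and `read_temps` are made up for this example.

```python
import subprocess

# Query fields passed to nvidia-smi (assumed to be supported by the driver).
QUERY = "temperature.gpu,temperature.memory"

def parse_temps(csv_line):
    """Parse one line like '67, [N/A]' into (core, memory) in degrees C.

    A field that is not a plain number becomes None.
    """
    def to_int(field):
        field = field.strip()
        return int(field) if field.isdigit() else None

    core, memory = csv_line.split(",")
    return to_int(core), to_int(memory)

def read_temps():
    """Run nvidia-smi once and return one (core, memory) tuple per GPU."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [parse_temps(line) for line in result.stdout.splitlines() if line.strip()]
```

Calling `read_temps()` every few seconds while gaming or watching video, and noting the last values before a freeze, would show whether the crashes line up with high temperatures.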


26 minutes ago, Bjoolz said:

All the dump files point to the Nvidia driver. DPC_Watchdog_Violation is often drivers, but the Video_TDR_Failure is often a faulty GPU. If you haven't tried DDU I would do so. I have seen some instances of this happening with overheating so check the GPU core and memory temps, but there are no sensors for the VRMs. Opening it and cleaning it (Replacing the thermal paste, and pads if it has pads) could fix the issue. 

Hey Bjoolz,

 

Thank you for the advice on what the errors mean. As stated in the original first post, I ran DDU completely and then reinstalled the latest driver. I have updated the motherboard BIOS from 1802 to 2004 and it hasn't happened yet at the time of writing, and I am doing the exact same things I was doing yesterday, so we will see what happens.

 

As suspected, if it is a faulty GPU, it's probably because I cleared out all the dust from the PC and perhaps shorted something in the process, so that some part of the GPU, like a memory chip, fails when it gets used and then crashes the entire thing. I'll see how it performs after this motherboard update, as I want to trial and error one thing at a time.

 

Regarding the thermal paste and pads, I have zero idea how to do that, so I'll look it up and work out what needs doing haha!


2 hours ago, JasmeowTheCat said:

Thank you for the advice on what the errors mean. As stated in the original first post, I ran DDU completely and then reinstalled the latest driver. I have updated the motherboard BIOS from 1802 to 2004 and it hasn't happened yet at the time of writing, and I am doing the exact same things I was doing yesterday, so we will see what happens.

Sorry, I just read this post. 

 

2 hours ago, JasmeowTheCat said:

Regarding the thermal paste and pads, I have zero idea how to do that, so I'll look it up and work out what needs doing haha!

Not that many cards have pads for the memory and VRMs, most just cool the core. Watching a teardown video beforehand so you know what you need is definitely recommended. 


  • 1 month later...

Honestly, I just gave up in the end and bought a new PC, as it was time for a new desktop anyway. I will try and buy a new GPU for that system and sell the parts, or use it as an old workstation if I require a quick PC where I cannot do my work.

 

Honestly, with the lack of replies on here, I have not felt confident going in and actually trying to see what I can do with the GPU. I am pretty annoyed that this fault started happening, and I really don't have the energy to properly diagnose it now that I am on this new PC.

 

Appreciate the help anyway.

