Seeking assistance with GTX 970 randomly crashing and black-screening PC.

Hi LTT forum gurus - long time listener, first time caller.

 

Seeking guidance on troubleshooting my MSI GeForce GTX 970 Gaming GPU, which is randomly crashing. Sometimes it crashes within minutes, sometimes I get a few hours. If I start to push the GPU, a crash usually occurs within 30-60 seconds.

 

Crashes have only occurred in the past ten days - started early Dec-2019. Really stable for years prior to this. Have made no hardware adjustments or modifications in the last year.

 

PC is still powered on, network continues, sound (MP3) continues, just no video (and no audio if it comes from a game).

 

On crash GPU still has power and fans are spinning.

 

Have to hard reboot to get the screens back. Occasionally, depending on the crash, UEFI asks to have settings adjusted or Windows 10 boots into recovery.


Any guidance and/or assistance is greatly welcomed.

 

Things that I have tried so far:

  1. Wound back mild GPU overclock.
  2. Upgraded NVIDIA driver to latest 441.66.
  3. Removed the driver with DDU and reinstalled the latest, downloaded directly from the NVIDIA site.
  4. Memtest PC RAM - no issues.
  5. Unigine Heaven benchmark crashed 7/10 benchmark runs.
  6. OCCT (v5.4.2) tests:
    • CPU OCCT (large data) over one hour - no issues.
    • CPU Linpack (2019) over one hour - no issues.
    • GPU 3D without error detection - crash within 45 seconds. See the OCCT image; note the temp is 56 degrees Celsius.
    • GPU 3D with error detection - crash within a minute.
    • GPU Memtest one hour - no issues.
  7. Thought I'd ask the community what I can try next.
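One way to read the results above is to tally, per component, how many runs that loaded it ended in a crash. The following is only a rough sketch: the pass/crash counts come from the list above, but the mapping of each test to the components it stresses is an assumption made for illustration, and the names `RESULTS`, `tally` and `rank` are hypothetical.

```python
from collections import defaultdict

# (test, components assumed to be stressed, runs, crashes)
# Counts are taken from the list above; the component mapping is an assumption.
RESULTS = [
    ("Memtest (system RAM)", ("RAM",), 1, 0),
    ("OCCT CPU large data, 1 h", ("CPU", "RAM", "PSU"), 1, 0),
    ("Linpack 2019, 1 h", ("CPU", "RAM", "PSU"), 1, 0),
    ("OCCT GPU 3D, no error detection", ("GPU core", "PSU"), 1, 1),
    ("OCCT GPU 3D, error detection", ("GPU core", "PSU"), 1, 1),
    ("OCCT GPU memtest, 1 h", ("VRAM",), 1, 0),
    ("Unigine Heaven", ("GPU core", "VRAM", "PSU"), 10, 7),
]


def tally(results):
    """Return {component: (runs, crashes)} summed over every test that loads it."""
    totals = defaultdict(lambda: [0, 0])
    for _name, components, runs, crashes in results:
        for component in components:
            totals[component][0] += runs
            totals[component][1] += crashes
    return {c: (runs, crashes) for c, (runs, crashes) in totals.items()}


def rank(results):
    """Order components by crash rate, highest first."""
    totals = tally(results)
    return sorted(totals, key=lambda c: totals[c][1] / totals[c][0], reverse=True)


for component in rank(RESULTS):
    runs, crashes = tally(RESULTS)[component]
    print(f"{component:9s} {crashes}/{runs} runs crashed")
```

Under these assumptions the GPU core and the PSU come out on top, and the tally cannot separate them, because every GPU 3D load also loads the PSU. With an intermittent fault a pass does not clear a component either, so this is a way of ordering suspects, not a diagnosis.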

N.B.

  • Ambient temp 20-25 degrees Celsius.
  • GPU temps > 60 degrees Celsius.

PC Specs:

  • i7 4790K.
  • 16 GB RAM.
  • MSI GeForce GTX 970 Gaming.
  • Corsair 750W RM750 Gold PSU.

 

Crash scenarios:

  1. Just after boot into windows.
  2. Surfing the net.
  3. Watching YouTube.
  4. Gaming - see the MSI Afterburner image of the second before a crash, about 45 seconds into a game; note the temp is 39 degrees Celsius.
  5. Idling.
  6. Any and all of the above in combination.
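Since the screens go black at the moment of the crash, a log written to disk can preserve the last readings without relying on a screenshot. Below is a minimal sketch, assuming `nvidia-smi` (installed with the NVIDIA driver) is on the PATH. The file name `gpu_log.csv` and the function names are hypothetical, and some cards report `[N/A]` for certain fields, which the parser turns into `None`.

```python
import csv
import os
import subprocess
import time
from datetime import datetime

# Query fields understood by nvidia-smi's --query-gpu option.
FIELDS = ["temperature.gpu", "power.draw", "clocks.sm", "utilization.gpu"]


def parse_sample(line):
    """Parse one 'csv,noheader,nounits' line into a dict of floats.

    Values the card does not report (for example "[N/A]") become None.
    """
    sample = {}
    for name, raw in zip(FIELDS, line.split(",")):
        try:
            sample[name] = float(raw.strip())
        except ValueError:
            sample[name] = None
    return sample


def read_gpu():
    """Run nvidia-smi once and return the reading for the first GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=" + ",".join(FIELDS),
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_sample(result.stdout.splitlines()[0])


def log_forever(path="gpu_log.csv", interval_s=1.0):
    """Append one timestamped row per interval until interrupted."""
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        while True:
            sample = read_gpu()
            stamp = datetime.now().isoformat(timespec="seconds")
            writer.writerow([stamp] + [sample[k] for k in FIELDS])
            f.flush()
            os.fsync(f.fileno())  # push the row to disk before the next sample
            time.sleep(interval_s)
```

Calling `log_forever()` in a terminal before starting a game or benchmark leaves the final rows in `gpu_log.csv` after the hard reboot. Each row is flushed and synced before the next sample, so a hard reboot should cost at most the sample being written at the time.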

 

Thanks in advance.

TK.
 

 

LTT OCCT GPU 3D test crash.PNG

LTT Afterburner crash.PNG


How are the temps of your CPU?

   @Whiro tag or quote will do the trick 
i5 3570K @ 4.7Ghz  |  AsRock Fatal1ty Z77 Performance  |  Corsair Vengeance 16GB 1600MHz  |  ASUS Strix GTX 970 OC  |  Phanteks P400S TG  (mesh panel) |  EVGA 500W1  |  Storage: Corsair 60GB SSD (boot), Gigabyte 120GB SSD, WD 2Tb HDD | Cooling: Custom loop

                EKWB EK-XRES 140 Revo D5 RGB PWM

                EKWB EK Supremacy Evo , naked die

                EKWB EK Thermosphere 

                EKWB EK CoolStream PE 360

                EKWB EK Coolstream SE 120

                EKWB EK Vardar 120s  x6

                EKWB EK STC Classic 10/16  x10

                EKWB EK DuraClear Tubing 16/10

                EKWB EK CryoFuel Acid Green


Laptop: Gigabyte G5-KC | i5 10500H | RTX 3060

                                          WHIRO

         THE FIRST OF DEATH AND DARKNESS

 

        He feast on the dead to inherit their power


CPU temps generally low.

  • Idle circa 28 degrees.
  • Gaming Minecraft circa 55 degrees.
  • Gaming Doom 2016 circa 60 degrees.
  • Gaming Project Cars circa 60 degrees.
  • OCCT and Linpack push the CPU to circa 65 degrees, with specific cores at 70-75 degrees, under consistent load for over an hour.

29 minutes ago, timkav said:

CPU temps generally low.

  • Idle circa 28 degrees.
  • Gaming Minecraft circa 55 degrees.
  • Gaming Doom 2016 circa 60 degrees.
  • Gaming Project Cars circa 60 degrees.
  • OCCT and Linpack push the CPU to circa 65 degrees, with specific cores at 70-75 degrees, under consistent load for over an hour.

The 441.66 driver was released on 10 December, so maybe the problem is with the driver. Try rolling back to the previous driver. Also make sure Windows is up to date with all updates.


Did you try to stress your system with a different OS in order to narrow down whether it's a hardware issue?

How old is your PSU?


Win10 all up to date.

Removed 441.66 with DDU and rolled back to NVIDIA driver version 441.20 (22-Nov-19).

Ran OCCT GPU 3D; it crashed within two minutes with the GPU temp in the mid 60s.


24 minutes ago, Sir0Tek said:

Did you try to stress your system with a different OS in order to narrow down whether it's a hardware issue?

How old is your PSU?

Haven’t tried a different OS yet. Any recommendations on a USB-bootable Linux distro? I have Debian in VirtualBox.


PSU is nearly five years old.


13 minutes ago, timkav said:

Haven’t tried a different OS yet. Any recommendations on a USB-bootable Linux distro? I have Debian in VirtualBox.


PSU is nearly five years old.

Having a different OS in VirtualBox doesn't help, since it runs within the environment that may be at fault.

I don't know if there's any live OS out there with NVIDIA drivers already included, but if you'd like to install one I'd suggest Mint (Ubuntu/Debian based, less geeky) or Manjaro. Both are easy to install, and both provide workable drivers in their repositories.

I would hope the lifespan of current Corsair PSUs isn't just 5 years, or is it?


17 minutes ago, Sir0Tek said:

Having a different OS in VirtualBox doesn't help, since it runs within the environment that may be at fault.

I don't know if there's any live OS out there with NVIDIA drivers already included, but if you'd like to install one I'd suggest Mint (Ubuntu/Debian based, less geeky) or Manjaro. Both are easy to install, and both provide workable drivers in their repositories.

I would hope the lifespan of current Corsair PSUs isn't just 5 years, or is it?

I’ll have a go tomorrow with a live Linux USB and see what happens.

 

Will post back then.

 

Thanks. 


Have been busy, here are the results.

 

  1. Tested hardware on a different OS: loaded Pop!_OS onto a USB and ran Unigine Heaven (Ultra) for two hours, including a benchmark. CPU 57 degrees, max 64. GPU 71 degrees, max 72.
    • Thought issue may lie with Win10. Wiped machine, formatted disk, loaded fresh Win10 and updated security packs.
    • Loaded MSI Afterburner, OCCT, and Unigine Heaven.
    • Was able to benchmark Unigine Heaven and run for 20 minutes. CPU 50 degrees, max 60. GPU 63 degrees, max 64. Lost screens about two minutes post run.
  2. Powered down machine and rebooted. Lost screens on windows log in.
  3. Powered down and rebooted, got a new, never seen before error message before the boot sequence: “Please power down and connect PCIe power cables to this graphics card.” Checked cables to GPU and PSU, nothing loose.
  4. Rebooted and ran OCCT GPU 3D for ten minutes, no screen loss.

 

After all that, I am unsure whether to trust the GPU and/or the PSU.

 

Thoughts? Different suggestions?


Basic troubleshooting is in order.

I'd start with trying a different PSU as the most likely candidate.

Then GPU. Then RAM. Then motherboard, and so on.

Doubt it's software related since you just reinstalled Windows.

And that error message seems to indicate a power issue, so start with the PSU.



Just finished an OCCT Power test - 30 minutes at 100% utilisation for both CPU and GPU.

CPU 81-82 degrees, maxed at 84.

GPU 65 degrees, maxed at 66.
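For a sense of scale on the PSU question, here is a rough sketch of the nominal load in a combined CPU and GPU test like this one. The figures are spec-sheet numbers, not measurements: 88 W is Intel's TDP for the i7 4790K and 145 W is NVIDIA's figure for the reference GTX 970. The MSI Gaming card and any overclock can draw more, and `REST_OF_SYSTEM_W` is purely a guess.

```python
# Nominal figures, not measurements.
CPU_TDP_W = 88          # Intel's TDP for the i7 4790K
GPU_TDP_W = 145         # NVIDIA's figure for the reference GTX 970
REST_OF_SYSTEM_W = 75   # assumption: motherboard, RAM, drives, fans
PSU_RATED_W = 750       # Corsair RM750


def estimated_load_fraction(cpu_w, gpu_w, rest_w, psu_w):
    """Return the estimated total draw as a fraction of the PSU rating."""
    return (cpu_w + gpu_w + rest_w) / psu_w


total_w = CPU_TDP_W + GPU_TDP_W + REST_OF_SYSTEM_W
fraction = estimated_load_fraction(CPU_TDP_W, GPU_TDP_W, REST_OF_SYSTEM_W, PSU_RATED_W)
print(f"Estimated load: {total_w} W, about {fraction:.0%} of the PSU rating")
```

On these assumptions the system sits well under half of the PSU rating, so capacity alone is unlikely to be the limit. It says nothing about an ageing or faulty unit, which only a swap test can show.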

 


39 minutes ago, Mark Kaine said:

I'd start with trying a different PSU as the most likely candidate.

Then GPU. Then RAM. Then motherboard, and so on.

Doubt it's software related since you just reinstalled Windows.

And that error message seems to indicate a power issue, so start with the PSU.

Thanks Mark, agree PSU and GPU swaps would be ideal.

 

Don't have another PC in the house that I can switch parts with. Would need to purchase. If I'm going to do that I would prefer to swap out and upgrade. This is when dollars become relevant to what to purchase and in what order.

 

 

 

 


  • 3 months later...

Hey, I'm having literally the same problem as you with this GPU, in the same situations. How did you solve it? Thanks


On 4/22/2020 at 11:19 AM, pu1uep said:

Hey, I'm having literally the same problem as you with this GPU, in the same situations. How did you solve it? Thanks

Hi pu1uep,

 

TL;DR: Purchased new graphics card for reliability.

 

 

Troubleshooting in the last post was down to hardware.

 

PSU has been ruled out as an issue (even though it is ageing). Trying different hardware has shown no issue with PSU.

 

GPU: I got some stability by giving the card a thorough clean, without pulling the card apart. Still had intermittent issues. Thinking about reapplying thermal paste etc., but have not yet actioned it. Purchased a new card to get up and running while I troubleshoot.

 

Intending to build a second PC with the old parts as a test environment/backup.

 

Not sure if this information helps.

 

Let me know if you have further specific questions.

 

TK.

 


On 4/23/2020 at 12:57 AM, timkav said:

Hi pu1uep,

 

TL;DR: Purchased new graphics card for reliability.

 

 

Troubleshooting in the last post was down to hardware.

 

PSU has been ruled out as an issue (even though it is ageing). Trying different hardware has shown no issue with PSU.

 

GPU: I got some stability by giving the card a thorough clean, without pulling the card apart. Still had intermittent issues. Thinking about reapplying thermal paste etc., but have not yet actioned it. Purchased a new card to get up and running while I troubleshoot.

 

Intending to build a second PC with the old parts as a test environment/backup.

 

Not sure if this information helps.

 

Let me know if you have further specific questions.

 

TK.

 

Hey man, thank you for the response! I'm not sure what the problem is yet, even though my PC crashes in the same situations as yours. It happens like this: usually it starts lagging a lot (like a huge FPS drop) and then it restarts by itself. Whenever I run OCCT memtest, the PC starts lagging the same way as it does when I have this issue, and that's why I suspect a graphics card VRAM problem. But since I'm on a budget, I really don't want to spend $$$ on a new graphics card only to find it wasn't the real problem. What do you think it might be? Was your scenario similar to what I described? Probably the graphics card, huh? Again, thanks for your time!


On 4/25/2020 at 11:25 AM, pu1uep said:

Hey man, thank you for the response! I'm not sure what the problem is yet, even though my PC crashes in the same situations as yours. It happens like this: usually it starts lagging a lot (like a huge FPS drop) and then it restarts by itself. Whenever I run OCCT memtest, the PC starts lagging the same way as it does when I have this issue, and that's why I suspect a graphics card VRAM problem. But since I'm on a budget, I really don't want to spend $$$ on a new graphics card only to find it wasn't the real problem. What do you think it might be? Was your scenario similar to what I described? Probably the graphics card, huh? Again, thanks for your time!

I think you are on to a GPU issue if OCCT memtest is surfacing the issue. Faulty hardware is well beyond my area of capability. May be worth hunting through the forum for potential stop-gap solutions.

 

If your issue is VRAM then a replacement card sounds likely.

 

Good luck.


8 hours ago, timkav said:

I think you are on to a GPU issue if OCCT memtest is surfacing the issue. Faulty hardware is well beyond my area of capability. May be worth hunting through the forum for potential stop-gap solutions.

 

If your issue is VRAM then a replacement card sounds likely.

 

Good luck.

Thanks for your help, brother. It was my first time writing in forums, so sorry if I did something wrong, like asking you about my problem in your topic.


  • 1 month later...

I have the same problem. This is new to my ancient 970. I think it may be an NVIDIA driver issue. You may want to get the newest driver or even use a driver that is 6 or 7 months old. See if that fixes the problem.

