GPU driver crashes a lot lately

Solved by HeLiOn

Hello.
I have a Gainward GeForce RTX 4080 Super JetStream OC running on an Asus TUF GAMING B650-Plus.
For about a week now, the driver has been crashing randomly. The screens go black, and the fans spin up to max speed.
At first I thought it was a driver issue, so I reinstalled the driver. It didn't solve it.
Two days ago it crashed 5 or 6 times, so I tried removing the driver completely using DDU and installing an older driver version from December (566.36).
That didn't solve it either. Yesterday it also went out 5 or 6 times.
Then I booted my PC into Linux to see if it's an OS issue. I fired up Metro Exodus, but after the first game restart (after adjusting the settings) the driver crashed again.

I'm starting to have cold sweats here, because if this is a hardware issue and I have to send it back, the store will just give me my money back, and I won't find a 4080 Super again.
Is there anything else I can try here before giving up on this card completely?
If it's of any relevance, I'm not using HDR or any form of variable refresh rate.

 

https://linustechtips.com/topic/1605376-gpu-driver-crashes-a-lot-lately/

You could try the 4080 in another pc. Bring it over to a friend to test if you don't have a spare pc.

Also test another gpu in your pc if you can.


46 minutes ago, Mark Kaine said:

How do you know it's the driver? Or even the GPU at all?

Full specs?

They're hidden in my signature

1 minute ago, HeLiOn said:

They're hidden in my signature

Ok, I see. Have you tried disabling EXPO/XMP?

 

And, again I gotta ask.

1 hour ago, HeLiOn said:

the driver crashed again

What's the error message?

The direction tells you... the direction

-Scott Manley, 2021

 

 


1 hour ago, Mark Kaine said:

Ok, I see. Have you tried disabling EXPO/XMP?

I got that suggestion from discord earlier.
I did check the XMP profile and it was already disabled.
I don't know how long it has been like this, so for now I enabled it and I'll see if I get any more crashes.

 

1 hour ago, Mark Kaine said:

And, again I gotta ask.

What's the error message?

I don't know of any error message. When the crash occurs, my screens go black (they stop receiving input), and the fan speed maxes out.
If there is any error message, I'm not able to see it.


1 hour ago, Tan3l6 said:

How about the PSU?

It's a Be Quiet Dark Power Pro 12, 80+ Titanium, 1200W.
I'll add it to the specs.

2 hours ago, Mumintroll said:

You could try the 4080 in another pc. Bring it over to a friend to test if you don't have a spare pc.

Also test another gpu in your pc if you can.

If everything else fails, I'll try that.


2 hours ago, Mark Kaine said:

Ok, I see. Have you tried disabling EXPO/XMP?

I can now confirm that this happens regardless of whether XMP is active or not.


2 hours ago, HeLiOn said:

I don't know of any error message. When the crash occurs, my screens go black (they stop receiving input), and the fan speed maxes out.
If there is any error message, I'm not able to see it.

That's why I'm asking: you can't really know whether it's the GPU without a specific error message, and even then it's not always clear...

From your description this is either RAM, motherboard (BIOS) or some kind of software issue.

Is the latest BIOS installed?

Did you try DDU?

Have you run memtest64?

 

Also check event viewer for "critical" errors.

Especially WHEA errors but also everything else can be important if it's "critical".
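
If clicking through Event Viewer is a pain, the same check can be done on an exported list of events. Below is a minimal Python sketch of the filter described above, assuming the events have already been exported into dictionaries. The keys `level`, `source` and `message` are hypothetical names picked for illustration, not a Windows API, and the sample records are made up rather than taken from any machine in this thread.

```python
from typing import Dict, List

Event = Dict[str, str]


def is_interesting(event: Event) -> bool:
    """Return True for Critical-level events or anything from a WHEA source."""
    level = event.get("level", "").strip().lower()
    source = event.get("source", "").lower()
    return level == "critical" or "whea" in source


def filter_events(events: List[Event]) -> List[Event]:
    """Keep only the events worth posting when asking for help."""
    return [event for event in events if is_interesting(event)]


# Made-up sample records for illustration only (hypothetical data).
SAMPLE_EVENTS: List[Event] = [
    {"level": "Information", "source": "Service Control Manager",
     "message": "service started"},
    {"level": "Critical", "source": "Kernel-Power",
     "message": "system rebooted without cleanly shutting down first"},
    {"level": "Warning", "source": "WHEA-Logger",
     "message": "a corrected hardware error has occurred"},
]

if __name__ == "__main__":
    for event in filter_events(SAMPLE_EVENTS):
        print(f'{event["level"]:<12} {event["source"]:<14} {event["message"]}')
```

WHEA entries are kept even when they are only warnings, since corrected hardware errors can still hint at unstable RAM, CPU or PCIe links.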

 

 


3 hours ago, Mark Kaine said:

That's why I'm asking: you can't really know whether it's the GPU without a specific error message, and even then it's not always clear...

From your description this is either RAM, motherboard (BIOS) or some kind of software issue.

Is the latest BIOS installed?

Did you try DDU?

Have you run memtest64?

This thing occurs on both Linux and Windows.
That's why I assumed it's either a hardware issue, or something wrong at a driver level.
Or some weird stuff between the motherboard and the GPU...
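
On the Linux side, the NVIDIA kernel driver usually reports GPU faults as "Xid" lines in the kernel log (readable with `dmesg` or `journalctl -k`). Here is a minimal Python sketch for pulling those out of saved log text. It assumes the common `NVRM: Xid (PCI:...): <code>, ...` line shape, and the sample log is made up for illustration, not captured from this system.

```python
import re
from typing import List, Tuple

# Assumed line shape: "NVRM: Xid (PCI:0000:01:00): 79, pid=..., <description>"
XID_PATTERN = re.compile(r"NVRM: Xid \((?P<device>[^)]*)\): (?P<code>\d+)")


def find_xid_events(log_text: str) -> List[Tuple[str, int]]:
    """Return (device, xid_code) pairs for every Xid line found in the log text."""
    events = []
    for line in log_text.splitlines():
        match = XID_PATTERN.search(line)
        if match:
            events.append((match.group("device"), int(match.group("code"))))
    return events


# Made-up sample log for illustration only (hypothetical data).
SAMPLE_LOG = """\
[  100.000000] usb 1-1: new high-speed USB device number 2 using xhci_hcd
[  101.000000] NVRM: loading NVIDIA UNIX x86_64 Kernel Module
[ 1234.567890] NVRM: Xid (PCI:0000:01:00): 79, pid=1234, GPU has fallen off the bus.
"""

if __name__ == "__main__":
    for device, code in find_xid_events(SAMPLE_LOG):
        print(f"Xid {code} on {device}")
```

If something like Xid 79 ("GPU has fallen off the bus") showed up, that would tend to point at the card losing its PCIe link (power, seating or the slot) rather than at the OS.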

I have the latest BIOS installed. I did the upgrade 3 days ago in an attempt to solve this issue.
Also, I did try DDU. As mentioned in the OP, I used it to completely remove the latest Nvidia driver and then installed an older version from December.
I've now downloaded memtest64 and run 2 loops. I honestly don't know if I did this one right, but here's a screenshot with the results:

[Screenshot: memtest64 results]


3 hours ago, Mark Kaine said:

Also check event viewer for "critical" errors.

Especially WHEA errors but also everything else can be important if it's "critical".

I'm going to need some help with the event viewer. I have no idea how to navigate that thing.


1 hour ago, HeLiOn said:

This thing occurs on both Linux and Windows.
That's why I assumed it's either a hardware issue, or something wrong at a driver level.
Or some weird stuff between the motherboard and the GPU...

I have the latest BIOS installed. I did the upgrade 3 days ago in an attempt to solve this issue.
Also, I did try DDU. As mentioned in the OP, I used it to completely remove the latest Nvidia driver and then installed an older version from December.
I've now downloaded memtest64 and run 2 loops. I honestly don't know if I did this one right, but here's a screenshot with the results:

Keep in mind I didn't mean it can't be the GPU, it just doesn't sound like it to me... Did you run some benchmarks? I recommend 3DMark Fire Strike... it definitely tends to crash if "something" is wrong, so it's worth a try and might be interesting.

As for Event Viewer, it's super easy... Just open it; the first category is the overview...

[Screenshot: Event Viewer overview]

If there are any critical errors, just click on them and post a screenshot of the error descriptions.


2 hours ago, HeLiOn said:

I honestly don't know if I did this one right,

My bad, I meant memtest86; you gotta boot it from USB... (memtest64 isn't as good.) But it's a shot in the dark anyway: it's entirely possible to pass the test and still have RAM problems, since it's often a compatibility issue and the RAM may not be defective at all (and defects are what memtest tests for...)

https://www.memtest86.com/

 

 


2 hours ago, Mark Kaine said:

Keep in mind I didn't mean it can't be the GPU, it just doesn't sound like it to me... Did you run some benchmarks? I recommend 3DMark Fire Strike... it definitely tends to crash if "something" is wrong, so it's worth a try and might be interesting.

As for Event Viewer, it's super easy... Just open it; the first category is the overview...

If there are any critical errors just click on them and post a screenshot of the error descriptions.

Event Viewer only shows the forced restarts I have to do in order to get the PC running again.
During this last crash, I tried to see if I could still operate the PC through TeamViewer, but when the crash occurs, TeamViewer also stops receiving a video signal.
[Screenshot]


10 hours ago, HeLiOn said:

.

Use old 566.xx drivers; do not use any of the 57x.xx drivers, they have power issues imho. If it persists, then I hope it's something else and not permanent damage.

I'm still waiting to see if my 3080 Ti is cooked, and I'm already looking for a new GPU near MSRP.


1 hour ago, xg32 said:

Use old 566.xx drivers; do not use any of the 57x.xx drivers, they have power issues imho. If it persists, then I hope it's something else and not permanent damage.

I did try that already. Mentioned it in the original post.
 

4 hours ago, Mark Kaine said:

My bad, I meant memtest86; you gotta boot it from USB... (memtest64 isn't as good.) But it's a shot in the dark anyway: it's entirely possible to pass the test and still have RAM problems, since it's often a compatibility issue and the RAM may not be defective at all (and defects are what memtest tests for...)

https://www.memtest86.com/

I performed a memory test and it passed.

These are the results:

[Screenshots: memory test results]


4 hours ago, Mark Kaine said:

Did you run some benchmarks? I recommend 3DMark Fire Strike... it definitely tends to crash if "something" is wrong, so it's worth a try and might be interesting.

 

I can try, but I'm not sure if they'll be relevant here.
This crash is completely random.
Sometimes I get to game for hours and nothing happens, only for it to crash while I'm simply browsing the web.
 


18 hours ago, HeLiOn said:

The event viewer only shows the forced restarts I have to do, in order to get the PC running again.
This last crash I had, I tried to see if I can still operate the PC through Teamviewer, but when the crash occurs, Teamviewer also stops receiving video signal.

Tbh... it's probably RAM... I know you won't believe that, but after 100s... 1000s...? of very similar cases, it's either RAM or some driver issue. And since there's nothing in Event Viewer, RAM is just way more likely (sometimes there are WHEA errors, but here there's not even that).

A BIOS update could fix it, new RAM could, a new motherboard could, etc...

Of course it could be the GPU or CPU as well... It's just very, very unlikely, since nothing really points to that. Random crashes with no relevant traces? Yeah, it's probably the RAM/memory controller...

You probably think RAM "just works", but it's far from it...

If you get new RAM, do some research on what other people use with your mobo, etc.

Alternatively, just start swapping parts until you find the culprit, aka traditional troubleshooting (I just know I'd start with RAM lol).


5 hours ago, Mark Kaine said:

You probably think ram "just works" but it's far from it...

I have enough experience to know better.

And I did careful research before I even bought these parts.
I know, a lot of people get these things wrong, so I'd question this too.

5 hours ago, Mark Kaine said:

My specs are hidden in my signature. You just need to expand it.
And yes, the RAM I have is in there:
[Screenshot: RAM entry in the signature specs]

 

Since I couldn't afford to leave the PC like this until tomorrow (it would affect my work), I swapped GPUs with an older PC, like @Mumintroll suggested.
My old 2080 Super is in here now, and the 4080 Super is on a Gigabyte Aorus Gaming 7 motherboard along with an Intel i7 8700K.
I did the swap around 4 hours ago. Since then, that PC has stayed on, and I even went through the prologue of The Last of Us Part One at 4K max settings.
So far, there hasn't been an incident on either rig, but I'll keep things like this for another day or two and see if either setup acts up.

For now I'll just keep the thread open and come back with an update.


I can finally rule out a hardware defect (and sleep better at night).
Ever since I swapped the GPUs, I haven't had a single incident on either rig.
So yesterday I finally swapped them back, and so far everything has been going fine.

Since the crashes stopped after the first swap, I think I'll just chalk it up to the GPU being improperly seated somehow.
One of the PCIe contacts must not have been connecting properly, I don't know...
I'll post if I find something else, but for now I'll just enjoy the smooth sailing.
Thank you for the help, guys...
