RTX 3080 Crash

sindain

OK, so this is getting more attention and I'm dealing with a few customers with the issue, so I'd like to get some proper info to take to a rep at Nvidia if needed.

So let's stack the info.

 

https://strawpoll.com/5fpuga3xa

Don't hate the game, stab the player.


44 minutes ago, boasty said:

Isn't it the memory modules getting too hot under boost clock? Card temperature might be showing OK, but perhaps the memory is getting too hot. There have been tests done with the GDDR6X memory on the 3080, and some of the chips were getting over 100 degrees. That might be enough to create a memory error and crash to desktop. The article I read about the issue:

https://www.igorslab.de/en/gddr6x-am-limit-ueber-100-grad-bei-der-geforce-rtx-3080-fe-im-chip-gemessen-2/

 

Hopefully it's not that and just a driver issue but I think it's a useful article and worth knowing about..

It's unlikely. Memory overheating will usually cause artifacts, black screen crashes and/or system shutdowns with fans ramping up. More importantly, when underclocking by 50-100 MHz the problem is gone for most people. This has little to no effect on the memory temperature.

Trust me, this ain't a memory issue. It's the cards not being stable at the insane factory boost speeds, hence underclocking a little makes the problem disappear. Why the cards aren't stable at their factory OC remains unclear to me. But I am positive this isn't a temperature or PSU issue. As a matter of fact, when I set my fan speed to 100% the games crash even quicker, because the low temperature makes the boost clock go even higher, resulting in the crash.


39 minutes ago, cadmachine said:

OK, so this is getting more attention and I'm dealing with a few customers with the issue, so I'd like to get some proper info to take to a rep at Nvidia if needed.

So let's stack the info.

 

https://strawpoll.com/5fpuga3xa

Is it correct that your poll only has one question? If so, I can tell you right now that I am confident that the issue is not caused by PSUs (obviously some people will have issues because of their PSU, but this widespread issue is in most cases not caused by power supplies).

I talked to people who have tried switching to Titanium-certified 1000 W PSUs and still have the issue. I also measured power draw: with the 50 MHz underclock the power draw remains the same but the issue is gone, so that's another indicator that the main cause of this issue does not lie with the PSUs. I hope you can get some info to Nvidia! That'd be amazing.

Also, I tried reaching out to Nvidia on Twitter but have had no success. Feel free to retweet or use the info in the post if you want to!

 

Edited by zarthere

No problem mate, I don't own a 3080 so I can't test it out but after seeing the article on the memory chips thought it might be important for you guys to know about it, if you had not seen it already. What a pain in the arse, hopefully a new driver will sort it out for you guys. I think I read somewhere the media driver seems to be working better than the game ready driver as I believe the media driver does not boost up so high. Hope it's sorted out for you quickly and you can just enjoy your new card without this hassle mate.


1 minute ago, boasty said:

No problem mate, I don't own a 3080 so I can't test it out but after seeing the article on the memory chips thought it might be important for you guys to know about it, if you had not seen it already. What a pain in the arse, hopefully a new driver will sort it out for you guys. I think I read somewhere the media driver seems to be working better than the game ready driver as I believe the media driver does not boost up so high. Hope it's sorted out for you quickly and you can just enjoy your new card without this hassle mate.

Tell me about it. I'm hoping that at the very least AIBs release new BIOS updates to help those who didn't win the silicon lottery. It's telling that I can run my card at default stock clocks (Debug mode) and it runs fine, but the moment I allow the factory overclock to kick in, the crashing begins.


Hi. I don't have an RTX 3080, but I know what this issue is as I have experienced it many times with a 2080 Ti.

  1. Issue: in certain scenes the card will boost to an abnormally high core clock speed while the vcore voltage is not raised correctly, causing a crash to desktop.
  2. Fix 1: download MSI Afterburner and lower the core clock by 25-50 MHz.
  3. Fix 2: download MSI Afterburner and define a custom voltage/core clock curve.

Step 2 or 3 will fix your issue 100%.
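If you go the underclock route, the hunt for the smallest offset that holds can be sketched in Python. This is only a sketch: `is_stable` is a hypothetical callback standing in for you applying the offset in MSI Afterburner and playing a crash-prone game by hand, and the 150 MHz floor is an arbitrary assumption. Nothing here talks to Afterburner or the driver.

```python
from typing import Callable, Optional


def find_smallest_stable_underclock(
    is_stable: Callable[[int], bool],
    step_mhz: int = 25,
    max_underclock_mhz: int = 150,
) -> Optional[int]:
    """Return the smallest underclock (as a negative MHz offset) that is stable.

    is_stable is a hypothetical callback: apply the offset by hand, play for a
    while, and report True if the game did not crash.
    Returns None if nothing down to -max_underclock_mhz is stable.
    """
    for underclock in range(step_mhz, max_underclock_mhz + 1, step_mhz):
        offset = -underclock
        if is_stable(offset):
            return offset
    return None
```

For example, `find_smallest_stable_underclock(lambda off: off <= -50)` returns `-50`. Stepping down gradually like this avoids giving up more clock speed than the card actually needs.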


🤣This gosh damn GPU just hit the market hot off the presses and people are already having issues with it? Man 😬.....

System Specs

  • CPU
    AMD Ryzen 7 5800X
  • Motherboard
    Gigabyte AMD X570 Aorus Master
  • RAM
    G.Skill Ripjaws 32 GBs
  • GPU
    Red Devil RX 5700XT
  • Case
    Corsair 570X
  • Storage
    Samsung SSD 860 QVO 2TB - HDD Seagate Barracuda 1TB - External Seagate HDD 8TB
  • PSU
    G.Skill RipJaws 1250 Watts
  • Keyboard
    Corsair Gaming Keyboard K55
  • Mouse
    Razer Naga Trinity
  • Operating System
    Windows 10

4 hours ago, zarthere said:

Is it correct that your poll only has one question? If so, I can tell you right now that I am confident that the issue is not caused by PSUs (obviously some people will have issues because of their PSU, but this widespread issue is in most cases not caused by power supplies).

It may not be a fault or power draw issue, which you seem to be presupposing, but it might be. It might also be a driver issue with certain power cycling in certain power supplies, or a number of other issues. But the number of people with the complaint who have 750 W PSUs cannot be overlooked in the investigative phase. Both customers I'm dealing with have 750 W PSUs built by different manufacturers, as with the complaints here, on Reddit and on Tom's, and I don't see how you can categorically rule out the power supply because of one anecdotal case.

Don't hate the game, stab the player.


Did you plug each 8-pin connector into a separate port on the PSU?

Use one PSU port for each 8-pin connector on the GPU.
Don't use a single cable that splits into 16 pins (8 + 8).


On 9/20/2020 at 8:01 AM, Rickckk said:

@sindain Bro.. I got the exact same GPU (Zotac Trinity 3080) and the same problem! I was able to solve it by simply underclocking the core clock by 110 MHz! Power consumption set to 90%! Try it! Thank me later.. Not PSU.. not CPU.. not driver.. not any of what was mentioned by the people..

That is not good either.
You buy a GPU, you should get the perfect experience out of the box. Underclocking it means losing performance you paid for.

So if the normal 100% power target isn't working, you have a power problem either in the PSU or on the GPU's PCB. Either way, upgrade the PSU or return the GPU under warranty.

And if the 90% power target doesn't solve it, then the card's clock speed isn't stable: the GPU was shipped to you overclocked by the AIB (Zotac, MSI, Asus, etc.) and it is unstable, so it's their fault. Return it.


2 hours ago, Haris Javed said:

Hi. I don't have an RTX 3080, but I know what this issue is as I have experienced it many times with a 2080 Ti.

  1. Issue: in certain scenes the card will boost to an abnormally high core clock speed while the vcore voltage is not raised correctly, causing a crash to desktop.
  2. Fix 1: download MSI Afterburner and lower the core clock by 25-50 MHz.
  3. Fix 2: download MSI Afterburner and define a custom voltage/core clock curve.

Step 2 or 3 will fix your issue 100%.

Hey, thanks, but this has already been mentioned by multiple users here. Sadly this isn't a good fix, because the cards should work out of the box without having to adjust clock speeds. But yeah, 50-100 MHz underclocks get rid of the crashes.


1 hour ago, cadmachine said:

It may not be a fault or power draw issue, which you seem to be presupposing, but it might be. It might also be a driver issue with certain power cycling in certain power supplies, or a number of other issues. But the number of people with the complaint who have 750 W PSUs cannot be overlooked in the investigative phase. Both customers I'm dealing with have 750 W PSUs built by different manufacturers, as with the complaints here, on Reddit and on Tom's, and I don't see how you can categorically rule out the power supply because of one anecdotal case.

I've talked to many people on Reddit and other forums. Many of them have good 1000 W power supplies, and some of them tried brand new PSUs as well, but the problem remains. So I think it's very unlikely that the main cause of this issue is the PSU. I'd say it's almost certain that the cards or the drivers have some sort of issue. But yeah, we can't really rule out anything until we get some feedback and support from Nvidia and partners. But they remain silent :(


On 9/22/2020 at 6:21 PM, Manju1 said:

Hmm, the strange thing is still that downclocking by 50 MHz resolves the issue. I am definitely not an expert and it still might be the power supply, but a 50 MHz downclock doesn't seem like so much less that the power supply is suddenly enough for the card.

Well, 50 MHz can be a big deal when your frequency is near your chip's frequency limit. My 3600XT is stable at 4475 MHz @ 1.25 V, and only stable at 4500 MHz @ 1.35 V. The extra 25 MHz will bump power draw from ~80 W to ~95 W, push the temperature from 84 °C to beyond 100 °C, and force a reboot.
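As a rough sanity check on those numbers: dynamic chip power is commonly approximated as scaling with frequency times voltage squared. A small Python sketch under that assumption (it ignores leakage and everything else, so treat it as a ballpark, not a model of any particular chip):

```python
def scaled_power(p0_watts: float, f0_mhz: float, v0: float,
                 f1_mhz: float, v1: float) -> float:
    """Estimate power at (f1, v1) from a known point (f0, v0), assuming P ~ f * V^2."""
    return p0_watts * (f1_mhz / f0_mhz) * (v1 / v0) ** 2


# Known point: ~80 W at 4475 MHz and 1.25 V. Target: 4500 MHz at 1.35 V.
estimate = scaled_power(80.0, 4475.0, 1.25, 4500.0, 1.35)
print(f"{estimate:.1f} W")  # prints 93.8 W
```

Almost all of that increase comes from the voltage term, not the extra 25 MHz itself, which is why the last few MHz near the limit are so expensive.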
 

So, 50MHz can be a big deal. You should take note of your current clock, temp, and power draw. Try a few more clock speeds and plot the trend. 
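One way to do that logging on an Nvidia card is to let `nvidia-smi` write a CSV while you play and summarize it afterwards. The Python sketch below assumes the log was produced with `nvidia-smi --query-gpu=clocks.gr,temperature.gpu,power.draw --format=csv,noheader,nounits -l 1`; the 15 MHz bucket size is my assumption about the boost step, and `summarize_by_clock` is just an illustrative helper name.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def summarize_by_clock(log_text: str, bucket_mhz: int = 15) -> dict:
    """Group samples by core clock bucket.

    Expects lines of "clock_mhz, temp_c, power_w" and returns
    {bucket_mhz: (avg_temp_c, avg_power_w)} sorted by clock.
    """
    buckets = defaultdict(list)
    for row in csv.reader(io.StringIO(log_text)):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        try:
            clock, temp, power = (float(x) for x in row)
        except ValueError:
            continue  # skip rows with non-numeric fields such as [N/A]
        bucket = int(clock // bucket_mhz) * bucket_mhz
        buckets[bucket].append((temp, power))
    return {
        b: (mean(t for t, _ in samples), mean(p for _, p in samples))
        for b, samples in sorted(buckets.items())
    }
```

Feed it the contents of the log file and print the result; if power or temperature climbs sharply over the last one or two clock buckets before a crash, that is the trend to look for.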


5 hours ago, zarthere said:

I've talked to many people on Reddit and other forums. Many of them have good 1000 W power supplies, and some of them tried brand new PSUs as well, but the problem remains. So I think it's very unlikely that the main cause of this issue is the PSU. I'd say it's almost certain that the cards or the drivers have some sort of issue. But yeah, we can't really rule out anything until we get some feedback and support from Nvidia and partners. But they remain silent :(

According to FCP, peak current draw can be really high even for a 2080 Ti (over 50 A; that's 600 W on the 12 V rail alone). Now when your whole system is firing up a game, CPU, RAM, drive and GPU are all boosted simultaneously to run it, and the combined power draw could be taking its toll.

There was a user reporting crashes after gaming for 20 min (he uses Zotac and an EVGA 850 W). It could be a result of his PSU getting too hot under such loads to sustain that output.

This requires more testing, as PSUs aren't all the same. There are too many variables in PSU design.

A side note: consider two cases. A) Current fluctuating rapidly between 30 and 50 A, averaging 40 A. B) A constant 40 A draw. Case A is close to a gaming load, while case B is closer to a stress test. You can check reviews that show power consumption versus time during tests. Case A will actually make the power supply/VRM run hotter.
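To put rough numbers on that side note, here is a Python sketch. It assumes case A is an idealized square wave that spends half its time at 30 A and half at 50 A, and that losses in cables and VRM behave like a fixed resistance (heat proportional to the square of the current). Both are simplifications for illustration, not measurements.

```python
import math

RAIL_VOLTS = 12.0


def rms(samples):
    """Root-mean-square of a list of current samples (amps)."""
    return math.sqrt(sum(i * i for i in samples) / len(samples))


case_a = [30.0, 50.0] * 500   # fluctuating, averages 40 A
case_b = [40.0] * 1000        # constant 40 A

peak_watts = max(case_a) * RAIL_VOLTS          # 50 A on a 12 V rail
heat_ratio = (rms(case_a) / rms(case_b)) ** 2  # resistive heating scales with I_rms^2

print(f"peak draw: {peak_watts:.0f} W")
print(f"RMS current, A: {rms(case_a):.2f} A, B: {rms(case_b):.2f} A")
print(f"resistive heating, A relative to B: {heat_ratio:.4f}")
```

Both cases average 40 A, but under these assumptions the fluctuating load dissipates about 6% more heat in anything resistive, because the high-current half counts for more than the low-current half saves.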


MSI just mentioned in their livestream that they are aware and investigating the crashes that people are having. They seem to think it's an NVIDIA driver issue but nothing is confirmed. NVIDIA is also aware.


47 minutes ago, foylema said:

MSI just mentioned in their livestream that they are aware and investigating the crashes that people are having. They seem to think it's an NVIDIA driver issue but nothing is confirmed. NVIDIA is also aware.

Good to know, thanks for the update. Kind of sad they can't communicate this in public - and I especially mean nvidia as they seem to close all negative comments on their subreddit and forum.


3 hours ago, foylema said:

MSI just mentioned in their livestream that they are aware and investigating the crashes that people are having. They seem to think it's an NVIDIA driver issue but nothing is confirmed. NVIDIA is also aware.

Hey thanks for sharing, when was this livestream and could you link it so I can watch it back? 


1 hour ago, zarthere said:

Hey thanks for sharing, when was this livestream and could you link it so I can watch it back? 

I searched it up but it was literally only the sentence @Apington mentioned. On the MSI Main channel.


3 minutes ago, Manju1 said:

I searched it up but it was literally only the sentence @Apington mentioned. On the MSI Main channel.

Is it  this video? 

If so, what time stamp if I may ask? 


On the flip side, NVIDIA has acknowledged the issue and is now looking into it.

Source: Comment from NV_Tim

On the note of that MSI video, I suppose someone can skim over it to see whether they said something, though given the format of the stream I'd be surprised if nobody asked MSI what's going on there. Could someone double-check it, or provide a timestamp please?


100% looks like a PSU issue, power stability.

 

Watching the LTT 3090 review @11 minutes they showed power consumption over 30 minutes. The slopes of the changes in power draw and the power consumption indicate high transients, even if these ramp ups are over a second or two. High transients mean voltage drops. Voltage drops create errors in signalling, hence crashes.

 

A high power (1kW+) bronze unit from a reputable brand would most likely solve this issue if they still use analogue components instead of power electronics. Basically these cards need bigger capacitors, either in the PSU or on the GPU. I would say just solder a high-farad cap across the 12V and 0V rails, but overcurrent protection should detect that and keep the PSU off (plus, it is stupidly dangerous...).
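For a feel of what "bigger capacitors" means: the droop on a capacitor that has to cover a current step on its own is roughly ΔV = I·Δt / C. A Python sketch with illustrative numbers follows. The 20 A step and the 100 µs duration are assumptions picked for the example, not measurements from any card, and the 0.6 V budget is 5% of 12 V, the usual ATX tolerance on that rail.

```python
def capacitance_for_droop(step_amps: float, duration_s: float,
                          max_droop_volts: float) -> float:
    """Capacitance (farads) needed so a current step held for duration_s
    drops the rail by at most max_droop_volts.

    Ignores capacitor ESR and any response from the PSU's regulation loop,
    so it is a worst-case back-of-the-envelope figure.
    """
    return step_amps * duration_s / max_droop_volts


needed = capacitance_for_droop(step_amps=20.0, duration_s=100e-6, max_droop_volts=0.6)
print(f"{needed * 1e6:.0f} uF")
```

The point of the exercise is the scaling: double the step size or the time the PSU needs to react, and the capacitance needed to ride it out doubles too.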

 

The reason why downclocking the card would increase stability is the same reason overclocking should increase stability. You are changing the ramp-up profile from an aggressive, automated model which can change state every 25ms (IIRC) to a less aggressive model designed to make overclocking more stable.

 

This looks like it could 100% be fixed with a driver update, but you will lose performance. I also would not blame AIBs, as this might be due to a driver change after their initial validation testing.

 

tl;dr: High capacity (1kW+), low efficiency, reputable PSU should solve your issues.


4 hours ago, Puss said:

100% looks like a PSU issue, power stability.

 

Watching the LTT 3090 review @11 minutes they showed power consumption over 30 minutes. The slopes of the changes in power draw and the power consumption indicate high transients, even if these ramp ups are over a second or two. High transients mean voltage drops. Voltage drops create errors in signalling, hence crashes.

 

A high power (1kW+) bronze unit from a reputable brand would most likely solve this issue if they still use analogue components instead of power electronics. Basically these cards need bigger capacitors, either in the PSU or on the GPU. I would say just solder a high-farad cap across the 12V and 0V rails, but overcurrent protection should detect that and keep the PSU off (plus, it is stupidly dangerous...).

 

The reason why downclocking the card would increase stability is the same reason overclocking should increase stability. You are changing the ramp-up profile from an aggressive, automated model which can change state every 25ms (IIRC) to a less aggressive model designed to make overclocking more stable.

 

This looks like it could 100% be fixed with a driver update, but you will lose performance. I also would not blame AIBs, as this might be due to a driver change after their initial validation testing.

 

tl;dr: High capacity (1kW+), low efficiency, reputable PSU should solve your issues.

Been there, done that.

On 9/23/2020 at 9:07 PM, Apington said:

Just replaced my Corsair AX850 Titanium (bought for this new lovely gpu - MSI Ventus 3X 10G OC) with a beQuiet Straight Power 11 1200W Platinum.

 

Positive:

- far less coil whine

- 3D Mark got me some points (12 928): https://www.3dmark.com/3dm/50743594?

- 3D Mark stress test was fine (98.6%) https://www.3dmark.com/3dm/50746602?
- temps are good (71 °C)

 

Sadly the negative points are still the same... almost any game I try to launch dies (CTD) within the first 1-2 min at best (e.g. Borderlands 3 and Warzone crash right after the loading screen).

As stated before I can only hope that a new driver can fix this. The benchmarks just seem too good for a defective chip - at least I guess so.

 

Slowly you can find more and more voices / threads about 3080s not running any games - https://wccftech.com/nvidia-geforce-rtx-3080-users-report-crashes-black-screens-during-gaming/ (the last summary I found)
 

If anybody has suggestions or questions feel free to ask, I will provide whatever info I can.

PS: Without any fixes for the next week or so I will have to return it to my reseller. Not paying 700€+ for non functioning hardware.

 


6 hours ago, Puss said:

100% looks like a PSU issue, power stability.

 

Watching the LTT 3090 review @11 minutes they showed power consumption over 30 minutes. The slopes of the changes in power draw and the power consumption indicate high transients, even if these ramp ups are over a second or two. High transients mean voltage drops. Voltage drops create errors in signalling, hence crashes.

 

A high power (1kW+) bronze unit from a reputable brand would most likely solve this issue if they still use analogue components instead of power electronics. Basically these cards need bigger capacitors, either in the PSU or on the GPU. I would say just solder a high-farad cap across the 12V and 0V rails, but overcurrent protection should detect that and keep the PSU off (plus, it is stupidly dangerous...).

 

The reason why downclocking the card would increase stability is the same reason overclocking should increase stability. You are changing the ramp-up profile from an aggressive, automated model which can change state every 25ms (IIRC) to a less aggressive model designed to make overclocking more stable.

 

This looks like it could 100% be fixed with a driver update, but you will lose performance. I also would not blame AIBs, as this might be due to a driver change after their initial validation testing.

 

tl;dr: High capacity (1kW+), low efficiency, reputable PSU should solve your issues.

Hey, people have already tried new and better PSUs though. I have a Corsair HX850 v2 Platinum. I've talked to people who tried switching to 1000+ W Titanium PSUs without it solving the issue. I really doubt it's a PSU-side issue.

Also, when I measured usage and stability, I saw wattage sit stable between 320 and 330 W while the voltage was switching every few seconds or so between 1 V and 1.1 V. Is this switching of up to 0.1 V not normal? Or is it actually an issue that isn't supposed to happen at all, meaning these 0.1 V switches cause the crashes?

EDIT: Oh, I see that you said LOW EFFICIENCY PSUs can be a fix as they use different components. Wouldn't that mean the issue lies within the power stages of the GPU, as the PSUs are fine?

Edited by zarthere

I'm running a Gigabyte 3080 Gaming OC with a 1000 W PSU and I'm also getting crashes in No Man's Sky. It seems like that's the only game doing it at the moment. My card boosts to 2010 MHz (it's advertised as 1800 MHz). I had to bump it down by 125 MHz for NMS not to crash. However, I tried RDR2 and it works fine at 2010 MHz.
