rtx 3080 crashing possibly due to capacitor choice

spartaman64
10 hours ago, RejZoR said:

Rubbish. 

You say "rubbish" but most of your post seems to be agreeing with what I said? Can you please clarify what you're rubbishing?

 

- That the minimum standards set out by NVIDIA are for the clock speeds in the base specification and not "whatever arbitrary clock speed random AIB #1 wants to chase"? 

- That AIBs are responsible for setting boost clock speeds for their cards?

- That AIBs seek higher headline boost clock speeds because they sell more cards if they perform better or are perceived to do so?

- That some AIBs have used cheaper power regulation designs than others, or NVIDIA?

- That instability under certain conditions when boosting should have been identified before the cards hit the market?

[ P R O J E C T _ M E L L I F E R A ]

[ 5900X @4.7GHz PBO2 | X570S Aorus Pro | 32GB GSkill Trident Z 3600MHz CL16 | EK-Quantum Reflection ]
[ ASUS RTX4080 TUF OC @3000MHz | O11D-XL | HardwareLabs GTS and GTX 360mm | XSPC D5 SATA ]

[ TechN / Phanteks G40 Blocks | Corsair AX750 | ROG Swift PG279Q | Q-Acoustics 2010i | Sabaj A4 ]

 

P R O J E C T | S A N D W A S P

6900K | RTX2080 | 32GB DDR4-3000 | Custom Loop 


1 hour ago, HM-2 said:

You say "rubbish" but most of your post seems to be agreeing with what I said? Can you please clarify what you're rubbishing?

 

- That the minimum standards set out by NVIDIA are for the clock speeds in the base specification and not "whatever arbitrary clock speed random AIB #1 wants to chase"? 

- That AIBs are responsible for setting boost clock speeds for their cards?

- That AIBs seek higher headline boost clock speeds because they sell more cards if they perform better or are perceived to do so?

- That some AIBs have used cheaper power regulation designs than others, or NVIDIA?

- That instability under certain conditions when boosting should have been identified before the cards hit the market?

NVIDIA's own boosting system doesn't stay ANYWHERE close to the advertised boost clock; it runs far higher. The advertised figure is just some arbitrary guaranteed clock that all cards will achieve unless you literally stuff socks into the fans... Or you happen to run your PC inside an oven while you're baking a pizza...

 

To me the most confusing part is why cards suddenly decide to bump the clock well beyond what they were running for the last hour and then crash. Not one report said the clock was steadily going up and then it crashed, or was sitting at the same level and crashed. It's always a sudden, illogical spike that murders it. To me that suggests something is really wrong in the boosting algorithm, and paired with the crappy caps it causes all these problems. I don't see any other reason why the clock would suddenly spike like that.


2 hours ago, RejZoR said:

NVIDIA's own boosting system doesn't stay ANYWHERE close to the advertised boost clock; it runs far higher. 

By "NVIDIA's own boosting system" are you referring to boost on FE cards, or the logic within the drivers which enables boosting in the first place? My understanding was that, in the case of the latter, how far cards can boost at standard out-of-the-box clocks (i.e. no manual overclocking) is defined by set values within the card's vBIOS. If cards are becoming unstable before hitting these points, then the barrier needs to be lowered.

 

If I've completely misunderstood or misappreciated how GPU boost works then please correct me 


23 minutes ago, HM-2 said:

By "NVIDIA's own boosting system" are you referring to boost on FE cards, or the logic within the drivers which enables boosting in the first place? My understanding was that, in the case of the latter, how far cards can boost at standard out-of-the-box clocks (i.e. no manual overclocking) is defined by set values within the card's vBIOS. If cards are becoming unstable before hitting these points, then the barrier needs to be lowered.

 

If I've completely misunderstood or misappreciated how GPU boost works then please correct me 

idk about nvidia but many of the AIB cards boost beyond 2000MHz on factory settings, which seems to be the problem point for most of them


Tech Jesus weighs in on the crashing issues

 

 

TL;DW - Neither the gimped drivers given to AIBs nor NVIDIA's black box testing system caught either the boost behavior issue or the capacitor issue. Partly the fault of the AIBs for rushing to market, but just as much the fault of NVIDIA for poor spec requirements and handling of drivers for testing.

 

Addendum - Apparently the new drivers solve the capacitor issue.


On 9/29/2020 at 6:06 PM, ravenshrike said:

Tech Jesus weighs in on the crashing issues

 

 

TL;DW - Neither the gimped drivers given to AIBs nor NVIDIA's black box testing system caught either the boost behavior issue or the capacitor issue. Partly the fault of the AIBs for rushing to market, but just as much the fault of NVIDIA for poor spec requirements and handling of drivers for testing.

 

Addendum - Apparently the new drivers solve the capacitor issue.

did he actually...test it...?

 

and is there a performance loss with the new drivers?

 

(just curious but that's the important questions imo)

The direction tells you... the direction

-Scott Manley, 2021

 

Softwares used:

Corsair Link (Anime Edition) 

MSI Afterburner 

OpenRGB

Lively Wallpaper 

OBS Studio

Shutter Encoder

Avidemux

FSResizer

Audacity 

VLC

WMP

GIMP

HWiNFO64

Paint

3D Paint

GitHub Desktop 

Superposition 

Prime95

Aida64

GPUZ

CPUZ

Generic Logviewer

 

 

 


@Mark Kaine

If NVIDIA fixed the boosting logic, there should be no loss in performance. From what I could gather, GPU Boost gets overzealous at a random point and instantly boosts to an absurd clock past 2GHz; the power delivery fails to meet that demand, gives up on life, and the card crashes. That's why everyone experiencing it and testing with MSI Afterburner logs a massive clock spike just before the crash. So if NVIDIA fixed the boosting via the driver, you won't be missing any performance: it'll still boost to the same clocks as before the spike, it just won't do the spike anymore. That's the preferred solution, because manually decreasing the GPU clock means the card runs below advertised specs across the board, even at normal boost, and might hover around 1900MHz instead of the 1950MHz it does now. One may say it's insignificant, but most tests are done at around 1950MHz, which is the most common boost clock on average, so why be sold on a fake promise?

I hope it's just this boosting error, which should mean even cards with 6 large black caps are perfectly fine. They weren't before because that spike just rammed them into the ground regardless, while overbuilt cards soaked up the spike better and carried on despite the boosting error. That's my theory, but I can't test any of it since I don't have an RTX 3080; I only know how my GTX 1080Ti behaves...
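
For anyone who wants to check their own logs for the "massive clock spike just before the crash" pattern, here is a rough Python sketch. It assumes you have already pulled the core clock column out of your monitoring log into a plain list of MHz values; the function name, the window size, and the 50MHz threshold are all made up for illustration, and the sample numbers are invented, not measurements from any real card.

```python
from statistics import mean
from typing import List, Sequence


def find_clock_spikes(
    clocks_mhz: Sequence[float],
    window: int = 5,
    threshold_mhz: float = 50.0,
) -> List[int]:
    """Return indices of samples that jump above the recent average.

    A sample counts as a spike when it exceeds the mean of the previous
    `window` samples by more than `threshold_mhz`. Both parameters are
    arbitrary illustrative defaults, not values from NVIDIA or any AIB.
    """
    spikes = []
    for i in range(window, len(clocks_mhz)):
        baseline = mean(clocks_mhz[i - window:i])
        if clocks_mhz[i] - baseline > threshold_mhz:
            spikes.append(i)
    return spikes


# Invented samples for illustration only: steady around 1950MHz,
# then one sudden jump in the last sample.
sample_log = [1950, 1950, 1965, 1950, 1935, 1950, 1950, 2085]
print(find_clock_spikes(sample_log))
```

With the invented samples above, only the last entry is flagged. A steady climb or a flat clock before a crash would not be flagged, which is the distinction being argued about in this thread.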


2 hours ago, Mark Kaine said:

Snip

 

1 hour ago, RejZoR said:

Snip

Clock seems to be lowered a bit.

https://mobile.twitter.com/aschilling/status/1310839749385613312

https://www.pcworld.com/article/3583894/nvidia-fix-rtx-3080-crashes-new-drivers-clock-speed.html

"As I said, with the original drivers, the HZD benchmark ran at a mostly consistent 2010MHz on this GPU until it hit the 2025MHz wall and crapped out. With Nvidia’s new 456.55 drivers, the GPU clock speed in Horizon Zero Dawn now flitters between 1980MHz and 1995MHz (though it can still hit a stable 2010MHz in menu screens). Nvidia’s fix appears to be dialing back maximum performance with the GPU Boost feature."


5 minutes ago, RejZoR said:

If that's the case, then that sucks.

I guess they made a universal fix, so that consumers don't need to deal with vBIOS update on select boards.

The difference in MHz, though, shouldn't make any notable performance difference. Maybe a 1fps difference... and even then, that's within margin of error.

 

But yes, it still sucks, as technically it would indicate that select GPUs (max boost is based on the chip's abilities... this may explain why some people don't have the issue, as their GPU never had the capability to clock high enough to fail) could go higher, but now can't without OC'ing.


Margin of error difference. Ok then. The question now is, does it really fix the crashing for those who were affected (the TUF here allegedly wasn't affected)? If it does, then job well done NVIDIA. Driver fixes are always better than VBIOS updates coz people avoid flashing, not to mention most people don't even know how.


11 hours ago, GoodBytes said:

The difference in MHz, though, shouldn't make any notable performance difference. Maybe a 1fps difference... and even then, that's within margin of error.

Going by an earlier quote it is up to about 2% lower clock, so if the workload is entirely core limited (not vram speed or CPU impacted at all) that's the most difference we could expect. Ideally we need average clocks not peak since the boost curve may have changed, so the actual difference may be less than that. 
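
To put numbers on that, here is a quick Python sketch using only the clock figures from the PCWorld quote earlier in the thread (2010MHz sustained and a 2025MHz peak on the old driver, 1980MHz to 1995MHz on 456.55). The helper name is made up, and treating the clock drop as the upper bound on performance loss is the fully core-limited assumption described above, not a measurement.

```python
# Clock figures taken from the PCWorld quote earlier in the thread.
OLD_PEAK_MHZ = 2025        # old driver, the "wall" where the run crashed
OLD_SUSTAINED_MHZ = 2010   # old driver, mostly consistent clock
NEW_LOW_MHZ = 1980         # 456.55 driver, lower end of observed range
NEW_HIGH_MHZ = 1995        # 456.55 driver, upper end of observed range


def percent_drop(before_mhz: float, after_mhz: float) -> float:
    """Return the relative clock reduction in percent."""
    return (before_mhz - after_mhz) / before_mhz * 100.0


# Sustained old clock vs. the new range.
worst_case = percent_drop(OLD_SUSTAINED_MHZ, NEW_LOW_MHZ)
best_case = percent_drop(OLD_SUSTAINED_MHZ, NEW_HIGH_MHZ)
# Old peak vs. the bottom of the new range (peak-to-trough comparison).
peak_case = percent_drop(OLD_PEAK_MHZ, NEW_LOW_MHZ)

print(f"sustained vs new low:  {worst_case:.2f}% lower clock")
print(f"sustained vs new high: {best_case:.2f}% lower clock")
print(f"old peak vs new low:   {peak_case:.2f}% lower clock")
```

That works out to roughly 1.5% and 0.75% against the sustained clock, and about 2.2% if you compare the old peak to the new low, which is where the "up to about 2%" comes from. Average clocks over a whole run would still be the better comparison.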

Main system: i9-7980XE, Asus X299 TUF mark 2, Noctua D15, Corsair Vengeance Pro 3200 3x 16GB 2R, RTX 3070, NZXT E850, GameMax Abyss, Samsung 980 Pro 2TB, Acer Predator XB241YU 24" 1440p 144Hz G-Sync + HP LP2475w 24" 1200p 60Hz wide gamut
Gaming laptop: Lenovo Legion 5, 5800H, RTX 3070, Kingston DDR4 3200C22 2x16GB 2Rx8, Kingston Fury Renegade 1TB + Crucial P1 1TB SSD, 165 Hz IPS 1080p G-Sync Compatible


On 9/29/2020 at 12:09 PM, RejZoR said:

To me the most confusing part is why cards suddenly decide to bump the clock well beyond what they were running for the last hour and then crash. Not one report said the clock was steadily going up and then it crashed, or was sitting at the same level and crashed. It's always a sudden, illogical spike that murders it. To me that suggests something is really wrong in the boosting algorithm, and paired with the crappy caps it causes all these problems. I don't see any other reason why the clock would suddenly spike like that.

No. The clock spike is not the reason for the crash but a mere symptom. A supply transient kicks the card out of a stable state. We don't know the cause yet. Maybe the temperature or power measurement glitches and the card thinks it can boost much higher? Maybe some parts of the chip switch off due to the crash and the rest of the chip takes the additional power to reach a new high? It could be anything.

 

BTW: der8auer did test different caps and there is a correlation to maximum stable clocks:

 


 


Some points in this video I found interesting:

According to Gear Seekers, a 3080 MIB model with MLCC that crashed in Windows never crashed in Linux while hitting the same boost clocks. It's very likely to have been a driver issue in Windows from the beginning.
Also, according to Hardware Unboxed, their 3080 saw the same boost clocks with the new driver as with the old driver that did crash. 
And lastly, Linux has not received a driver update while Windows has, further indicating it was probably a driver issue.


But why the lowered clocks in the new driver, then? That's the weird part, especially if it's hitting higher ones in Linux without crashing.


1 hour ago, Medicate said:

According to Gear Seekers, a 3080 MIB model with MLCC that crashed in Windows never crashed in Linux while hitting the same boost clocks. It's very likely to have been a driver issue in Windows from the beginning.
Also, according to Hardware Unboxed, their 3080 saw the same boost clocks with the new driver as with the old driver that did crash. 
And lastly, Linux has not received a driver update while Windows has, further indicating it was probably a driver issue.

Stability depends not only on the clock but also on utilization of the cores. Higher utilization will reduce the stable clock speed. This is also true for AMD and Intel CPUs: if you're hitting them hard, they won't be able to boost as high. I highly doubt utilization is equal on Linux and Windows, thus the small difference in clock speed.

34 minutes ago, RejZoR said:

But why the lowered clocks in the new driver, then? That's the weird part, especially if it's hitting higher ones in Linux without crashing.

They didn't. Watch this video:

 


2 hours ago, Medicate said:

 


Some points in this video I found interesting:

According to Gear Seekers, a 3080 MIB model with MLCC that crashed in Windows never crashed in Linux while hitting the same boost clocks. It's very likely to have been a driver issue in Windows from the beginning.
Also, according to Hardware Unboxed, their 3080 saw the same boost clocks with the new driver as with the old driver that did crash. 
And lastly, Linux has not received a driver update while Windows has, further indicating it was probably a driver issue.

idk, it could just be that the drivers for Linux aren't as efficient, so it can't use the GPU to its full potential 


Funnily enough, in the past I couldn't get my GTX 1080Ti to run stable at 2GHz. It would always go bad when it hit the 2GHz mark. Now I'm using the same settings I tried before and it'll boost to 2GHz (using +75MHz and 50% voltage with all limits raised to max) and stay there for the entire duration of gaming in, let's say, Killing Floor 2, where before it mostly hovered around 1950MHz. If I pushed it higher it always crashed. I wonder if I can push it past 2GHz now... I find it hard to believe this driver would also affect the GTX 1080Ti, since it's 2 generations old and I don't think NVIDIA would care to fiddle with it. Unless the GPU Boost modification affects all cards across the board that use the GPU Boost mechanism.


The only time when you wish miners and bots would buy all of them, so you can get the 2nd batch that has the problem fixed.

Intel Xeon E5 1650 v3 @ 3.5GHz 6C:12T / CM212 Evo / Asus X99 Deluxe / 16GB (4x4GB) DDR4 3000 Trident-Z / Samsung 850 Pro 256GB / Intel 335 240GB / WD Red 2 & 3TB / Antec 850w / RTX 2070 / Win10 Pro x64

HP Envy X360 15: Intel Core i5 8250U @ 1.6GHz 4C:8T / 8GB DDR4 / Intel UHD620 + Nvidia GeForce MX150 4GB / Intel 120GB SSD / Win10 Pro x64

 

HP Envy x360 BP series Intel 8th gen

AMD ThreadRipper 2!

5820K & 6800K 3-way SLI mobo support list

 
