
Gigabyte Aorus Gaming OC 4090 borked on arrival?

I purchased the Aorus RTX 4090 Gaming OC Edition from Gigabyte and installed it in my system. I was previously running a 3090 with no issues.

 

MSI Z590 Pro WiFi CEC

i7-11700K

80GB system memory 

Seasonic 1000W Gold

Windows 10 Home

 

Day 1, I installed the latest driver from Nvidia and played around with Blender to see the performance gains. It was magical. Next, I loaded up Cyberpunk and played for about 5-6 hours straight with every setting maxed. No issues. I closed everything out and went to sleep. When I came back to the PC after waking up, I started getting driver crashes in Blender: I could use Cycles only briefly before it killed the driver. No freezes or BSODs.

 

I tried Cyberpunk again and it would crash to desktop (CTD). The CTDs would happen at the legal screen, at the rendered menu, or after about 2 minutes of gameplay.

 

Things I tried: 

JayzTwoCents' DDU method: I restarted my PC into safe mode and used DDU to clean the drivers completely off my PC. I disabled automatic Windows driver installation. I booted up and did a clean install of a manually downloaded copy of the latest 522.25 driver from Nvidia. Rebooted again, started Cyberpunk, CTD'd.

 

I checked the event log and found that the same error popped up at every CTD:

 

"nvlddmkm" 

 

I tried changing the permissions on this file to full access and running DDU again. Crash in game.

 

I tried running sfc /scannow.

It scanned the system files and replaced the corrupted ones.

 

Crash in game. 

 

I tried reinstalling Windows while keeping my user files.

 

Ran DDU again. Crash in game. 

 

Updated the motherboard BIOS.

 

Tried different clock speeds and power limits. 

 

Tried both of the BIOS options on the card.

 

I swapped the 3090 back in and everything runs beautifully. 
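
Side note on the nvlddmkm entries mentioned above: here is a minimal Python sketch for pulling them out of the log without clicking through Event Viewer. It assumes the entries land in the System log under the provider name nvlddmkm, and it relies on the built-in wevtutil tool. The function names and the default of 10 entries are just picks for illustration, not anything from Nvidia or Gigabyte.

```python
import platform
import subprocess


def build_event_query(provider: str = "nvlddmkm", count: int = 10) -> list[str]:
    """Build a wevtutil command that lists the newest System-log events from one provider."""
    # XPath filter on the event provider name (assumed to be the driver name from the post).
    xpath = f"*[System[Provider[@Name='{provider}']]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",   # filter by provider
        f"/c:{count}",   # limit the number of events returned
        "/rd:true",      # newest events first
        "/f:text",       # human-readable output
    ]


def recent_driver_events(count: int = 10) -> str:
    """Run the query on Windows; return an empty string on any other OS."""
    if platform.system() != "Windows":
        return ""
    result = subprocess.run(
        build_event_query(count=count),
        capture_output=True, text=True, check=False,
    )
    return result.stdout
```

Calling print(recent_driver_events()) on the affected machine should dump the most recent entries, including their event IDs, which may be easier to compare between crashes than scrolling the log by hand.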

 

I called Gigabyte Tech Support, and the rep said they don't even have technical documentation for the card yet. She tried helping me, but said she was stuck googling for answers at this point.

 

I want to avoid RMA if it's solvable, but I also don't think I should be going through this much hell with a $1700 GPU straight out of its box. 

Feel free to chime in if you've had similar issues or know what might be going on.

I'm at a loss.

 


3 minutes ago, s373ns said:

$1700 GPU straight out of its box.

Keep in mind you are the QA at the moment as well as a beta tester for the new product lineup. That shows doubly because Gigabyte doesn't have any form of support set up yet.

The steps you took were correct, and if it's still having issues then the card is very likely defective.

However, which Windows version are you running?


4 minutes ago, jaslion said:

Keep in mind you are the QA at the moment as well as a beta tester for the new product lineup. That shows doubly because Gigabyte doesn't have any form of support set up yet.

The steps you took were correct, and if it's still having issues then the card is very likely defective.

However, which Windows version are you running?

Windows 10, latest update with that fresh install I did. My last resort is to shove it in my co-worker's system and see how it behaves. I know I shouldn't be mad about early adopter problems, because I made that decision to buy at the floodgates, but I have not found anyone else online with this much trouble with one of the cards yet. 


9 minutes ago, s373ns said:

Windows 10, latest update with that fresh install I did. My last resort is to shove it in my co-worker's system and see how it behaves. I know I shouldn't be mad about early adopter problems, because I made that decision to buy at the floodgates, but I have not found anyone else online with this much trouble with one of the cards yet. 

Yours might basically be a DOA unit that had a blip of life left.

If your coworker's system can handle it, try it. If it works well there, it's time to start evaluating everything else in your own build.


7 minutes ago, jaslion said:

Yours might basically be a DOA unit that had a blip of life left.

If your coworker's system can handle it, try it. If it works well there, it's time to start evaluating everything else in your own build.

It can still play titles like Rainbow Six Siege, so it's not dead-dead. It's just... the RT cores or something? Or some sort of power issue. I have no clue. It's inconsistent in how it crashes, and it only happens in games with ray tracing.


3 hours ago, s373ns said:

It can still play titles like Rainbow Six Siege, so it's not dead-dead. It's just... the RT cores or something? Or some sort of power issue. I have no clue. It's inconsistent in how it crashes, and it only happens in games with ray tracing.

Try a benchmark like Unigine Superposition and other heavy ones that don't use the RT cores.

 

 


It sounds like your card is borked. It might not hurt to reinstall the drivers, though: from what I have heard, the 40-series drivers are a little unstable on some versions of Windows 10.


Just to chime in, I'm having the exact same issue, except I went so far as to reformat. It doesn't matter whether the card is under load; it futzes out even on the desktop.

 

Aorus Xtreme x570

5950X

4x32GB G.Skill

Seasonic TX-1000

 

et al


9 hours ago, bradcool22 said:

It sounds like your card is borked. It might not hurt to reinstall the drivers, though: from what I have heard, the 40-series drivers are a little unstable on some versions of Windows 10.

I mean, I think I've reinstalled drivers at least 15 times now. Clean install on them, too. 


8 hours ago, Camm said:

Just to chime in, I'm having the exact same issue, except I went so far as to reformat. It doesn't matter whether the card is under load; it futzes out even on the desktop.

 

Aorus Xtreme x570

5950X

4x32GB G.Skill

Seasonic TX-1000

 

et al

It's so frustrating, man. 


I've done some more testing on my side.

 

I can reliably cause this fault with 3DMark's DirectX Raytracing Feature Test. Interestingly, it's not during the benchmark, but once the benchmark is complete. This, along with the error, indicates to me that the card is either waiting for a state change, trying to do something but failing at the state change, or hitting a 2D clock issue.

The only things I haven't tried at this point are a different system and a different PSU. I just don't think it's a power issue. Maybe RGB or something, but I don't have anything running to control that.

 

Anyway, posted for reference for anyone else having this problem.


@s373ns - found the issue, at least for me. It's a Windows power setting called PCI Express\Link State Power Management. I turned that to Off and haven't had the issue since.

You can also:

 

A: Set the Windows power plan to High Performance (from Balanced).
B: In the Nvidia Control Panel under Manage 3D settings, set Power management mode to Prefer maximum performance.
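
For anyone who would rather script the Link State Power Management change than dig through the Power Options dialog, here is a rough Python sketch. It assumes the standard powercfg aliases SCHEME_CURRENT, SUB_PCIEXPRESS and ASPM, with value index 0 meaning Off; the function names are made up for illustration, and changing the setting may need an elevated prompt.

```python
import platform
import subprocess


def aspm_off_commands(scheme: str = "SCHEME_CURRENT") -> list[list[str]]:
    """Return powercfg commands that set PCI Express Link State Power Management to Off."""
    commands = []
    # Apply the same value for both plugged-in (AC) and battery (DC) operation.
    for flag in ("/setacvalueindex", "/setdcvalueindex"):
        commands.append(["powercfg", flag, scheme, "SUB_PCIEXPRESS", "ASPM", "0"])
    # Re-activate the scheme so the changed values take effect.
    commands.append(["powercfg", "/setactive", scheme])
    return commands


def apply_commands(commands: list[list[str]]) -> bool:
    """Run the commands on Windows; do nothing and return False on any other OS."""
    if platform.system() != "Windows":
        return False
    for cmd in commands:
        subprocess.run(cmd, check=True)
    return True
```

Running apply_commands(aspm_off_commands()) only touches the active power plan, so the setting would need to be reapplied after switching plans.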

 

Cheers


Made an account to say @Camm thank you for posting your solution - I had the same issue, and, fingers crossed, that may have fixed it. It would explain why my computer was rock solid during any benchmark, but would crash just as I logged into Windows or such.

I'm curious how you managed to figure this out: just trying various things, or does Windows have any logs available that help with that kind of thing?

edit: Interestingly, for me, this _may_ have started after updating various drivers and the BIOS (to get XMP to POST; I first suspected that XMP instability was the issue, but turning XMP back off changed nothing). I can't be sure it wasn't an issue before: I built the computer, turned it on, and basically had it permanently doing renders in Blender because I was so excited about the speed, so obviously it was never going to leave its maximum power state during that, and I updated the BIOS afterwards. But it could be part of it. My motherboard is an Aorus X670. The BIOS update has something to the tune of "Fix NVIDIA Geforce RTX 4090 PCIe down speed issue" in its notes, so that's interesting.


On 10/24/2022 at 9:15 AM, h4lcy said:

Made an account to say @Camm thank you for posting your solution - I had the same issue, and, fingers crossed, that may have fixed it. It would explain why my computer was rock solid during any benchmark, but would crash just as I logged into Windows or such.

I'm curious how you managed to figure this out: just trying various things, or does Windows have any logs available that help with that kind of thing?

edit: Interestingly, for me, this _may_ have started after updating various drivers and the BIOS (to get XMP to POST; I first suspected that XMP instability was the issue, but turning XMP back off changed nothing). I can't be sure it wasn't an issue before: I built the computer, turned it on, and basically had it permanently doing renders in Blender because I was so excited about the speed, so obviously it was never going to leave its maximum power state during that, and I updated the BIOS afterwards. But it could be part of it. My motherboard is an Aorus X670. The BIOS update has something to the tune of "Fix NVIDIA Geforce RTX 4090 PCIe down speed issue" in its notes, so that's interesting.

As a quick update for anyone who might be having similar issues: while Gigabyte hasn't commented on it anywhere I can see, they seem to have taken down the F6 BIOS versions from the download page at https://www.gigabyte.com/Motherboard/X670-AORUS-ELITE-AX-rev-10/support . Not sure if that indicates there actually was an issue, or whether they just forgot when updating the pages. I asked them about it on Reddit, but they haven't responded.


On 10/24/2022 at 5:15 PM, h4lcy said:

Made an account to say @Camm thank you for posting your solution - I had the same issue, and, fingers crossed, that may have fixed it. It would explain why my computer was rock solid during any benchmark, but would crash just as I logged into Windows or such.

I'm curious how you managed to figure this out: just trying various things, or does Windows have any logs available that help with that kind of thing?

 

I noticed an error code other than the usual 0 error code popping up in Event Viewer. That led me down a path showing that previous Nvidia cards had issues with power management (usually right after launch, it seems). From general knowledge over the years, I know there are four things that control video card power: the Nvidia drivers, the Windows PCI Express link setting, the Windows power setting, and the UEFI options that control PCI Express power.
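
To watch what the card is doing around those power-state changes, a rough Python sketch like the one below may help. It assumes nvidia-smi is on the PATH and that its pstate, pcie.link.gen.current and pcie.link.width.current query fields are available for the card; the function names are hypothetical, and the parser assumes the GPU name contains no commas.

```python
import shutil
import subprocess

# Fields requested from nvidia-smi: performance state plus current PCIe link generation and width.
FIELDS = ["name", "pstate", "pcie.link.gen.current", "pcie.link.width.current"]


def parse_row(line: str) -> dict[str, str]:
    """Split one CSV row of nvidia-smi output into a field -> value mapping."""
    values = [value.strip() for value in line.split(",")]
    return dict(zip(FIELDS, values))


def read_gpu_state() -> list[dict[str, str]]:
    """Query nvidia-smi if it is installed; return an empty list otherwise."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={','.join(FIELDS)}", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    return [parse_row(line) for line in result.stdout.splitlines() if line.strip()]
```

Calling read_gpu_state() at idle and again under load would show whether the performance state and PCIe link actually change, which is the transition suspected here.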


Hi, I'm having a similar issue with the same 4090.

Driver 522 was causing my monitors to turn off, and after a while the system rebooted. This would occur anywhere from 5-6 times in 10 minutes to once every 5 days.

Driver 526 fixed this but introduced a new problem: now I get a BSOD, and the cause of the crash is "DPC Watchdog Violation", which CAN be caused by GPU driver issues with the hardware. (I know it's my card because I'd been using my "old" 3080 for 2 years and never had this problem.)

Going to try what @Camm suggested about the PCIe power config. Will report back on whether it works. I'll give it a week or so because some crashes have been 5 days apart.

Also, if a new driver releases then we all know what to do... but I don't have much hope since the last one was only released a few days ago.


5 minutes ago, Kratos said:

Hi, I'm having a similar issue with the same 4090.

Driver 522 was causing my monitors to turn off, and after a while the system rebooted. This would occur anywhere from 5-6 times in 10 minutes to once every 5 days.

Driver 526 fixed this but introduced a new problem: now I get a BSOD, and the cause of the crash is "DPC Watchdog Violation", which CAN be caused by GPU driver issues with the hardware. (I know it's my card because I'd been using my "old" 3080 for 2 years and never had this problem.)

Going to try what @Camm suggested about the PCIe power config. Will report back on whether it works. I'll give it a week or so because some crashes have been 5 days apart.

Also, if a new driver releases then we all know what to do... but I don't have much hope since the last one was only released a few days ago.

Well...that didn't age too well...xD

Seems like I already had it set to Off because my Windows power settings are set to maximum performance. To add more irony, it crashed just as I was about to close the window after checking 😂

So with the new 526 driver, the fix didn't work for me.

Guess we'll have to wait for new drivers, since there are only 2 drivers compatible with these cards so far.


Hey, I'm just here to link to another thread that has a similar issue which may help in debugging, especially for @Kratos

 

 


FWIW: I still had (much rarer) issues with the card, never on reboot, but generally they'd start after the system had gone to sleep at least once. Driver resets, bus resets, sometimes proper hangs like before.

I then downgraded my motherboard BIOS to a previous version and swapped to the Studio driver. Zero issues since, sleep or not, over the last 4 days.


  • 2 weeks later...

Guys,

 

I updated to the new W10 build, the newest BIOS (I have an MSI X299 motherboard) and also the newest drivers. 9 days so far without any crashes.

I can't determine whether it was the new W10 build, the new drivers or the new BIOS, but previously 6 days was the longest I had gone without crashes, and now it's at 9 days.

My bet is probably the new BIOS or the new drivers.

Everyone, try updating everything and see if it works.
