
Erratic crashes on RTX 4080, GPU frequency drops seconds before hard shutdown.

Hey all,

 

I'm at my wits' end with a system. We've been experiencing erratic crashes in both intensive and non-intensive software, with crashes occurring up to an hour after starting a game like Total War: Warhammer 3 or Helldivers 2 (though given the other problems, I'm taking Total War as the better indicator).

 

Initially we were also experiencing issues loading games like RimWorld, where the system would hard shut down while just booting the game. The GPU clock could be seen spiking from 200MHz to 2590MHz. That stopped when we disabled Wallpaper Engine and the Corsair services (beats me), which also stopped the issues with RimWorld. We're now only experiencing problems with more intensive games, so part of me thought load or thermals, but the temperature sensors report normal.

 

System specs:

Motherboard: Gigabyte X670 AORUS Elite AX
CPU: Ryzen 9 7950X3D
CPU Cooler: Corsair iCUE H150i RGB ELITE
RAM: Corsair Vengeance RGB 64GB (4x16) DDR5 6000MHz
Graphics Card: MSI GeForce RTX 4080 16GB GAMING X TRIO
SSD: WD Black SN850X 2TB
Power Supply: Corsair HX1000i (using the 2x8pin to 12VHPWR cable included)
Case: Corsair iCUE 5000X
Case Fans: Arctic F12 PWM (1 in rear)

Airflow: 3 in front pulling air in for GPU and RAD, 3 in top exhausting through rad, 1 in rear exhausting VRM heat.

 

Problem Games:

- Helldivers 2 - crashes around 10-20 minutes into the game; the game hovers around 144FPS at 1440p Ultra while running. I'm hoping whatever solves Warhammer will solve this too; if not, based on reports on Reddit and the like, I'm tempted to say it's a game problem.

- Total War: Warhammer 3 - crashes around 1-2 hours in; the game ran at 144FPS at 1440p Ultra and was completely stable for the entire time before the crash.

 

Benchmark scores:

PassMark Performance Test 11.0 - Overall 14133.4 (99th percentile)

CPU Mark: 52132.2 (99th Percentile)

2D Graphics Mark: 983.1 (80th Percentile)

3D Graphics Mark: 30967.5 (97th Percentile)

Memory Mark: 3204.0 (74th Percentile)

Disk Mark: 56118.9 (99th Percentile)

 

System survived Prime95 and Furmark stress test with no issue.

 

What we've tried:

- Disabling iCUE & Wallpaper Engine - as mentioned above, we saw the GPU spiking both when a Wallpaper Engine wallpaper would reload and every few seconds with iCUE running. Disabling these appeared to stop this behaviour, and the system now remains stable under non-intensive loads.

- DDUing and reinstalling the latest graphics drivers - no effect. We'd noted an Event Viewer error regarding nvlddmkm failing to find an Event ID for the GPU, which prompted the driver reinstall, but no change appeared to occur. A screenshot of the Event Viewer log is below.

- Turning off XMP/EXPO and letting the RAM go back down to defaults - we had the RAM clocked at 5200MHz with the recommended EXPO power and timing settings; we couldn't hit 6000MHz, I'm assuming because there are 4 sticks.

- Underclocking the GPU - we tried a 100MHz underclock on the GPU and VRAM, as well as capping the power limit to 80%. My thinking was that setting a limit in Afterburner might prevent the system spiking the power, if that were the problem. No change in behaviour.

- Running DISM and SFC - these reported damaged components and were able to repair them, but whether that actually did anything, I don't know.

- Validating Helldivers and Total War on Steam - some files needed to be reacquired, but only a very small number.

- Resetting the BIOS altogether - we went back into the BIOS, re-enabled PBO, turned off IOMMU (bad vibes), and booted back up into Windows. All other settings remained at Optimized Defaults.

 

 

What we've noted:

- It's not temperature - all system sensors report nominal temperatures, with the CPU sitting just under its thermal limit, which is apparently normal.

- The power supply can be heard to click when the system hard shuts down, which in my mind would be a power spike tripping the over-current protection, but that doesn't line up with the HWInfo logs right before the crash - unless the system spikes while trying to catch back up from the reduced load.

- Looking at the HWInfo64 logs, I would assume a driver crash, as the GPU clock and power draw notably drop before the crash, which I thought might lend credibility to the idea of it being power spiking.

 

Other notes:

- PBO is currently set to 90 Level 3 - this appears stable in our testing.

- Memory overclock currently remains off, though once this is resolved I would like to get it back up, so if anyone has advice for timings please do let me know.

- Other than that, the Warhammer crash is logged below.

 

What we're considering:

- Changing GPU link speed down to Gen 3 (could it be motherboard related?)

- Reinstalling Windows (it's a headache, so I'd like to avoid if I can, but I suppose hardware RMA would be more of a headache)

- Removing one of the RAM kits so we're down to 2x16 - it seems like a stretch, but it's the only thing I would expect to cause instability in the system, unless we got an unlucky chip.

 

Attached are HWInfo logs, both for the entire period of Total War being played, and then a condensed one of just before the crash - the last reading in the log is the last one before the crash.
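
For anyone who would rather script it than scroll through the CSVs, here is a minimal Python sketch of how the "GPU clock and power drop before the crash" pattern could be pulled out of a log like these. The column names ("GPU Clock [MHz]", "GPU Power [W]") are assumptions and need to be matched to whatever the header row of the actual file says, the 50% drop threshold is arbitrary, and the sample rows are made-up values for illustration, not readings from the attached logs.

```python
import csv
import io

# Hypothetical column names; adjust them to match the header row of the real log.
CLOCK_COL = "GPU Clock [MHz]"
POWER_COL = "GPU Power [W]"

def tail_readings(csv_text, columns, n=10):
    """Return the last n rows as dicts of floats for the given columns.

    Rows where a requested column is missing or not numeric (for example
    a repeated header line at the end of the file) are skipped.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        try:
            rows.append({c: float(row[c]) for c in columns})
        except (KeyError, TypeError, ValueError):
            continue
    return rows[-n:]

def first_drop(rows, column, fraction=0.5):
    """Index of the first row whose value falls below fraction * the peak
    seen in earlier rows, or None if no such drop occurs."""
    peak = None
    for i, row in enumerate(rows):
        value = row[column]
        if peak is not None and value < peak * fraction:
            return i
        peak = value if peak is None else max(peak, value)
    return None

# Made-up sample data for illustration only (NOT taken from the attached logs).
SAMPLE = """Date,Time,GPU Clock [MHz],GPU Power [W]
1.1.2024,12:00:00,2745,310.5
1.1.2024,12:00:02,2730,305.0
1.1.2024,12:00:04,210,38.2
Date,Time,GPU Clock [MHz],GPU Power [W]
"""

rows = tail_readings(SAMPLE, [CLOCK_COL, POWER_COL], n=10)
drop_at = first_drop(rows, CLOCK_COL)
print("rows parsed:", len(rows), "| first clock drop at row index:", drop_at)
```

To run it against a real log, read the file into a string first (for example `open("TotalWarWarhammerLogs_CRASH.CSV", encoding="latin-1").read()`; the encoding is a guess) and pass that in place of `SAMPLE`. Comparing where the clock drops against where the power drops should show whether the load fell away before the shutdown or at the same moment.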

 

Help me, LTT forums. You're my only hope.

AnyDesk_srSBMUw5SS.png

TotalWarWarhammerLogs_CRASH.CSV TotalWarWarhammerLogs_FULL.CSV


Not sure on the GPU front, but I will say DDR5 runs much better with 2 RAM sticks, and 4 can even cause pretty significant instability.

System specs:

 

 

CPU: Ryzen 7 7800X3D [-30 PBO all core]

GPU: Sapphire AMD Radeon RX 7800 XT NITRO+ [1050mV, 2.8GHz core, 2.6GHz mem]

Motherboard: MSI MAG B650 TOMAHAWK WIFI

RAM: G.Skill Trident Z5 NEO RGB 32GB 6000MHz CL32 DDR5

Storage: 2TB SN850X, 1TB SN850 w/ heatsink, 500GB P5 Plus (OS Storage)

Case: 5000D AIRFLOW

Cooler: Thermalright Frost Commander 140

PSU: Corsair RM850e

 

PCPartPicker List: https://uk.pcpartpicker.com/list/QYLBh3
