
Two GTX 1080 Tis Going Bad Exactly The Same Way At The Same Time?!?

GMart84

I have an older gaming PC that I have been using as a secondary in the house for years without issue.  About two months ago it started having issues when running any kind of game or benchmark.  It will load into a game or benchmark and start running, then within about 30 seconds it BSODs with a Critical Process Died or WHEA Uncorrectable Error code.  System Specs:

 

i7 8700K (No OC)

Asus Maximus X Hero WiFi AC

32 GB Corsair RAM (XMP On)

240 GB NVME

500 GB SSD

2x GTX 1080 Ti (EVGA SC Black, No OC)

1000W Corsair HX PSU

Custom Watercooled

 

After the first BSOD, I tried a different PSU, removing and repasting/re-padding both GPUs, running each GPU independently in the system, clean installs of Windows, the base Windows NVIDIA driver, a new NVIDIA driver, underclocking both GPU and VRAM, increasing voltage 10%, decreasing voltage 10%, updating the vBIOS, updating the motherboard BIOS, updating all drivers, and then trying everything together again.

 

Symptoms are exactly the same in each test: it works for about 30 seconds, then BSODs.  Interestingly enough, OCCT can complete an hour-long test on both GPU and VRAM without a hitch.

 

What are the odds of two cards dying in the exact same way with the exact same symptoms at the same time?

BLACK and BLUE Build

i9-9900K - 5.2 Ghz @ 1.305 vCore, 32 GB Corsair Vengeance RGB Pro (@ 3200 Mhz), Gigabyte Aorus Z390 Extreme, EVGA RTX 2080Ti XC Ultra, Samsung 970 Pro, Samsung 970 EVO, Dual Custom Loop Cooling, Thermaltake Tower 900, AX1500i

 

VR Build

i7-8700K - 5.1 Ghz @ 1.36 vCore, 32 GB G.Skill TridentZ RGB (@ 3200 Mhz), Asus Maximus X Hero (Wi Fi ac), 2x EVGA GTX 1080Ti SC Black Edition, Toshiba NVME, Custom Loop Cooling, Thermaltake Core P5, HX1000i

 

FreeNAS Server Build

Pentium G5400, 8 GB Kingston HyperX Fury (@ 2400 MHz), Asrock H370M-ITX/ac, Intel 320 System SSD, 4x Hitachi 7200K 4 TB HDD, Thermaltake TR2 650W, Cooler Master Master Liquid Lite 120, Bit Fenix Prodigy

 

Daughter's First Build

Core i3-6100, 16 GB Corsair Vengeance LPX (@ 2133 MHz), Asrock H270M-ITX/ac, XFX RX-580 GTS, Custom Watercooling (Both CPU and GPU) 2x Corsair Force LS, PowerSpec 550w, NZXT H200i


Did you try with only 1 GPU? Not sure SLI is still supported and working correctly in new games...

System : AMD R9 5900X / Gigabyte X570 AORUS PRO/ 2x16GB Corsair Vengeance 3600CL18 ASUS TUF Gaming AMD Radeon RX 7900 XTX OC Edition GPU/ Phanteks P600S case /  Eisbaer 280mm AIO (with 2xArctic P14 fans) / 2TB Crucial T500  NVme + 2TB WD SN850 NVme + 4TB Toshiba X300 HDD drives/ Corsair RM850x PSU/  Alienware AW3420DW 34" 120Hz 3440x1440p monitor / Logitech G915TKL keyboard (wireless) / Logitech G PRO X Superlight mouse / Audeze Maxwell headphones


Just now, PDifolco said:

Did you try with only 1 GPU? Not sure SLI is still supported and working correctly in new games...

Yes, I tried it with each of the GPUs independently in the system.  Same error each time.  SLI isn't well supported, but it had been working reasonably well until two months ago, and I believe the Heaven benchmark supports it.


2 minutes ago, WereCat said:

Have you tried to run some memory test to check if it's not your RAM causing issues? 

I am assuming you mean the 32 GB of system RAM and not the VRAM.  If that is the case, I have not run memtest on it, but that is a good suggestion and I will try that now.  Not sure why I forgot to do that.


WHEA Uncorrectable Error BSODs are almost always RAM related, so I would double-check the RAM kit; maybe disable XMP and run a stress test to see if that caused it 🙂 Then test 3000 MHz and such.

I'm not surprised that a 6-year-old card that has had heavy use would start to show signs of wear, especially if both have been subjected to the same workload, but it would be a bit odd if it was something specific, like another component wearing out and then causing them to degrade faster.


3 minutes ago, BiG StroOnZ said:

I've heard that WHEA errors could be CPU related. Something worth investigating. It could also be memory or SSD/HDD related. 

Yeah, this has been a puzzling one for me.  I have not yet replaced the NVME drive or SSD but have a couple of new ones lying around that I could try.  In terms of the CPU, does that mean fully removing the chip and inspecting it, or is it something I should be able to test?  I have completed OCCT for over an hour on the CPU, never getting above about 60 degrees with it.


4 minutes ago, Shimejii said:

WHEA Uncorrectable Error BSODs are almost always RAM related, so I would double-check the RAM kit; maybe disable XMP and run a stress test to see if that caused it 🙂 Then test 3000 MHz and such.

I'm not surprised that a 6-year-old card that has had heavy use would start to show signs of wear, especially if both have been subjected to the same workload, but it would be a bit odd if it was something specific, like another component wearing out and then causing them to degrade faster.

Non-XMP had an identical failure.  I will try testing without the 4 sticks next.

 

You're right about the 6+ year old card; I am just surprised that there were literally no warning signs of them going.  Everything ran perfectly right up until the BSODs started.  They occur on both Windows 10 and 11 as well, so I am beginning to think both cards are just bad.


1 minute ago, GMart84 said:

Yeah, this has been a puzzling one for me.  I have not yet replaced the NVME drive or SSD but have a couple of new ones lying around that I could try.  In terms of the CPU, does that mean fully removing the chip and inspecting it, or is it something I should be able to test?  I have completed OCCT for over an hour on the CPU, never getting above about 60 degrees with it.

 

Yeah, it might be worth trying one of those other drives. It could also be memory/RAM related; you could try individual sticks and different slots before replacing the kit entirely. Running MemTest86 might also streamline the process. But if you're hands-on, it might be easier for you to test 1 stick at a time, in different slots, or with a different kit completely.

As far as the chip goes, it wouldn't hurt to see if there is any pin damage in the socket or any noticeable damage on the CPU. You might want to try other CPU stress tests like AIDA64 Extreme or Prime95. Even a loop of Cinebench R23 is pretty intensive. Temps do not necessarily indicate whether a CPU is damaged; it could have degraded already. Have you tried increasing VCORE on the CPU when stress testing? There could have been some slight degradation on the CPU too, and it might require slightly higher than stock VCORE to run stable.


6 minutes ago, NastyFlytrap said:

Wait, so are you swapping between two 1080 Ti's or are you doing SLI?

If it's SLI, then it's enough if one piece in the chain is faulty, and that can cause bluescreens as well, even without either of the cards being dead. Hell, I think it's probably the SLI bridge doing it, or maybe even the motherboard; neither of those two lasts very long, to be honest.

I have tested both the individual cards and SLI.  I have one additional bridge I can try as well.  I had it running as SLI from the beginning, and when the issues started, I decided to swap the cards out individually to see if one was good and the other bad.  Thank you for the suggestion on the SLI bridge; I had not thought of that either.  However, since the symptoms are present when either card is installed individually, I am left with 4 potential causes that I am working on eliminating before just saying the cards are bad.

 

1. Faulty Memory - currently testing each stick individually on a bench with memtest86

2. Faulty CPU - About to disassemble PC to check pins and all that on the CPU

3. Faulty SSD - Once I test both of the above I will change out the NVME and SSD and see if that is the problem

4. Faulty Motherboard - I will test the cards in a different system once I get a water cooled bench setup for testing (this is the biggest pain about water cooling: troubleshooting is a pain)

 

If all else fails then I will come to the conclusion that the cards are bad, which I am hoping is not the case, but is seeming more likely.
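The elimination plan above can be sketched as a small checklist in Python. This is only an illustration of the reasoning, not a diagnostic tool: the suspect names, the `remaining_suspects` helper, and the example results are all hypothetical, and a passing test only makes a component less likely to be at fault rather than proving it good.

```python
# Hypothetical sketch of the elimination plan: the four suspects from the
# list above, with the GPUs as the fallback conclusion if all else passes.
SUSPECTS = ["memory", "cpu", "ssd", "motherboard", "gpus"]


def remaining_suspects(suspects, passed):
    """Return the suspects, in original order, not yet cleared by a passing test."""
    cleared = set(passed)
    return [s for s in suspects if s not in cleared]


# Made-up example: assume the memory test and the CPU check both came back clean.
print(remaining_suspects(SUSPECTS, ["memory", "cpu"]))
```

The GPUs are deliberately listed last, so they are only left standing once every other suspect has been cleared.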


When you're testing the cards individually, are you testing them only in the top PCIe slot? If so, it may be worth trying them both in the second PCIe slot just to rule out the top slot being defective.


You mention this is a secondary system? Have you tried either card in your primary system to see if the same error occurs? Would seem to be the simplest initial test of each GPU.


18 minutes ago, bmx6454 said:

When you're testing the cards individually, are you testing them only in the top PCIe slot? If so, it may be worth trying them both in the second PCIe slot just to rule out the top slot being defective.

Yes, they have been tested in the top PCIe slot.  Your suggestion is a good one.  I have a 1660S that I could test the slots with, so I don’t have to remake tubes for the hardline water cooling just yet.


8 minutes ago, DigitalGoat said:

You mention this is a secondary system? Have you tried either card in your primary system to see if the same error occurs? Would seem to be the simplest initial test of each GPU.

Given the amount of time I have spent troubleshooting at this point it may have been easier to drain the primary system and put this into it.   I really need to get a water cooled test bench to make this easier.


Update:

 

Memtest ran for three hours with no failures on the original RAM, the 1660S in the second slot worked like a charm, and the CPU socket and CPU both look good.  Next is moving the water cooling system so that I can test the first slot with the 1660S.
