
PCIE Errors out the butt

Solved by Atmos.


Alright. So, I'm pretty sure I've reached the proper conclusion, but I just want to sanity check myself before I completely give up any kind of hard gaming for the next month.

 

Right, so. Previous system core components were:


4790k

asus z97-(something)

EVGA 1080 FTW (the problem child)

Antec HGC 520w

 

13 months ago, IIRC, my first EVGA 1080 blew up. Luckily it was within the last month of its warranty, so I RMA'd it and got a replacement from them.

About two months after I received the replacement, it started crashing out of games. Knowing my luck, the warranty had ended the previous month, so RMA was no longer an option :) . Underclocking it by about 120 MHz pretty much fixed the issues. I didn't really care, because even underclocked it was benchmarking at above-average performance for 1080s. Crashes would typically state something along the lines of "graphics driver failure" or "graphics device driver stopped working". When I first disassembled the card after I started noticing issues with it, I did find some minor liquid damage marks, but there appeared to be no corrosion, and it was in a fairly discrete section of the back of the PCB, away from board components.

 

Fast forward to the last few days.

 

System upgrade time, so streaming and video editing aren't as painful anymore: 3700X, 16 GB HyperX Predator, B450 Aorus, 1440p 144 Hz display, and a fresh install of Windows on a new Kingston bargain-bin SSD.

System instability is extreme in games now: BSODs and crashes to desktop every hour or so. Event Viewer is filled with PCIe errors, literally hundreds upon hundreds per day. Whether the system is idle or gaming, the PCIe errors are constant. Underclocking no longer changes the frequency of crashes.
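For anyone wanting to put a number on "hundreds per day", here is a minimal Python sketch that tallies events per calendar day. It assumes you have already pulled the SystemTime strings out of the exported WHEA-Logger events yourself; the function name and the sample timestamps are made up for illustration and are not real log data.

```python
from collections import Counter


def errors_per_day(timestamps):
    """Count events per calendar day from ISO-8601 SystemTime strings."""
    # The first 10 characters of an ISO-8601 timestamp are YYYY-MM-DD,
    # so slicing avoids parsing the 9-digit fractional seconds.
    return Counter(ts[:10] for ts in timestamps)


# Hypothetical sample timestamps in the format Event Viewer reports.
sample = [
    "2019-11-22T07:55:05.421146900Z",
    "2019-11-22T07:55:09.000000000Z",
    "2019-11-23T01:02:03.000000000Z",
]

for day, count in sorted(errors_per_day(sample).items()):
    print(day, count)
```

Note that SystemTime is in UTC, so the day boundaries will not line up with local time.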

 

  • I've checked that temps are not too high; the highest recorded temps are 61°C on the CPU and 72°C on the GPU during synthetic benchmarking.
  • I've run Windows Memory Diagnostic and the RAM checks out fine with no issues.
  • I've reinstalled and updated Windows to the latest version, which seems to have fixed the BSODs; I suspect the cause was the old Windows 10 install I used, which lacked complete support for Ryzen 3000.
  • I've reinstalled the drivers, and have worked through two separate driver versions now.
  • I've ensured all SATA connections and all power cables are affixed properly.
  • I've ensured the PSU is not overheating.
  • I have not tested the system with a separate power supply, although from what I understand the Antec HGC was a very solid PSU and scored quite well in reviews.

From my experience and general knowledge, this appears to be a component-level failure on the GPU... right? That's what everything is pointing to, right? I suppose I could try another power supply, but that would mean buying something I would want to replace my current one with, and upgrading from what I have is just going to be another $130 going into this upgrade. I reckon that the system upgrade and more demanding monitor have pushed the GPU a bit too hard in its already weakened state, and now it's just unstable af... right? I'm not insane, right?

Updated 2021 Desktop || 3700x || Asus x570 Tuf Gaming || 32gb Predator 3200mhz || 2080s XC Ultra || MSI 1440p144hz || DT990 + HD660 || GoXLR + ifi Zen Can || Avermedia Livestreamer 513 ||

New Home Dedicated Game Server || Xeon E5 2630Lv3 || 16gb 2333mhz ddr4 ECC || 2tb Sata SSD || 8tb Nas HDD || Radeon 6450 1g display adapter ||

https://linustechtips.com/topic/1126285-pcie-errors-out-the-butt/

Here is one of the PCIe errors as reported by Event Viewer, in both the general and full views, since I get no other information from Windows or games on them anymore.


+ System
  - Provider
     [ Name]  Microsoft-Windows-WHEA-Logger
     [ Guid]  {c26c4f3c-3f66-4e99-8f8a-39405cfed220}
    EventID 17
    Version 1
    Level 3
    Task 0
    Opcode 0
    Keywords 0x8000000000000000
  - TimeCreated
     [ SystemTime]  2019-11-22T07:55:05.421146900Z
    EventRecordID 2826
  - Correlation
     [ ActivityID]  {1f9affc9-3df2-49c3-b1b9-1159a4a98f02}
  - Execution
     [ ProcessID]  4500
     [ ThreadID]  3012
    Channel System
    Computer Hermes-II
  - Security
     [ UserID]  S-1-5-19

- EventData 

  ErrorSource 4 
  FRUId {00000000-0000-0000-0000-000000000000} 
  FRUText  
  ValidBits 0xdf 
  PortType 1 
  Version 0x101 
  Command 0x10 
  Status 0x7 
  Bus 0x6 
  Device 0x0 
  Function 0x0 
  Segment 0x0 
  SecondaryBus 0x0 
  SecondaryDevice 0x0 
  SecondaryFunction 0x0 
  VendorID 0x10de 
  DeviceID 0x1b80 
  ClassCode 0x18000 
  DeviceSerialNumber 0x0 
  BridgeControl 0x0 
  BridgeStatus 0x0 
  UncorrectableErrorStatus 0x0 
  CorrectableErrorStatus 0x1000 
  HeaderLog 00000000000000000000000000000000 
  PrimaryDeviceName PCI\VEN_10DE&DEV_1B80&SUBSYS_62863842&REV_A1 
  SecondaryDeviceName  
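The CorrectableErrorStatus field above is a bitmask, so it can be decoded into named error types. The sketch below is a rough Python illustration that assumes the value mirrors the PCI Express AER Correctable Error Status register; the bit names come from that register layout, and the function name is made up for this example.

```python
# Bit positions of the PCIe AER Correctable Error Status register.
# Assumption: the WHEA CorrectableErrorStatus value uses this same layout.
AER_CORRECTABLE_BITS = {
    0: "Receiver Error",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal Error",
    14: "Corrected Internal Error",
    15: "Header Log Overflow",
}


def decode_correctable_status(value):
    """Return the names of all bits set in a correctable error status value."""
    names = []
    for bit in range(32):
        if value & (1 << bit):
            # Bits not listed above are reserved or unknown to this sketch.
            names.append(AER_CORRECTABLE_BITS.get(bit, f"reserved/unknown bit {bit}"))
    return names


print(decode_correctable_status(0x1000))  # ['Replay Timer Timeout']
```

Under that assumption, 0x1000 is bit 12, a Replay Timer Timeout: the link had to retransmit a packet because it was not acknowledged in time. That is a link-level symptom and does not by itself say which end of the link is at fault.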

 


A corrected hardware error has occurred.

Component: PCI Express Legacy Endpoint
Error Source: Advanced Error Reporting (PCI Express)

Primary Bus:Device:Function: 0x6:0x0:0x0
Secondary Bus:Device:Function: 0x0:0x0:0x0
Primary Device Name:PCI\VEN_10DE&DEV_1B80&SUBSYS_62863842&REV_A1
Secondary Device Name:
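The Primary Device Name string identifies which device logged the error. Here is a small Python sketch that splits it into its fields, assuming the usual Windows PCI hardware ID layout (VEN, DEV, SUBSYS as subsystem ID followed by subsystem vendor ID, REV). The function name and the tiny vendor table are illustrative only; the vendor names are from memory of the PCI ID registry, not from the log.

```python
import re

# Small lookup table for this example only (assumed from the PCI ID registry).
VENDOR_NAMES = {0x10DE: "NVIDIA", 0x3842: "EVGA"}

HARDWARE_ID_RE = re.compile(
    r"PCI\\VEN_([0-9A-F]{4})&DEV_([0-9A-F]{4})"
    r"&SUBSYS_([0-9A-F]{4})([0-9A-F]{4})&REV_([0-9A-F]{2})",
    re.IGNORECASE,
)


def parse_pci_hardware_id(hardware_id):
    """Split a Windows PCI hardware ID string into integer fields."""
    match = HARDWARE_ID_RE.fullmatch(hardware_id)
    if match is None:
        raise ValueError(f"not a recognised PCI hardware ID: {hardware_id!r}")
    vendor, device, subsys, subsys_vendor, rev = (int(g, 16) for g in match.groups())
    return {
        "vendor_id": vendor,
        "device_id": device,
        "subsystem_id": subsys,
        "subsystem_vendor_id": subsys_vendor,
        "revision": rev,
    }


info = parse_pci_hardware_id(r"PCI\VEN_10DE&DEV_1B80&SUBSYS_62863842&REV_A1")
print(VENDOR_NAMES.get(info["vendor_id"], "unknown"))
print(VENDOR_NAMES.get(info["subsystem_vendor_id"], "unknown"))
```

Vendor 0x10DE with device 0x1B80 corresponds to the GTX 1080, and the subsystem vendor should be the board maker, so the events are being logged against the EVGA card at bus 6.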

 

 


 

CPU: Ryzen 5800X3D | Motherboard: Gigabyte B550 Elite V2 | RAM: G.Skill Aegis 2x16gb 3200 @3600mhz | PSU: EVGA SuperNova 750 G3 | Monitor: LG 27GL850-B , Samsung C27HG70 | 
GPU: Red Devil RX 7900XT | Sound: Odac + Fiio E09K | Case: Fractal Design R6 TG Blackout |Storage: MP510 960gb and 860 Evo 500gb | Cooling: CPU: Alphacool ST30 420mm rad, Alphacool CPU and GPU Core LT and Core blocks, D5 pump and res combo 


38 minutes ago, DoctorNick said:

-snip-

So, again, it's reporting that the 1080 keeps throwing errors that get corrected.

So... it is the card, since the card is the only component the two systems have in common at this point.


  • 2 weeks later...

I'm going to move backwards 8 feet and cast Create Undead on previous thread.

 

Update:

Turns out my 1 issue is/was actually 3! Yay!

 

  1. My power supply apparently took a beating during the outages in my area over the last couple of years. It is on the brink of failure, so I acquired a replacement: an RM750 bought on Black Friday for a solid deal. This was responsible for some of the BSODs and the game crashes.
  2. My 1080 did indeed have water damage. Removing the backplate and re-examining the PCB revealed more liquid damage than I originally recalled. It has since been retired and replaced with a 2080 Super. This was responsible for the driver crashes.
  3. My new 16 GB kit of 3200 MHz Predator RGB is also non-functional: 9% failure rate on 4 passes of Memtest86, with failures across the board on test #9 (modulo 20). Temporary replacement (thanks, Walmart) is 3000 MHz LPX while I wait for my Predator RMA to go through. This was likely responsible for the infrequent MEMORY_MANAGEMENT BSODs I was getting.

Determined the CPU is functional. A 2-hour Prime95 run ruled out anything but RAM, and Memtest86 only further proved that.

Well. It could be the motherboard. That would be a bitch to deal with right now, because Micro Center is like a 2 1/2 hr drive and I'm not buying a temp mobo. For future onlookers: once I rebuild this system, I'll update again with whether or not it was the motherboard.

 

 
