
Motherboard, or CPU? Which is causing the issue?

Okay, so I've been posting on here quite a bit the past few days about an ongoing issue, but after some of the replies I'm not sure what to think anymore about where the issue arises.

I'm not going to leave anything out, and I'm going to go over the whole timeline of events so you get the full context of what has happened and what I've done so far.

 

So the issue started just over a week ago, when I noticed my PC fans were being very loud even at idle. I opened up HWiNFO and saw temps were in the high 60s even at idle, and it was throttling in even the most basic of games. Other than this, the PC was working flawlessly. I opened this thread for advice.

In that thread, it seemed the overwhelming majority of people pointed the finger at a faulty MSI AIO.

 

So, I got a refund for it under warranty, and ordered a Thermalright Peerless Assassin to replace it.

 

A few days later, the air cooler arrived, along with a Thermalright contact frame I ordered.

I got to work replacing the cooler: taking out the old cooler, removing the stock ILM around the CPU socket (the CPU was still in the socket; it never left the socket) and fitting the contact frame, cleaning with isopropyl and applying thermal paste as needed as I went. I then attached the CPU cooler and plugged my PC back in.

 

Unfortunately, the GPU wasn't outputting to my screens, and I was only getting an output on the screen I have plugged in to the integrated graphics (I use both, as one of my monitors is just for Steam and Discord, so the iGPU is more than adequate for that. Plus, I have more screens than ports on my GPU).

At this point, I went on the /r/techsupport Discord for help, and while I was doing that, I looked in Device Manager just to be sure, and sure enough my RX 6800 XT was not listed anywhere. I did notice one oddity though, and it stood out to me in Device Manager because it mentioned PCI (see attached image below). Though it's just a "Driver not found" error (Code 28), and for all I know, it was there for a long time before this. Still, it stood out to me.

sddasd.png

I was told to re-seat the GPU and try again. When I did so and next turned on the PC, it proceeded to boot... and then shut off and turn on again, a couple of times, before remaining on. I was not getting an output from the integrated graphics then either. I got no error beep codes, and none of the debug LEDs remained lit for more than a second. It's entirely possible it was getting to Windows but, for whatever reason, wasn't outputting to the screens.

 

At that point, I began breadboarding. I took out the GPU and all but one stick of RAM and booted it up, and it got to Windows just fine. I then tested every stick of RAM I had in that single DIMM slot, and then a known-good RAM stick in each slot; all were working. I then re-populated all 4 slots.

 

I then put the GPU back in, but this time I put it in one of the lower slots, the ones that have fewer active lanes. And huzzah! I got an output from the GPU! ...But unfortunately, this obviously wouldn't be a long-term solution.

I then put the GPU back in to the top slot, and this time, instead of booting and rebooting itself, I got to Windows... but still no output from the GPU.

We never figured out what the boot looping was about, but it only happened that one time in between moving hardware around, so it may have been a fluke. Who knows?

The /r/techsupport subreddit Discord concluded that the top PCIe slot was somehow damaged or otherwise faulty. No worky. That seemed perfectly logical to me, and it reminded me of the LTT video "All our data is gone" where the PCIe was bad, though at no point while changing the CPU cooler did I ever move the GPU beyond exceptionally light wiggles as I moved my arms around, nothing enough to damage it. The only thing I can think of that could have caused it was plugging the PC back in: when I did, the power cord was still live, plugged in to the wall, but the PSU itself was switched off. Perhaps a power surge? Unlikely, as other places in the world don't have switches on the wall sockets like where I live, so I would imagine it would be a risk there if nothing else... But I'm honestly not sure. I mentioned this to /r/techsupport but got no comment.

 

So, I naturally started to look at replacing the motherboard, ideally with one of the same or a similar model, and I posted this thread asking about it.

 

However, in that thread, someone suggested, after I explained the situation and after seeing the image I also attached here, that it may be the CPU and not the motherboard that is at fault, and it gave me pause. After all, I was working around the CPU when I replaced the cooler, not the PCIe slot, and it's left me a bit confused as to what it could be. They mentioned that the PCIe lanes on the CPU may somehow have been damaged, and the attached image made them suspect this. However, it is worth noting that the NVMe SSD I have installed is still working fine. It's just the GPU, when plugged in to the top slot, that's not detected.

I have no other CPUs or motherboards I could use to test either one. I do have an old HD 6970 from 2012, a 12-year-old GPU, but considering my current card was outputting when placed in a lower PCIe slot, I... doubt the GPU is at fault? I've mostly dismissed it as a potential culprit at this point, but I honestly have no idea.

 

I ran Cinebench to test the CPU and see if it would crash, in case that pointed to something. It did not, though I did notice that it was immediately thermal throttling at 100% load, and I mentioned this in the first thread I linked here, as it was still more or less the same topic. I think that may have just been because I put on a bit too much thermal paste and, instead of installing both fans on the CPU heatsink, I placed one an inch away as a case exhaust, pulling air through the other half of the heatsink tower... Otherwise, idle temps were only about 5ish degrees higher than graphs I could find online, and the new cooler was a good 20 degrees cooler at idle than the old AIO. Without a GPU I can't test temps while gaming, unfortunately, and this may have just been a bad thermal paste application. No clue. So... I can only assume this issue isn't related to the GPU not being detected? Just thought I'd mention it though, as I said I would lay everything out.

 

As an aside, I found several "Metadata staging failed" errors in Event Viewer, under the source "DeviceSetupManager", whenever I turn on my PC. I have no idea if it's related or not, but again, just thought I'd mention it.

 

So yea....

That's basically where I'm at right now. A bit confused as to whether it's the motherboard or the CPU. Or even something else entirely.

I bought both the CPU and motherboard at the same time on Amazon in November of 2022, and I'm not sure about my prospects of getting a refund for both, especially if I ask for the refunds at separate times. I was lucky enough getting the refund for the CPU cooler....

So basically I think I just need some fresh eyes on the problem that might help me muddle through all the facts and make some informed guesses as to which is the most likely culprit.

 

Should I just go with my original plan of replacing the motherboard and seeing if anything changes? Though as mentioned, I'm concerned that if I do that, I may not be able to get a refund on the CPU if that is the culprit.

Thanks in advance!
