
In need of help troubleshooting strange GPU issue.

Specs:

 

CPU: Ryzen 5 3600 (used, 2 years old- in use by me for less than a month)

RAM: 2x8GB DDR4 HyperX Fury (in use for nearly 5 years)

Motherboard: AsRock B550 Steel Legend (new + in use for less than a month)

GPU: ASUS ROG Strix RX480 8GB (in use for nearly 5 years)

PSU: Seasonic Focus GX850 (new + in use for less than a month)

OS: Windows 10, 64bit

 

 

Introduction(ish), what led to the issue:

 

So, today I realized that my GPU actually had RGB on its fan shroud. I hadn't seen it working for years though (or, at least for a long enough time that I've completely forgotten if it ever worked). After looking into the issue, it seems that the fan shroud only has two RGB LEDs (which worked fine and which I could see were lit on the right side of the GPU), and the actual light you see from the outside is conducted through a sort of fiber optics "cable". A video I watched about this demonstrated how to replace these, and that the issue was due to discoloration (presumably due to age/heat) in the light conductors. (Here's the vid: https://youtu.be/9dlRzbT4IJw)

 

As the video showed that removing the fan shroud was as simple as removing 6 screws and unplugging the cable that powers the RGB lighting, I figured I could just try moving the light conductors closer to the LEDs and see if that helped, rather than buying entirely new conductors. I took my GPU out of my system, took off the fan shroud, and shifted the light conductors closer to the LEDs. Then I put the shroud back on, mounted the GPU back in and booted up my PC. Everything seemed to work fine, although fiddling with the light conductors didn't seem to have helped much. Unfortunate, I thought, but whatever.

 

 

The actual issue:

 

(I originally wanted to write down in chronological order what happened and how my perception of the issue changed, however my memory isn't great and I've forgotten the details. Most of what I tried doesn't seem to be relevant to the actual issue, although I have noted it down below)

 

Now, anytime I try to launch anything GPU-demanding, temps skyrocket along with fan speeds (audibly) until temps reach around 90°C or more, at which point, if I don't close whatever is using the GPU, my PC shuts off within seconds. I presume this is a safety precaution, considering the GPU's max temp is 90°C.

 

I can actually see the fans ramping up (MSI Afterburner open on my second screen) as well as hear them ramp up. The temperature curve even shows a considerable drop in temperature after the initial spike, but it then spikes again once fan speeds drop. At this point (when reaching the second temp spike), it crashes. No artifacts, nothing. Just a black screen.

After toying around (mostly with the power limit so I can run the GPU at full load and compare temperatures with what I had before), it seems like it runs 10 to 15, maybe even 20 degrees hotter. E.g. I set the fan speed to 100%, launched Valheim and within half a minute of just sitting on the main menu screen, it hits 90C, up to 91 (at which point I closed the game again). This definitely isn't normal and I'm absolutely puzzled by it.

 

 

What I have tried to resolve this:

 

  • Disabled any OCs I had applied to CPU/RAM, just in case.
  • Tried reseating the GPU as well as checking if any of its fan cables were getting squeezed by the fan shroud.
  • Looked over the GPU for any potential damage, especially parts of the PCB that were exposed. Found nothing.
  • Made sure the PCI-E power cable was properly plugged in. The 6+2pin plugs are latched in and the +2 plug is flush with the 6pin plug.
  • Full driver reinstall using DDU (I originally thought there were fan control issues), then a..
  • "Simple" driver reinstall: Uninstalled AMD Software through Windows settings and installed the recommended version from AMD's website (for the DDU reinstall above, I had installed the newest version), then..
  • Uninstalled MSI Afterburner, then installed it again. At this point, I am certain this can't (shouldn't?) be a software-side issue with temperature detection and/or fan control.

 

 

Other things that may be of note:

 

  • I have tried changing my fan curve in MSI Afterburner, which didn't help. I have also had this current fan curve for upwards of a year without any issues.
  • This GPU has been working fine for nearly five years. I have never had comparable issues, only crashes that were driver-related. My temps would always stay at around 75 degrees under load, and I had intentionally set my fan curve so that my GPU would run a bit hotter but remain more silent.
  • My system in this configuration has been working fine for the last two weeks. I had some crashes 2-3 days after my upgrade because the GPU fans wouldn't kick in at all. This, I believe, was caused by MSI Afterburner, and it had stopped happening.
  • One of the fins(?) on the GPU heatsink is slightly bent at its tip. I upgraded my computer on the 25th and noticed it at that point. I believe it may have been there before my upgrade, although I can't be sure. Though surely there is no way this could have caused this.
  • I made sure to ground myself before and while working on the GPU.
  • The fan shroud is not near/attached to the PCB, and I doubt there could have been physical damage to it as a result of taking it off. The GPU fans are powered via a separate cable that is not attached to the fan shroud, and did not need removing. There also isn't anything caught/stuck in the fans. They all spin and I can hear them ramp up under load.
  • Removing the fan shroud did not involve removing the GPU heatsink or backplate. There was no work done to the backplate or heatsink, aside from screwing the fan shroud back into the heatsink.
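As a rough illustration of why changing the fan curve can't fix this, here is a minimal Python sketch of how a piecewise-linear fan curve maps temperature to fan speed. The curve points are made up for the example (they are not my actual curve), and it assumes the tool interpolates linearly between points. Once the temperature is past the last point, the fans are already pinned at 100%, which matches what I saw when I forced 100% manually and still hit 90°C.

```python
from bisect import bisect_right

# Hypothetical fan curve as (temperature in °C, fan speed in %) points.
# These values are invented for illustration only.
FAN_CURVE = [(40, 20), (60, 40), (75, 60), (85, 100)]


def fan_speed(temp_c, curve=FAN_CURVE):
    """Return the fan speed (%) for temp_c using linear interpolation."""
    temps = [t for t, _ in curve]
    # Below the first point or above the last point, the curve is flat.
    if temp_c <= temps[0]:
        return float(curve[0][1])
    if temp_c >= temps[-1]:
        return float(curve[-1][1])
    # Find the segment that contains temp_c and interpolate within it.
    i = bisect_right(temps, temp_c)
    (t0, s0), (t1, s1) = curve[i - 1], curve[i]
    return s0 + (s1 - s0) * (temp_c - t0) / (t1 - t0)


for temp in (50, 75, 90):
    print(temp, fan_speed(temp))
```

With the fans already saturated at the top of the curve, any remaining temperature rise has to come from somewhere other than fan control, e.g. heat transfer into the heatsink.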

 

 

At this point, I'm out of ideas. I have tried looking this up but could not find anything so far. I'm quite confident this can't be due to the fans themselves, as they sound, look and seem to be performing (according to MSI Afterburner/Radeon Settings) the same as before. It seems as if something's up with heat transfer between the actual die and heatsink, which baffles me as (as mentioned) the heatsink wasn't removed from the PCB at all. I didn't loosen the screws holding it.

 

Please let me know if there is anything else I may have missed, whether it's potential fixes or something that needs testing. Thanks.

Edited by Revyn
Added resolution attempt (overclocks)

You need to repaste and remount the cooler.   This may entail paste and pads.  

 

 


14 minutes ago, Heliian said:

You need to repaste and remount the cooler.   This may entail paste and pads.  

 

 

Huh.. I hope you don't mind me asking, but how could removing just the fan shroud from a GPU result in needing to repaste within a day?

I worry that this may somehow not be the issue and I may end up with an overheating GPU even after repasting..


2 hours ago, Revyn said:

Huh.. I hope you don't mind me asking, but how could removing just the fan shroud from a GPU result in needing to repaste within a day?

I worry that this may somehow not be the issue and I may end up with an overheating GPU even after repasting..

My guess is that you disturbed the mounting of the cooler when you pulled the fan shroud, and the 5+ year old paste was dried out.

 


On 1/10/2022 at 6:03 AM, Heliian said:

My guess is that you disturbed the mounting of the cooler when you pulled the fan shroud, and the 5+ year old paste was dried out.

 

I loosened the screws holding the heatsink to the PCB a little today and then tightened them again. Temps aren't fully back to normal, but the GPU can now run at full load and sits at about 82 degrees. So I suppose that confirms it. I had also noticed the heatsink wiggling a little while taking the GPU out.
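To put rough numbers on "not fully back to normal", here is a small Python sketch that uses only figures already mentioned in this thread: about 75 degrees under load before, about 82 now, and 90°C as the max temp. The variable and function names are just for illustration.

```python
# All three figures come from the posts in this thread.
BASELINE_LOAD_C = 75    # usual load temperature before the shroud came off
AFTER_RETIGHTEN_C = 82  # full-load temperature after re-tightening the heatsink screws
MAX_TEMP_C = 90         # max temp quoted in the original post


def headroom(temp_c, limit_c=MAX_TEMP_C):
    """Degrees left before temp_c reaches limit_c."""
    return limit_c - temp_c


print("headroom before:", headroom(BASELINE_LOAD_C))
print("headroom now:", headroom(AFTER_RETIGHTEN_C))
print("still hotter than baseline by:", AFTER_RETIGHTEN_C - BASELINE_LOAD_C)
```

So the card has roughly half the thermal headroom it used to have, which is why a proper repaste still seems worth doing.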

I'm probably doing a proper repaste this week, and I'll use the opportunity to replace the RGB light conductors & thoroughly clean the heatsink. Thanks for the help.

