Laptop: Dell Precision M4500 with Intel Core i7-740QM, 8GB RAM, Quadro FX 1800M, 240 GB SSD

Fully up to date Arch Linux, proprietary Nvidia driver version 340.108 provided by nvidia-340xx-dkms in the AUR

 

A couple of weeks ago, my laptop overheated and shut down three times: once while setting up a VM and twice while playing Minecraft. Since there was already basically no dust, I repasted it (both the CPU and GPU, as they share a cooler), and the results seemed good. Weeks before the repaste, the CPU and GPU idled at 60-65 and went up to 75-85 under load, only rarely hitting 90. In the days leading up to the repaste, I was regularly seeing 90-95 on both the CPU and GPU under load. After the repaste, idle temps dropped to the 55-60 range and temps under load rarely touched 75. Additionally, the laptop now runs the fan at slower speeds and less frequently, and the keyboard, trackpad, palmrest, speaker grilles, and bottom panel are noticeably cooler. Despite this, the issues I'm having seem to have started after the repaste. I had no such problems before, and the only changes I made to the computer were the repaste and routine software updates that did not involve my kernel, Xorg, or my GPU driver. My knowledge about computers tells me that repasting won't break anything unless I crack the mobo or the dies (which are exposed on my machine), send static into something, or do something similarly catastrophic, in which case the machine would be completely dead. However, the correlation seems to be there, so I'm bringing it up in case it matters.

 

So far, the laptop has twice refused to wake up from sleep and had to be rebooted. The first time, the wi-fi LED came on and the fan spun up, but the screen did not come on, and it took several attempts to reboot because I had backlight but no video in the BIOS on the first few tries. The second time, the power LED stopped blinking and came on solid, but absolutely nothing else happened.

 

Additionally, on two separate occasions it froze and started displaying artifacts at the exact moment I lifted it up, and it stayed that way until I rebooted it. The pattern was the same both times, but the first time it covered the entire screen and the second time it was only in a few spots. I have attached a camera recording of a small portion of the screen from the second time. There was sensitive information on the screen, so I could not record the whole thing.

 

So does this mean my GPU is dying, or is there something else that could be causing this? And if it is the GPU dying, is it possible to make Linux do something reasonable like re-initializing the GPU as a workaround every time it poops itself instead of forcing me to reboot the whole system, or is this outside the control of the OS and/or the capabilities of Nvidia's shitty driver?
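One way to gather evidence for the question above is to check whether the Nvidia kernel module logged any "Xid" error reports around the time of a freeze. The snippet below is only a rough sketch: it assumes the driver writes lines of the form `NVRM: Xid (<PCI address>): <code>, ...` to the kernel log, that the journal is persistent so the previous boot can be read, and the function names `count_xid_events` and `read_previous_boot_kernel_log` are made up for this example. An empty result does not prove the GPU is healthy, since a hard hang may leave no time for anything to be logged.

```python
import re
import subprocess
from collections import Counter

# Assumed log format: "NVRM: Xid (PCI:0000:01:00): 79, ..." (older drivers
# may omit the "PCI:" prefix, so the address is matched loosely).
XID_PATTERN = re.compile(r"NVRM: Xid \(([^)]*)\): (\d+)")


def count_xid_events(log_text):
    """Return a Counter mapping each Xid code to how often it appears."""
    counts = Counter()
    for line in log_text.splitlines():
        match = XID_PATTERN.search(line)
        if match:
            counts[int(match.group(2))] += 1
    return counts


def read_previous_boot_kernel_log():
    """Return kernel messages from the previous boot via journalctl.

    Assumes a persistent systemd journal; returns an empty string if
    journalctl has nothing for that boot.
    """
    result = subprocess.run(
        ["journalctl", "-k", "-b", "-1", "--no-pager"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

After a forced reboot, calling `count_xid_events(read_previous_boot_kernel_log())` would list any Xid codes from the boot that froze, which can then be looked up in Nvidia's Xid documentation.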

Link to comment
https://linustechtips.com/topic/1236060-is-my-gpu-dying/

DDU is a Windows utility (we don't need that kind of stuff on Linux because we have package managers), but I'll try a driver reinstall anyways. Same idea.

 

I guess now I'm wondering whether it is possible to make Linux do something reasonable like re-initializing the GPU as a workaround every time it poops itself instead of forcing me to reboot the whole system, or whether this is outside the control of the OS and/or the capabilities of Nvidia's shitty driver.

 

Also, how would I make my next machine last longer? I don't want to have to replace my machines until they are obsolete.

Link to comment
https://linustechtips.com/topic/1236060-is-my-gpu-dying/#findComment-13936890

2 minutes ago, Zm1TDkSnQkY4KEqskCARSBpk said:

Any way to tell which? If it's just the display cable, I can replace it, but a dead GPU requires a replacement mobo and therefore I believe it totals the laptop.

wiggle the cable?

-sigh- feeling like I'm being too negative lately

Link to comment
https://linustechtips.com/topic/1236060-is-my-gpu-dying/#findComment-13936910

Just wiggling the cable does nothing. Completely disconnecting and reconnecting the cable causes the built-in screen and lid sensor to stop working until a reboot, and also seems to prevent the laptop from outputting to an external display if I connect one after the fact. The hard drive and wi-fi lights remain operational, though, so I don't think it brings down the whole system.

Link to comment
https://linustechtips.com/topic/1236060-is-my-gpu-dying/#findComment-13936934

If I completely disconnect and reconnect my built-in display while an external monitor is connected, the laptop still thinks the screen is there while it's unplugged. Once it is reconnected, it is reset to full brightness but otherwise continues to operate normally. So with this exercise, we've discovered that:

1. My screen cable is probably fine, so it is probably my GPU.

2. My laptop does not support hotplugging the internal display. (I wonder why /s)

3. Something about my laptop, most likely the graphics software stack, really hates not having any display connected at all.

 

For the next few weeks, I'll operate my laptop exclusively with dual monitors (VGA + internal). If problems occur only on the built-in screen, then I guess that means the problem has something to do with the display circuitry downstream of the GPU. If they affect both screens, it's probably the GPU.
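The dual-monitor test described above boils down to a small decision rule, sketched here as a minimal example. The function name `localize_fault` and the wording of the returned labels are made up for illustration, and the "external only" and "neither" cases are my own additions that the plan above does not cover.

```python
def localize_fault(internal_glitched, external_glitched):
    """Guess where a display fault lives from which screens showed it."""
    if internal_glitched and external_glitched:
        # Both outputs share the GPU, so a fault on both points there.
        return "probably the GPU"
    if internal_glitched:
        # Only the built-in panel is affected: suspect what is downstream.
        return "display circuitry downstream of the GPU (internal panel path)"
    if external_glitched:
        # Not covered by the original plan; listed for completeness.
        return "external display path (VGA output, cable, or monitor)"
    return "inconclusive"


def summarize(incidents):
    """Count the verdicts over a list of (internal, external) observations."""
    tally = {}
    for internal, external in incidents:
        verdict = localize_fault(internal, external)
        tally[verdict] = tally.get(verdict, 0) + 1
    return tally
```

Logging each incident as a pair of booleans and running `summarize` over a few weeks of observations would show whether the faults cluster on one screen or hit both.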

Link to comment
https://linustechtips.com/topic/1236060-is-my-gpu-dying/#findComment-13937012

7 hours ago, Zm1TDkSnQkY4KEqskCARSBpk said:

Also, how would I make my next machine last longer? I don't want to have to replace my machines until they are obsolete.

Well, your laptop is 10 years old at this point. That’s a pretty great run for a notebook especially considering it has a discrete GPU.


Link to comment
https://linustechtips.com/topic/1236060-is-my-gpu-dying/#findComment-13937758

I know it's 10 years old, but the screen, keyboard, and trackpad are still better than ones I see on many new machines today, it's still very fast, and it just generally holds up very well, so I do not consider it to be obsolete yet. This leaves me with a few questions:

 

1. When buying a new machine, is there any way to predict reliability other than brand reputation?

2. Could changing the thermal paste have possibly broken it, or is it just a coincidence?

3. Will a dead GPU continue to give me problems if I uninstall the GPU driver and use my machine as a server, possibly with the display removed?

Link to comment
https://linustechtips.com/topic/1236060-is-my-gpu-dying/#findComment-13938317