
Trying to figure out the problem with this laptop again.

So some of you might remember that refurbished HP EliteBook G5 (Coffee Lake) that was bought for my sister for college but ended up having constant bluescreens/crashes. After it came into my hands, the hands of a highly experienced Tech Enthusiast of course, I found out that disabling Turbo Boost, and hence locking the processor to exactly 1.69 GHz (nice), solved the crashing issue. But of course handicapping the processor is not a valid solution. Right now I use a workaround where Turbo Boost is disabled in the firmware (so that Windows can boot), but I can enable it dynamically in Windows using programs like ThrottleStop. Of course, to keep it from crashing, I can lower PL2 to a point where it is stable. But finding that stable spot is hard, as it can fool you at any time, and lowering the number one step at a time is also painful. Yesterday it was fine at 20 watts and above, but today it crashed even at 18 watts. But there are 2 things I noticed which might give a clue about what exactly the problem is and whether there is any hope of fixing it.
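One way to keep the PL2 hunt honest is to log every trial and only trust a limit that sits below every crash seen so far. A minimal sketch, using just the two data points mentioned above (20 W held yesterday, 18 W crashed today); the `Trial` record and the `summarize` helper are hypothetical names for illustration, not anything ThrottleStop provides:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Trial:
    day: str
    pl2_watts: int
    crashed: bool

def summarize(trials):
    """Return (lowest PL2 that crashed, highest surviving PL2 below every crash)."""
    crashed = [t.pl2_watts for t in trials if t.crashed]
    lowest_crash = min(crashed) if crashed else None
    # A surviving limit only counts as safe if it is below every observed crash.
    safe = [
        t.pl2_watts
        for t in trials
        if not t.crashed and (lowest_crash is None or t.pl2_watts < lowest_crash)
    ]
    highest_safe = max(safe) if safe else None
    return lowest_crash, highest_safe

# The two observations from the post.
trials = [
    Trial("yesterday", 20, crashed=False),
    Trial("today", 18, crashed=True),
]
lowest_crash, highest_safe = summarize(trials)
print(lowest_crash, highest_safe)
```

With only these two entries there is no proven-safe limit yet, because the one surviving run was at a higher limit than the one that crashed.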

 

1) I've got a 65 watt adapter right now. I have no idea how it manages the power draw I am about to describe within that power budget. The default PL2 is, I think, 45 watts. That leaves 20 watts for the rest of the system, but it doesn't end there. The iGPU can take extra on top of that from the CPU package power. I ran FurMark and a CPU burner at the same time, and to my astonishment, the highest CPU package power recorded by ThrottleStop was 53 watts! It still doesn't end there. At that power, the package was thermal throttling, so it could have gone even further. I also only ran it for a couple of seconds, as I was tired of it crashing all the time. On top of that, I had the screen brightness set to max and had a mouse, ethernet and earphones plugged in. There is keyboard lighting as well. The motherboard, chipset, 2 RAM modules and the SSD also have to be powered. Even the fan was spinning. How was there enough power to handle all this? Can the adapter deliver more than 65 watts for a short time, or does the battery also help a bit in delivering the power? I have no idea how power management works in laptops.
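To put numbers on that budget, here is a quick sketch of the arithmetic using only the figures above. The `headroom` helper is a hypothetical name, and the calculation ignores conversion losses in the voltage regulators, so the real headroom would be smaller still:

```python
ADAPTER_WATTS = 65        # rated adapter output, from the post
DEFAULT_PL2_WATTS = 45    # default PL2 as recalled in the post (not verified)
PEAK_PACKAGE_WATTS = 53   # highest package power ThrottleStop reported

def headroom(adapter_watts, package_watts):
    """Watts left for everything outside the CPU package."""
    return adapter_watts - package_watts

print(headroom(ADAPTER_WATTS, DEFAULT_PL2_WATTS))   # 20
print(headroom(ADAPTER_WATTS, PEAK_PACKAGE_WATTS))  # 12
```

So at the 53 watt peak, only about 12 watts of the adapter's rating would be left for the screen, RAM, SSD, fan and everything plugged in.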

 

2) I had a suspicion that the VRMs might be the issue, or smaller power delivery components like capacitors on the motherboard. But maybe I was wrong? I stress tested the iGPU with CPU turbo disabled, and the package happily took almost 30 watts without crashing, or at least not crashing during the interval I tested. I will test this more, but I think I waited long enough. So, the problem is not the motherboard, but the CPU not being able to handle that much power?

 

I am going to take it to a repair store (NOT because I can't troubleshoot it myself, but because I am not allowed to do so myself) for better troubleshooting. There is also that dual channel memory from 2 different manufacturers, which still raises a red flag. I will take a closer look at the motherboard.

 

GNU/Linux, being completely superior to Windows, is much more stable. It crashed the one time I tested on Debian (stress testing without power limits), and it didn't crash at all the two times I tested on Arch, once while running KDE Plasma as well. But I think I wasn't using a good stress tester on Arch, and the package power wasn't pinned at 45 watts the whole time.
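To check what package power a Linux stress test actually reaches, the kernel's powercap interface exposes the RAPL energy counter in sysfs. A minimal sketch, assuming the package domain shows up at `/sys/class/powercap/intel-rapl:0` on this machine (the index may differ, and reading the counter may need root); `average_watts` and `sample_package_power` are hypothetical helper names:

```python
import time
from pathlib import Path

# Assumed location of the first RAPL package domain (powercap driver).
RAPL_DIR = Path("/sys/class/powercap/intel-rapl:0")

def average_watts(start_uj, end_uj, seconds, max_range_uj):
    """Average power between two energy counter readings in microjoules."""
    delta = end_uj - start_uj
    if delta < 0:  # the counter wrapped around between the two reads
        delta += max_range_uj
    return delta / 1_000_000 / seconds

def read_uj(name):
    """Read one integer value from the RAPL sysfs directory."""
    return int((RAPL_DIR / name).read_text())

def sample_package_power(interval=1.0):
    """Read the energy counter twice and return average package watts."""
    max_range = read_uj("max_energy_range_uj")
    start = read_uj("energy_uj")
    time.sleep(interval)
    end = read_uj("energy_uj")
    return average_watts(start, end, interval, max_range)

if __name__ == "__main__":
    try:
        print(f"{sample_package_power():.1f} W")
    except OSError as err:  # interface missing or not readable
        print(f"could not read RAPL counters: {err}")
```

Running this in a loop next to the stress tester would show whether the package really sits near 45 watts or drops well below it.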

 

Something else I am curious to try is applying more voltage to the CPU. Because you never know, maybe it is because of the voltages. But sadly, the firmware doesn't allow the changes, or at least the option is disabled in ThrottleStop. Intel XTU is unsupported, but I was trying to find a version that might support it. Does anyone know of other programs that can alter voltages? Maybe on Linux?

 

Please don't tell me there is that one single transistor shifted 0.5 nanometres or that one power line conflicting with a data line 🤦‍♂️.

Microsoft owns my soul.

 

Also, Dell is evil, but HP kinda nice.


11 minutes ago, Gat Pelsinger said:

There is also that dual channel memory from 2 different manufacturers, which still raises a red flag.

i probably would have started there. if the bluescreens are the typical "IRQL not less or equal" style ones then the memory or board being bad is more likely than the cpu.


Just now, emosun said:

i probably would have started there. if the bluescreens are the typical "IRQL not less or equal" style ones then the memory or board being bad is more likely than the cpu.

They are not particularly memory related. I ran memory tests and they all passed. Pretty sure it is either a motherboard power delivery problem or a problem in the CPU.


5 hours ago, Gat Pelsinger said:

I ran memory tests and they all passed.

which means nothing. a memory test just confirms the memory is working during that specific memory test. which means nothing if the machine does something other than a memory test.

case in point, i don't think i've ever seen a computer fail a memory test despite having hardware problems with the memory. i basically never run the test anymore because there's no reason to. it's an incredibly useless test.

taking out memory sticks is a free diagnostic so it's really interesting when i come across people that are 100% against trying it. can't really ever make people do anything so i gotta get over it.


@emosun

 

Ok wait, I have some clues that make me really think it could be the memory.

 

First of all, I am beginning to lose my belief that there is some power delivery problem. I ran Prime95 for quite a few minutes and, surprisingly, it didn't crash. It goes to 45 watts and then starts thermal throttling down to 30 watts. It doesn't crash that easily under Prime95 (but it did crash sometimes when I tried to open FurMark). This doesn't entirely rule out power delivery as the problem, though. It could be that sustained loads do fine but sudden spikes cause the crashes. For example, I opened Edge after completing the Prime95 stress test and it crashed. But still, I think starting a Prime95 test can also be considered a sudden spike, so no idea.

 

But I think the RAM really could be the problem, or to be more specific, the processor not being able to handle the RAM. In HWiNFO, in the memory tab, if I switch between the 2 sets of timings that are given (MC #0, CH#0 and MC #0, CH#1), there are slight differences in some values such as RTL and tWRRD_dg, whatever that means. Another clue that it could be the RAM: in Prime95 it didn't crash at all, because I was running small FFTs, so there is hardly any memory I/O, but it crashed easily when I stress tested the memory controller, and also when I ran the integrated TS Bench in ThrottleStop, which I think also stresses memory. That could also explain why it crashed immediately when opening Edge: memory I/O at high clocks. But that doesn't explain why it doesn't crash when the iGPU is under load, when there has to be massive memory I/O. So this is probably only a CPU memory issue.


I'd suggest removing 1 stick of RAM, then torturing the poor laptop with the Prime95 memory test again. It'll either run or crash, but either way turn it off, swap the RAM sticks around and run it again.


@emosun @AndersWSP

 

Well I did it myself. There are 2 modules and 2 slots, so 4 possible single channel configurations. All of them crashed. There is no way it can be the memory. I took a brief look at the motherboard and I can't see any defect. There has to be some problem in the CPU or the motherboard, sadly. Still some things to try though. I am thinking about removing the battery and only using AC power. I have never done this on newer laptops, so I just need confirmation that it's ok and safe. If that doesn't work, then some deep troubleshooting is required, like removing most of the non-critical components and testing again.
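For anyone who wants to repeat the test, the 4 single channel configurations can be written out as a small matrix. A minimal sketch; the module and slot labels are hypothetical, and the results are just the outcome reported above:

```python
from itertools import product

MODULES = ["module A", "module B"]  # hypothetical labels for the two sticks
SLOTS = ["slot 1", "slot 2"]        # hypothetical labels for the two slots

# Every single channel setup: one module in one slot, the other slot empty.
configs = list(product(MODULES, SLOTS))

# Outcome reported in the post: every configuration crashed.
results = {config: "crashed" for config in configs}

for (module, slot), outcome in results.items():
    print(f"{module} in {slot}: {outcome}")

# With one bad module and/or one bad slot as the only fault,
# at least one of the four configurations would be expected to survive.
any_survivor = any(outcome != "crashed" for outcome in results.values())
print("at least one configuration survived:", any_survivor)
```

Since none of the four survived, a single bad stick or a single bad slot on its own does not explain the crashes.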


2 hours ago, Gat Pelsinger said:

@emosun @AndersWSP

 

Well I did it myself. There are 2 modules and 2 slots, so 4 possible single channel configurations. All of them crashed. There is no way it can be the memory. I took a brief look at the motherboard and I can't see any defect. There has to be some problem in the CPU or the motherboard, sadly. Still some things to try though. I am thinking about removing the battery and only using AC power. I have never done this on newer laptops, so I just need confirmation that it's ok and safe. If that doesn't work, then some deep troubleshooting is required, like removing most of the non-critical components and testing again.

Yep, you're free to use AC only and remove the battery. In fact my mom's old laptop had a blown battery. With it plugged into wall power and charging, the PC was awfully slow. It technically worked, but even moving the mouse around had several seconds of delay. With the battery removed, it ran about as well as it could.


@AndersWSP

 

Well I tried that (removing the battery and only using AC) but I was heavily power limited. Even with turbo enabled, under load it hardly got over 1.5 GHz. So yes, of course it didn't crash, but that is not the solution. Is there anything I could do? If it really could be the battery, then I need to try a different battery.

 

edit - I also tried 2 different HP power adapters, with turbo both enabled and disabled.
