
Long story short, my cousin sister got a refurbished HP Elitebook 840G (Intel 8th gen) for college, and it has been bluescreening from the start. I finally have the laptop here with me so that we can fix it (she lives in a different city; we tried our best to raise complaints and get it repaired there, but we couldn't, so we brought it here).

 

The problem is that it bluescreens randomly. The BSOD code is "WHEA uncorrectable error". A quick Google search says it could be a deep software issue or a hardware issue, which isn't very helpful. One thing I noticed is that the first crash happens when I connect to the internet through Wi-Fi and try to browse the web. After that, it can crash at any time. It never crashes in the BIOS. My sister said she tried a Windows reinstall and it crashed during the installation, so I don't think it is software, but one thing I will try is safe mode.

 

One thing I did see is that it has dual-channel memory, which I absolutely freaking love (average OEM user), but in the BIOS I could see that one stick is Hynix and the other is Samsung, which raises a lot of suspicion. The memory was swapped out before in an attempt at troubleshooting. I haven't been able to check the timings yet, because it crashes so suddenly, before I can install CPU-Z. I did try the built-in BIOS memory and storage diagnostics, and everything seems to be fine though.

 

I did try resetting the BIOS to the defaults, but no success. One thing I am aiming to try is disabling a lot of BIOS options so that there is minimal hardware running, and seeing if it still crashes. I would like to know if this is worth it or not and, if yes, what I should aim to change/disable in the BIOS. Any help will be appreciated.

PLEASE MARK COMMENTS AS SOLUTION IF SATISFIED!!

bigger number better, makes me look cooler.

https://linustechtips.com/topic/1555403-random-bluescreens/

14 minutes ago, Gat Pelsinger said:

One thing I am aiming to try is disabling a lot of BIOS options so that there is minimal hardware running, and seeing if it still crashes. I would like to know if this is worth it or not and, if yes, what I should aim to change/disable in the BIOS. Any help will be appreciated.

Try Linux, see if it still crashes. If it does, then inspect HW further.

https://linustechtips.com/topic/1555403-random-bluescreens/#findComment-16302405

@Biohazard777 I don't think that it is an OS problem. As I said, it crashed when installing Windows. And anyways, Linux will not be installed on the machine, it needs to have Windows.

https://linustechtips.com/topic/1555403-random-bluescreens/#findComment-16302411

Just now, Gat Pelsinger said:

@Biohazard777 I don't think that it is an OS problem. As I said, it crashed when installing Windows. And anyways, Linux will not be installed on the machine, it needs to have Windows.

You are putting a lot of trust in Windows drivers 🤣.
Time needed to prep Ventoy and put a live Linux distro on it < 5 min... 
Anyhow, good luck.

https://linustechtips.com/topic/1555403-random-bluescreens/#findComment-16302413

WHEA means a hardware issue with the CPU or a PCIe device. It's a bit more murky with laptops because the firmware sometimes does some weird stuff, but it's still one of those two reporting an error. It's just not as certain to be a hardware fault as it would be on a desktop built from off-the-shelf parts.

 

If you were able to install Windows go to C:\Windows\Minidump and check if you have any minidump files. If you do, go back to the Windows folder and copy the Minidump folder itself to the Downloads folder (You can use the desktop if you don't have OneDrive syncing files). Zip the copied folder and attach it to a post. Please follow the instructions to the letter as Windows doesn't like you messing with files in this location.
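
The copy-then-zip step above could be scripted roughly like this. It is only a sketch: `copy_and_zip_minidumps` is a made-up helper name, and the paths in the usage note are assumptions based on the default locations mentioned in the post. It copies the folder first and zips only the copy, so nothing inside the Windows folder gets modified.

```python
import shutil
from pathlib import Path


def copy_and_zip_minidumps(source: Path, work_dir: Path) -> Path:
    """Copy the Minidump folder into work_dir and zip the copy (hypothetical helper)."""
    copied = work_dir / "Minidump"
    # Copy first so the original folder under the Windows directory is left alone.
    # copytree raises FileExistsError if an earlier copy is still in work_dir.
    shutil.copytree(source, copied)
    # Creates work_dir/Minidump.zip containing the copied "Minidump" folder.
    archive = shutil.make_archive(
        str(work_dir / "Minidump"), "zip", root_dir=str(work_dir), base_dir="Minidump"
    )
    return Path(archive)
```

On the affected laptop the call would presumably be something like `copy_and_zip_minidumps(Path(r"C:\Windows\Minidump"), Path.home() / "Downloads")`. Reading that folder may need an elevated (administrator) prompt.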

 

If you don't have any dump files (and you were able to install Windows) then the storage is the main suspect if it's an NVMe SSD. We need to see the arguments of the BSOD crash (think of them as sub-errors). If it already hangs on the BSOD screen (which would explain why you can't get dump files) then this step is not necessary, but if it reboots normally after a few seconds then go to this guide and on this screen remove the check mark for "Automatically restart". To restart manually, just use the power button.

 

To make the BSOD screen display the additional info, we need to add a value to the registry. If you are not comfortable editing the registry then do not do this step. Navigate to HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\CrashControl, right click on the empty area in the right section and select New → DWORD value with the name "DisplayParameters". Right click on it, select Modify and set the value data to 1 (it does not matter if you use Hexadecimal or Decimal). It should look like this once done. Reboot to apply the registry change.
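
For anyone who would rather not click through regedit, the same change can be written down as a `.reg` file. The snippet below is a rough sketch that only builds the file text. `build_reg_file` is a made-up helper name; the key path and the `DisplayParameters` value come from the post above, and the same caveat applies: skip it if you are not comfortable changing the registry.

```python
CRASH_CONTROL_KEY = r"HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\CrashControl"


def build_reg_file(key: str, value_name: str, dword: int) -> str:
    """Return the text of a .reg file that sets one DWORD value (hypothetical helper)."""
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{key}]\n"
        # .reg files store DWORDs as eight hexadecimal digits.
        f'"{value_name}"=dword:{dword:08x}\n'
    )


reg_text = build_reg_file(CRASH_CONTROL_KEY, "DisplayParameters", 1)
print(reg_text)
```

Saving the printed text as a `.reg` file and importing it should have the same effect as the manual steps. A reboot is still needed afterwards.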

 

The next time you BSOD, you should have these extra numbers in the top left corner. If Arg 1 (The top line) is 0x0000000000000010 then it's blaming the NVMe SSD. If it's not that then take a picture of the numbers.
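
As a rough illustration of reading Arg 1, here is a tiny lookup. The table is deliberately partial and is not an official list: the `0x10` entry follows the post above, while `0x0` and `0x4` are the meanings as I understand Microsoft's documentation for this bug check. `KNOWN_ARG1` and `describe_arg1` are made-up names.

```python
# Partial lookup for Arg 1 of a WHEA_UNCORRECTABLE_ERROR bug check (assumed, not exhaustive).
KNOWN_ARG1 = {
    0x0: "machine check exception (reported by the CPU)",
    0x4: "uncorrectable PCI Express error",
    0x10: "device driver error (attributed to the NVMe SSD in the post above)",
}


def describe_arg1(arg1_text: str) -> str:
    """Turn the hex string shown on the BSOD screen into a short description."""
    value = int(arg1_text, 16)  # base 16 accepts an optional "0x" prefix
    return KNOWN_ARG1.get(value, "not in this partial table, take a picture of the numbers")


print(describe_arg1("0x0000000000000010"))
```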

 

Note that this is only applicable to WHEA_Uncorrectable_Error BSODs so make sure that's the error before taking a picture. 

https://linustechtips.com/topic/1555403-random-bluescreens/#findComment-16302833

@Bjoolz So I got some fresh memory dumps. For some reason some of them are 0KB but others have data.

 

https://drive.google.com/drive/folders/17mlJBBj-S2eCyIOuz2fna5O0VZ5WXBl0?usp=sharing

https://linustechtips.com/topic/1555403-random-bluescreens/#findComment-16303883

2 hours ago, Gat Pelsinger said:

@Bjoolz So I got some fresh memory dumps. For some reason some of them are 0KB but others have data.

 

https://drive.google.com/drive/folders/17mlJBBj-S2eCyIOuz2fna5O0VZ5WXBl0?usp=sharing

Only four of these were readable as the others were corrupted (You can see the 0b size). All of the ones that weren't corrupted show a hardware issue with the CPU. To be specific, an internal timer error in the CPU. 
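
A quick way to spot the 0-byte dumps before uploading is to list the files by size. This is a minimal sketch with a made-up helper name (`split_dumps_by_size`). Note that a non-empty file is not guaranteed to be readable in a debugger; this only filters out the obviously empty ones.

```python
from pathlib import Path
from typing import List, Tuple


def split_dumps_by_size(folder: Path) -> Tuple[List[str], List[str]]:
    """Return (non_empty, empty) lists of .dmp file names found in folder."""
    non_empty: List[str] = []
    empty: List[str] = []
    for dump in sorted(folder.glob("*.dmp")):
        if dump.stat().st_size == 0:
            empty.append(dump.name)      # 0 bytes: the dump was never written out
        else:
            non_empty.append(dump.name)  # has data, worth opening in a debugger
    return non_empty, empty
```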

 

Like I said before, with laptops you can more commonly get these kinds of errors from BIOS or driver issues so we will try that first. A faulty CPU is the main suspect though. The corrupted dump files are not really common with CPU hardware issues either, but looking at the time stamps it's possibly getting a series of crashes right after one another? The thought being that it might be crashing during the dump file creation. The dump file creation is done during the next boot so if you've recently had an issue with it just constantly BSODing at boot for a while then that could explain it. 

 

Updating the BIOS is not without risk because a crash during a BIOS update can brick the motherboard. You can at least update the Chipset driver as this is just software (Management Engine, Chipset Installation and Dynamic Platform and Thermal Framework). You can evaluate if you want to do the BIOS update depending on if the machine has any kind of warranty with it being refurbished. If it might have any warranty, contact the seller and ask if they want you to try updating the BIOS. 

 

Something you can do before updating the BIOS is cleaning the laptop. You never know how well it's refurbished, if it's just cleaned on the outside or if they cleaned it inside as well. Ideally you would also re-paste the CPU cooler in case it's a thermal issue.

https://linustechtips.com/topic/1555403-random-bluescreens/#findComment-16304020

@Bjoolz So what is the actual variable? One thing to note is that the system never crashes when in the BIOS. It did crash once while Windows was installing. So, probably, it is software? Also, what did you use to look at the dump files, and do all the dumps point to the same problem? Also, about the different memory I mentioned earlier, both of them are running at the exact same variables. A BIOS update might be possible since it does not crash in the BIOS, but it will be the last resort, and I don't think updating it will even do anything. If BIOS is old, system straight no work. And how did you know I might need a chipset driver update? I will try to update it though. I am just trying to give you all the information I have so you have more clues. Will the second method you provided earlier help?

https://linustechtips.com/topic/1555403-random-bluescreens/#findComment-16304041

27 minutes ago, Gat Pelsinger said:

@Bjoolz So what is the actual variable? One thing to note is that the system never crashes when in the BIOS. It did crash once while Windows was installing. So, probably, it is software? Also, what did you use to look at the dump files, and do all the dumps point to the same problem? Also, about the different memory I mentioned earlier, both of them are running at the exact same variables. A BIOS update might be possible since it does not crash in the BIOS, but it will be the last resort, and I don't think updating it will even do anything. If BIOS is old, system straight no work. And how did you know I might need a chipset driver update? I will try to update it though. I am just trying to give you all the information I have so you have more clues. Will the second method you provided earlier help?

If it crashes during Windows install then the CPU is likely toast, but you can try re-pasting the CPU cooler and making sure it's clean inside. All the dump files showed the same issue. I use WinDBG Preview to read the dump files, but debugging WHEA crashes is a bit more advanced as you have to use the Intel Programming Manual to decode the last four bytes of the lower MCi shown in the dump files.
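
To give an idea of what that decoding involves, here is a rough sketch of pulling apart a 64-bit MCi_STATUS value. The bit positions and the "internal timer error" code (`0x0400`) are from my reading of the machine-check chapter of the Intel manual. The sample value is hypothetical and NOT taken from the dumps in this thread, and the code table is deliberately tiny (the manual lists many more, including compound codes).

```python
# Status flag bit positions in IA32_MCi_STATUS (per my reading of the Intel manual).
FLAG_BITS = {
    "VAL": 63,    # register holds valid error information
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # uncorrected error
    "EN": 60,     # error reporting was enabled
    "MISCV": 59,  # MCi_MISC holds extra information
    "ADDRV": 58,  # MCi_ADDR holds an address
    "PCC": 57,    # processor context may be corrupted
}

# Only a few simple MCA error codes; everything else is left undecoded here.
SIMPLE_MCA_CODES = {
    0x0000: "no error",
    0x0001: "unclassified",
    0x0400: "internal timer error",
}


def decode_mci_status(status: int) -> dict:
    """Split an MCi_STATUS value into flags and error codes (illustrative only)."""
    mca_code = status & 0xFFFF  # bits 15:0, the architectural MCA error code
    return {
        "flags": {name: bool((status >> bit) & 1) for name, bit in FLAG_BITS.items()},
        "mca_code": mca_code,
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
        "meaning": SIMPLE_MCA_CODES.get(mca_code, "not in this small table"),
    }


sample = 0xBE00000000800400  # hypothetical value for illustration
print(decode_mci_status(sample))
```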

https://linustechtips.com/topic/1555403-random-bluescreens/#findComment-16304076

Try replacing (or even just removing) the wifi card, assuming it has a modular one. You mention it BSODs when trying to connect to the internet. I'd imagine the Windows installer is similarly trying to initialize your network adapters. And CPU-Z also attempts to connect to the internet on startup to check for updates.

https://linustechtips.com/topic/1555403-random-bluescreens/#findComment-16304425

@QuantumRand I was thinking of doing that, but I think it's all probably just a coincidence. Sometimes I don't get a bluescreen when connected to Wi-Fi. Now, I came up with another theory. Putting the laptop on charging causes a BSOD. Nah, it's all a coincidence.

https://linustechtips.com/topic/1555403-random-bluescreens/#findComment-16304431

6 minutes ago, Gat Pelsinger said:

@QuantumRand I was thinking of doing that, but I think it's all probably just a coincidence. Sometimes I don't get a bluescreen when connected to Wi-Fi. Now, I came up with another theory. Putting the laptop on charging causes a BSOD. Nah, it's all a coincidence.

Since you know it's a hardware issue, you should try to eliminate as many variables as possible. Remove everything you can:

  • Wifi Module
  • RAM (try one stick; if it still BSODs, try the other stick in the other DIMM slot)
  • Battery (only wall power)
  • Pull out the SSD and boot from a Live USB

Once you have it working, add things in one at a time to figure out the culprit. If it still BSODs with everything removed, you know it's an issue that isn't going to be fixed by replacing a faulty part.
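
The strip-down-then-add-back procedure above can be sketched as a small loop. Everything here is hypothetical: `find_culprit` and `simulated_test` are made-up names, and in real life the "is it stable?" check is you running the laptop long enough to trust the result, not a function call.

```python
from typing import Callable, List, Optional


def find_culprit(parts: List[str], is_stable: Callable[[List[str]], bool]) -> Optional[str]:
    """Add parts back one at a time and return the first one that breaks stability."""
    installed: List[str] = []
    if not is_stable(installed):
        # Unstable with everything removed: the fault is in what is left (board, CPU).
        return "base system"
    for part in parts:
        installed.append(part)
        if not is_stable(installed):
            return part
    return None  # every part was added back without a crash


def simulated_test(installed: List[str]) -> bool:
    """Stand-in for a real soak test; pretends the Wifi module is the faulty part."""
    return "wifi module" not in installed


removable = ["second RAM stick", "battery", "SSD", "wifi module"]
print(find_culprit(removable, simulated_test))
```

Because the crashes are random, each "stable" verdict only means something if the machine was given enough time (and load) to crash.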

https://linustechtips.com/topic/1555403-random-bluescreens/#findComment-16304435

@Bjoolz @QuantumRand

 

Can you tell me more about what an internal timer is and why it is important? Is it something, or related to something, that I can disable in the BIOS? I am kind of desperate but look, the system actually runs. The computer actually computes. The CPU does compute. For a while it runs fine, and it also always runs fine in the BIOS (I will keep mentioning this), but one small thing is always crashing it. This one small thing, which is the timer or whatever you are talking about. A computer either worky or not worky, if it worky, then why it not worky (lol)? If it works fine, there is one small thing that stops it. Can I not disable that or do anything to get past the problem?

https://linustechtips.com/topic/1555403-random-bluescreens/#findComment-16305024

5 hours ago, Gat Pelsinger said:

@Bjoolz @QuantumRand

 

Can you tell me more about what an internal timer is and why is that important? Is it something or related to something that I can disable in the BIOS? I am kind of desperate but look, the system actually runs.

Well, a CPU has several internal timers and I don't know which one is having an issue. You have the PIT, which creates an interrupt every X clock cycles and keeps time that way; you have the TSC, which counts the total number of CPU cycles since power on; and probably more that I don't know about. The timer is used to synchronize everything in the PC. For a PC, not having precise and correct synchronization is like the world falling apart. So much in a modern CPU needs a timer to function.

 

5 hours ago, Gat Pelsinger said:

A computer either worky or not worky, if it worky, then why it not worky (lol)? If it works fine, there is one small thing that stops it. Can I not disable that or do anything to get past the problem?

That's not really how it works. You can have faults in the hardware which have a random chance of triggering. Some issues can also be heat related, as the material expands when it heats up, making the fault more likely to trigger.

 

This is not something you can disable, it's an integral part of the CPU. 

https://linustechtips.com/topic/1555403-random-bluescreens/#findComment-16305266

8 hours ago, Gat Pelsinger said:

@Bjoolz @QuantumRand

 

A computer either worky or not worky, if it worky, then why it not worky (lol)? If it works fine, there is one small thing that stops it. Can I not disable that or do anything to get past the problem?

Just because a component isn't working properly doesn't mean it will always cause an error. Take memory, for example. Say you have a whopping 8 bits of memory in your system, but the 8th bit is broken and always stuck on 1. If you try to write [1,1,1,1,1,1,1,1] to that memory, everything will be fine, but if you try to write [1,1,1,1,1,1,1,0], things aren't fine anymore.

 

Now imagine you have a bank of memory that's 16GB, but there's just one bit in there that's stuck. You'll only have a problem if you try to write (and then read) a 0 to that specific bit, but you have 16*8 gigabits to write to, so the odds of hitting that one bit are very low.
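
The stuck-bit example above is easy to play with in a few lines. This is a toy model, not how real DRAM is tested: `write_then_read` and `bits` are made-up names, I'm assuming the "8th bit" is the last one in the list (the lowest bit), and the 16GB figure is treated as 16 × 2^30 bytes.

```python
STUCK_AT_ONE = 0b0000_0001  # assumption: the broken bit is the lowest (last-written) one


def write_then_read(value: int) -> int:
    """Store an 8-bit value in the faulty memory and read it back."""
    return (value & 0xFF) | STUCK_AT_ONE  # the stuck bit always reads as 1


def bits(value: int) -> list:
    """Show a byte as a list of 0/1, highest bit first."""
    return [int(b) for b in format(value, "08b")]


for value in (0b1111_1111, 0b1111_1110):
    read_back = write_then_read(value)
    status = "fine" if read_back == value else "corrupted"
    print(bits(value), "->", bits(read_back), status)

# In the 8-bit toy, half of all possible values need a 0 in the stuck position.
corrupted = sum(1 for v in range(256) if write_then_read(v) != v)
print(corrupted, "of 256 possible values come back wrong")

# With 16GB there is still only one bad bit, among a very large number of good ones.
total_bits = 16 * 8 * 2**30
print(f"one bad bit out of {total_bits:,}")
```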

 

Now this isn't a realistic example. Memory is actually somewhat analog in that it has a voltage cutoff that it considers a 1. If a bank of memory isn't regulating that voltage well enough, sometimes it'll be fine, sometimes it won't, depending on how much that voltage is wandering at the time.
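
The wandering-voltage idea can also be sketched as a toy simulation. All the numbers are invented for illustration (a stored 1 sits at 1.0, the cutoff is 0.5, the noise is Gaussian) and `misread_rate` is a made-up name; real memory cells are far more complicated than this.

```python
import random


def misread_rate(noise_sigma: float, trials: int = 10_000, seed: int = 0) -> float:
    """Fraction of reads where a stored 1 dips below the cutoff and reads as 0."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same numbers
    nominal_one = 1.0          # hypothetical voltage of a stored 1 (arbitrary units)
    threshold = 0.5            # hypothetical cutoff: below this the cell reads as 0
    misreads = sum(
        1 for _ in range(trials) if rng.gauss(nominal_one, noise_sigma) < threshold
    )
    return misreads / trials


for sigma in (0.0, 0.1, 0.3):
    print(f"noise sigma {sigma}: {misread_rate(sigma):.2%} of reads wrong")
```

With no noise every read is correct, and the error rate climbs as the voltage wanders more, which is the "sometimes fine, sometimes not" behaviour described above.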

 

This is just an example with memory. There's also things like timings as @Bjoolz mentions. These rely on traces to communicate synchronization signals and all sorts of other complicated things.

 

A single component in your system could cause an error by being out of sync, or maybe it has a bad memory bank in its local memory, or maybe it has a bad capacitor that causes too much voltage variation, or maybe there's too much resistance on a trace that causes sync signals to be missed sometimes.

 

That's why it's often most effective to reduce a system down to the bare minimum components it needs to run. If it still isn't stable, then you know it's one of the few major components left causing the issue and, in the case of a laptop, it isn't going to be simple to repair.

https://linustechtips.com/topic/1555403-random-bluescreens/#findComment-16305538

@QuantumRand @Bjoolz

 

Good news! I basically fixed the system by sacrificing turbo boost. After playing with the BIOS settings for a while, it looks like the BSOD occurs when the CPU boosts. I can keep the actual turbo boost setting on in the BIOS, but it doesn't take effect unless the runtime power management feature is also enabled, and enabling that causes the BSOD. If I enable turbo boost but keep runtime power management disabled, the system will not crash, but the CPU does not boost and stays locked to 1.69 (nice?) GHz. If I force-enable turbo boost using Throttlestop, the system does crash. So, it's not really even software. It's that the CPU cannot turbo or else it will crash, which does suck. The system works but I kinda really want the CPU to boost, you know. Do you guys know any possible fix?

https://linustechtips.com/topic/1555403-random-bluescreens/#findComment-16309473