So, over the past 4 months or so my system has been crashing randomly. This can happen in a game, while watching videos, or just idling at the desktop. I have been working endlessly to figure this out and have found nothing that works. The system was rock solid for around 5 months before this. What usually happens is the computer will be running perfectly (temps all under 60C) and then boom, both monitors go black and the peripherals lose power. The case fans keep spinning, but it sounds like they momentarily lose power and then rev back up. Half the time the system will boot back up automatically; the other half I have to reset the power supply in order to boot back up. I have tried 2 other power supplies (one older and one brand new), reinstalled Windows twice, run chkdsk multiple times, run memtest multiple times, tried a different GPU, and sent the mobo to Asus, who said they couldn't replicate the issue, all to no avail. My motherboard (Asus B350-F Strix) has a line of LEDs, each labeled with a different component. I'm assuming these are error LEDs. The green one, which supposedly has something to do with boot, is always lit up even though chkdsk finds no errors. I did end up ordering a new SSD to see if that fixes it, but I doubt it. Anyone have any idea? This is really driving me crazy here. (There are a few instances where the system ran an automatic chkdsk after one of these events, if that helps.)

I added a snip of my most recent crash; the highlighted event happened at almost exactly the time of the crash. The second one is from a month or 2 back, when I thought Kernel Power was the issue.

TLDR: Random system crashes, no error code. Scans of hardware are inconclusive.

Thanks for any help!

 

Specs:

GPU: XFX RX 480 4GB

CPU: Ryzen 5 1500X

MOBO: Asus B350-F Strix

Drives: 2TB Barracuda HDD and 120GB Kingston SSD

PSU: EVGA 550W 80+ Gold modular PSU

RAM: 16GB Team Group DDR4

Captudafdasfdsafare.PNG

Capture.PNG

Link to comment
https://linustechtips.com/topic/969131-system-crashing-at-random-intervals/

I REALLY wanna say RAM. Drop the clock speed a teeny tiny amount. Other than that, check that the cooler mounting isn't too tight (prolly not this). Is the electricity in your house stable? Are the motherboard grounds/standoffs/screws all in place? Is the outlet you are using ACTUALLY grounded?

On 9/6/2018 at 12:15 PM, itisme911 said:

I REALLY wanna say RAM. Drop the clock speed a teeny tiny amount. Other than that, check that the cooler mounting isn't too tight (prolly not this). Is the electricity in your house stable? Are the motherboard grounds/standoffs/screws all in place? Is the outlet you are using ACTUALLY grounded?

I'll try dropping the clock speeds. I never know how tight the cooler should be, but it's pretty tight, so I might also try backing the screws off a bit (it's just the standard Wraith cooler). And honestly, I have no idea about my house's electricity, as the wiring is a bit... odd... (I'll be moving to my dorm soon, so I guess that will make it obvious if it is my house's wiring), but most of my system is connected to a power strip and the monitors don't seem to lose power.

Thanks!

On 9/6/2018 at 12:15 PM, itisme911 said:

I REALLY wanna say RAM. Drop the clock speed a teeny tiny amount. Other than that, check that the cooler mounting isn't too tight (prolly not this). Is the electricity in your house stable? Are the motherboard grounds/standoffs/screws all in place? Is the outlet you are using ACTUALLY grounded?

Alright, so I dropped the clock speed from 2133 to 1866 and it still crashed... any other ideas? I'm not sure what to do anymore, short of a completely new build, which I can't afford lmao

On 9/9/2018 at 5:08 PM, BhamBeast said:

Alright, so I dropped the clock speed from 2133 to 1866 and it still crashed... any other ideas? I'm not sure what to do anymore, short of a completely new build, which I can't afford lmao

It has to be some bad hardware or drivers; it could even be Windows. Try rolling back the drivers and Windows to earlier versions if you can.

Hi, I do not mean to hijack your thread, but I have been experiencing exactly the same issue. I have a Gigabyte uda5 990fx motherboard with a Cooler Master 850W PSU and an AMD Phenom X II processor.

I started experiencing boot issues out of the blue about six months ago. I had all the same symptoms and tried all the troubleshooting steps you mentioned earlier, and more, but could not get it to work properly. The only thing I did not test was swapping the CPU. After multiple attempts the system stopped booting altogether: it would never boot up and just kept trying to restart. Going by the error codes, it got stuck at 0202 and restarted. So I ended up desoldering the BIOS chip and reflashing it. After about 2 to 3 months of troubleshooting, I soldered the original BIOS chip back on and it started working.

It was working perfectly fine until yesterday (just as I was finishing recovering data from an HDD), when it started doing the same thing again. It works for some time (a matter of minutes), then the screen goes black and it restarts automatically. Before restarting, the system keeps working for a few seconds (I can hear the audio and I can use the keyboard to shut down or restart). I just cannot get any consistency in order to narrow down the problem. This happens in Windows, in Ubuntu, and sometimes on the boot screen as well.

The restarting ritual is new, though, and since it currently boots up after a restart, I am afraid it will brick my PC again. At this stage, I am thinking that either the PSU or the motherboard runs into some hardware fault and the BIOS restarts the PC. Or could it be that the CPU is faulty?

Well, here's what I know. So far I have tried 3 different power supplies, none of which fixed it, so I'm pretty sure it's not that. The ONLY component I haven't rigorously tested (or swapped out) is the CPU, simply because I have no idea how to test it and can't afford another one. The only CPU test I have run was Prime95, which didn't seem to accelerate the crashing, so that didn't get me anywhere. Mine also crashes across operating systems, but mine will usually stay on for multiple hours before crashing, so that's a bit different. I'm still stuck between CPU and motherboard because I don't really trust Asus RMA. I'll update you if I make any progress.

On 9/10/2018 at 8:43 PM, itisme911 said:

It has to be some bad hardware or drivers; it could even be Windows. Try rolling back the drivers and Windows to earlier versions if you can.

Would it still be Windows even if I did a completely clean install on a separate drive? And how could I eliminate options here, because it could still be literally anything? Is it possible to at least differentiate between software and hardware? Thanks, fam.

39 minutes ago, BhamBeast said:

Well, here's what I know. So far I have tried 3 different power supplies, none of which fixed it, so I'm pretty sure it's not that. The ONLY component I haven't rigorously tested (or swapped out) is the CPU, simply because I have no idea how to test it and can't afford another one. The only CPU test I have run was Prime95, which didn't seem to accelerate the crashing, so that didn't get me anywhere. Mine also crashes across operating systems, but mine will usually stay on for multiple hours before crashing, so that's a bit different. I'm still stuck between CPU and motherboard because I don't really trust Asus RMA. I'll update you if I make any progress.

OK, I remember once running Hot CPU Tester ( http://www.7byte.com/index.php?page=hotcpu ) and its diagnostic test. It took a few hours to complete, after which my computer ran for a month without a hitch. Today I tried running the diagnostic tool again (this time the CPU burner test, which puts full load on the CPU and not on the RAM or motherboard) and the system crashed in two minutes. My PSU and other peripherals are working fine. I have tested them extensively.

Next, I capped the CPU clock to 2000MHz (x10 ratio) instead of the max 3500MHz. It worked for 10-15 min and then the screen went blank again. However, it still shows as online in TeamViewer but does not give me remote access. Basically, no response.

This still does not clarify whether it is the CPU or the motherboard.

I am trying to find someone with an AM3+ socket motherboard so that I can test the CPU on a different board and the other way around. This is a big issue in itself, as I have not been able to find anyone with an AMD machine yet.

The two deductions I am leaning towards are:

1. Again, the CPU has some issues.

2. There is some cold solder or component damage on the motherboard which gives way. In which case, it is very difficult to pinpoint.

FYI: I looked at my event log and I am getting the same pattern of errors: first the Kernel Power, then the Distributed COM error at the very end. Maybe we can get an idea by analysing this.
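
One rough way to analyse that pattern is to list what else was logged around each Kernel Power entry. The sketch below is only an illustration: the rows are made-up sample data typed in by hand, not taken from the screenshots in this thread, and the event IDs (41 for Kernel-Power, 10016 for DistributedCOM) and the function name `events_around_crashes` are assumptions. Kernel-Power 41 is normally written on the next boot after an unclean shutdown, so it says that a crash happened rather than why.

```python
from datetime import datetime, timedelta

# Hypothetical rows copied by hand from Event Viewer: (timestamp, source, event ID).
# These values are invented for illustration only.
EVENTS = [
    (datetime(2018, 9, 12, 20, 14, 50), "DistributedCOM", 10016),
    (datetime(2018, 9, 12, 20, 15, 2), "Kernel-Power", 41),
    (datetime(2018, 9, 12, 22, 40, 0), "Service Control Manager", 7036),
    (datetime(2018, 9, 13, 9, 30, 10), "Kernel-Power", 41),
]


def is_crash_marker(event):
    """True for a Kernel-Power 41 entry (the assumed crash marker)."""
    return event[1] == "Kernel-Power" and event[2] == 41


def events_around_crashes(events, window=timedelta(minutes=2)):
    """For each crash marker, collect the other events logged within `window` of it."""
    ordered = sorted(events, key=lambda e: e[0])
    report = []
    for crash in ordered:
        if not is_crash_marker(crash):
            continue
        nearby = [
            e for e in ordered
            if not is_crash_marker(e) and abs(e[0] - crash[0]) <= window
        ]
        report.append((crash[0], nearby))
    return report


for crash_time, nearby in events_around_crashes(EVENTS):
    neighbours = [(source, event_id) for _, source, event_id in nearby]
    print(crash_time, "logged near:", neighbours or "nothing else")
```

If nothing else shows up near most of the crash markers, that would at least be consistent with an abrupt loss of power rather than a software fault, though it does not prove it.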

 

39 minutes ago, BhamBeast said:

Would it still be Windows even if I did a completely clean install on a separate drive? And how could I eliminate options here, because it could still be literally anything? Is it possible to at least differentiate between software and hardware? Thanks, fam.

To me, I am sure it is the hardware, because my system has this problem even without any hard disk attached.

13 hours ago, Massa_f430 said:

The two deductions I am leaning towards are:

1. Again, the CPU has some issues.

2. There is some cold solder or component damage on the motherboard which gives way. In which case, it is very difficult to pinpoint.

OK, I also saw your later post; it is probably not Windows. You seem to be on track here. One way to test is to try a beefier cooler on the CPU. To test the motherboard, take off the side panel and point a big-ass fan, or a bunch of case fans, directly at the VRM.

10 hours ago, itisme911 said:

OK, I also saw your later post; it is probably not Windows. You seem to be on track here. One way to test is to try a beefier cooler on the CPU. To test the motherboard, take off the side panel and point a big-ass fan, or a bunch of case fans, directly at the VRM.

Tried all that. No solution. The Northbridge was getting much hotter than I think it should, but cooling it down does not help either.

@BhamBeast: I tried running with fewer cores. Still the same issue.

Looking back at when the problem started:

1. I'm 90% sure it is the motherboard, and here's why. Just before this started happening, my UPS had gone berserk, which prompted me to think it was the PSU. Evidently, that was not the case. At that time, when I was trying out different components, I saw the Northbridge short out because of a mosquito. Then came all the drama with the BIOS and the rest of the story as mentioned above.

My system is about 7-8 years old, so I guess this is it. Although, now I will try and hot gun all the components in my free time to see if I can get rid of any cold solder. But your system sounds newer, so I recommend finding a friend and testing your CPU and motherboard out before purchasing anything.

2. 5% it is the CPU. My CPU could have worn out. Again, this is highly unlikely, as I still have a 15-year-old Intel Celeron (and about four more PCs/laptops) which is working fine. The only thing that could have damaged it is the UPS or the mosquito. But this also has a low probability, as most of the components are high grade and have redundancies built in for overvoltage and short circuits. So, if the other components are fine, there is no reason for the CPU to suddenly act like this after running fine for years.

3. The remaining probability is that there is a bug in the BIOS. Can the BIOS suffer from a virus? If it can, only Windows would have been able to alter it, as I am still running my old Windows install. I have reflashed the BIOS but never reinstalled Windows. So if there is a BIOS-editing virus in it, it could make the system unstable. However, I have maintained my PC thoroughly, even removing registry entries and files manually. But still, there are a lot of things I could have missed or messed up.

In this case I would suggest reinstalling everything with the proper method and order. I am sure you have already tried resetting the CMOS and loading the default BIOS settings.

All the best... I will keep you posted if I have any developments.

Cheers!!

 

18 hours ago, Massa_f430 said:

I will try and hot gun all the components in my free time to see if I can get rid of any cold solder. But your system sounds newer, so I recommend finding a friend and testing your CPU and motherboard out before purchasing anything.

Hot gunning and oven heating to reflow solder is at best a temporary solution, and it can ruin good components if you don't know which one is the problem. Louis Rossmann and Linus even have a video explaining this.

@itisme911 You are absolutely correct. Thanks for the reminder; I have bricked my iPad 2 doing this sort of thing in the past.

Firstly, I would like to state that I am not a professional and am not giving professional advice. I am just an enthusiast who cannot get professional help, either because no one would spend the time on my system or because I just don't have the money to get it repaired. It is not worth it, as my system is just too old. Having said that, do see if getting professional help makes sense for you before trying any sort of repair on your own.

Secondly, if you plan on going ahead on your own: when I said to try hot gunning the components, I meant after pinpointing the problem location using sound judgement, as you know your own machine best, along with the sequence of mods/problems the system has encountered. Doing it randomly would only damage your device more, as stated by itisme911.

Coming back to the problem, I have something in mind particularly for my motherboard. I have noticed that the heat sink on my Northbridge is not nearly as hot as the bottom of the board. I removed the thermal pads and put on thermal grease a long time ago. After removing the heat sink, I can see that not all the chips are properly in contact with the heat sink. COULD THIS BE THE ISSUE? I have ordered thermal pads and will try to check if it works again.

Will post an update either way.

Guys, I have the same problem: crashes out of the blue, a black screen, and then the fans kick on at full blast. The only thing I have found that stops it is removing my NVIDIA GTX 770 video card and using the onboard motherboard video instead; then things work fine with no more problems. I haven't tried a new video card yet, but it never fails: every time I put the card back in the PC, the problems come back. I heard it might be something with the PSU, but I haven't swapped it out yet. I read earlier that you already swapped yours out and it still continued, though. I have even gone as far as taking my video card apart, cleaning it, and applying new thermal paste. I would test the card in another machine, but I don't have access to one.

21 hours ago, Massa_f430 said:

contact with the heat sink. COULD THIS BE THE ISSUE? I have ordered thermal pads and will try to check if it works again.

Will post an update either way.

This is probably at least contributing to instability.

On 9/17/2018 at 2:04 PM, Khazgul said:

Guys, I have the same problem: crashes out of the blue, a black screen, and then the fans kick on at full blast. The only thing I have found that stops it is removing my NVIDIA GTX 770 video card and using the onboard motherboard video instead; then things work fine with no more problems. I haven't tried a new video card yet, but it never fails: every time I put the card back in the PC, the problems come back. I heard it might be something with the PSU, but I haven't swapped it out yet. I read earlier that you already swapped yours out and it still continued, though. I have even gone as far as taking my video card apart, cleaning it, and applying new thermal paste. I would test the card in another machine, but I don't have access to one.

Very high probability it is the GPU. One of mine is also dead now. First things first: try a different GPU, and try yours in a different PC. If that doesn't give you an answer, then you can think of other things.

@BhamBeast, did you get a chance to troubleshoot some more? Try checking the temperatures under the board manually (the sensors do not reflect any localized effect). I feel the temperature under a couple of the MOS chips is way higher than under the other ones in the VRM circuit. Now I am 99% sure that the VRM circuit is giving up after some time, which would also explain the Kernel Power error event logged in Windows.

Let me know what you think.
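
If you would rather put numbers on the manual check than go by feel, a minimal sketch is to compare each spot reading against the median of all of them. Everything here is an assumption for illustration: the readings are invented, the labels Q1 to Q6 are hypothetical positions, the 15 degree margin is arbitrary, and it presumes you can take spot readings at all (for example with an IR thermometer on the back of the board).

```python
from statistics import median

# Hypothetical spot readings in degrees C, one per VRM MOSFET position.
# Labels and values are invented for illustration only.
READINGS = {"Q1": 58.0, "Q2": 61.0, "Q3": 60.0, "Q4": 92.0, "Q5": 59.0, "Q6": 88.0}


def hot_spots(readings, margin=15.0):
    """Return the labels whose reading exceeds the median reading by more than `margin`."""
    typical = median(readings.values())
    return sorted(label for label, temp in readings.items() if temp - typical > margin)


print(hot_spots(READINGS))
```

The median is used instead of the average so that one or two very hot parts do not drag the baseline up with them.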

  • 2 weeks later...

I have an update on the situation. See the attached picture. I removed a few MOSFETs that looked like they were bulging and tried running the computer again. It works!! I had to disable three of the six CPU cores, but I did not get any restarts for a couple of days straight.

So, in my case I can finally narrow it down to the VRM circuit. I may have removed a few more MOSFETs than required, which I will try to put back without damaging the board. (I have bought a cheap backup board just in case.)

To summarize the diagnosis for this kind of random restart: if you have tested all the removable components of the board, then the problem is most likely in the VRM of the motherboard, as it was in my case.

20180925_221445.jpg
