Help please, trying to diagnose problem

Hi

 

Yesterday all three of my screens switched to different patterns, ranging from white to gray, while I had EVE Online in windowed mode on one and normal Windows applications on the other two. The reset button on my PC did not work, so I powered it off and re-seated my graphics card. Everything was fine for an hour or so until it happened again. This time I unplugged the graphics card and the system still would not start. With all RAM removed, the board would stop on the missing-RAM error code first. With RAM installed (in varying configurations) and nothing else, it got stuck on an error code in the "CPU DXE Initialization" range. I'm not sure what changed, but when I moved it to a test bench (which involved unplugging all my USB devices too) it booted. Googling this error pointed to BIOS issues, but I have not updated mine in months and both copies match (dual-BIOS motherboard).

 

Everything is working fine now, but I don't know whether a graphics card failure can temporarily affect a motherboard.

Is there any diagnostic software I can run to try to trigger this again or detect errors?

 

Thanks in advance

 

Specs:

3 * 23" Screens

PowerColor Radeon 280X

i5 2500k

Gigabyte GA-Z77X-UD3H

4x4GB Corsair DDR3 (16GB RAM)

Seasonic X650 PSU


If everything is working now, I would check whether there is a new BIOS version for your motherboard; there could have been a bug in your BIOS version that was triggered by whatever you were doing. Just because everything worked before doesn't mean there isn't a BIOS issue; a bug in a particular BIOS version won't necessarily affect everyone at the same time, or at all. Since you have DualBIOS, update one of them to the latest version (assuming you are not on it already), then reset the BIOS settings to default and boot into Windows. I would suggest using just one display and running various benchmarks/GPU tests like AIDA64, RealBench, Heaven, Valley and so forth. If everything is stable, connect the other displays and repeat the process. If everything is fine again and you had your CPU/GPU overclocked before, overclock again from scratch instead of loading profiles. If there are still no issues, continue using your PC normally.

Hi

 

Thanks for the response. The latest stable BIOS is 4 years old, but I re-flashed it anyway. I ran the AIDA64 stability test for an hour with no issues and the RealBench benchmark with no issues, then ran the RealBench stress test: the first run passed 15 minutes at 8GB RAM (I have 16GB), the second run failed at 13 minutes, and the third failed at 26 minutes. No bluescreens, just "System Instability Detected", with the last line in the Blender console being "Blender:BLF_lang_init: 'locale' data path for translations not found, continuing". Temperatures were fine and I'm not overclocking.

 

Heaven and Valley ran fine.

 

So really it's just the RealBench stress test that fails.

 

 

Interesting. The thing about RealBench is that it tests everything: not just CPU/RAM/GPU but a lot of subsystems too, like bus speed, power supply, motherboard and so forth. You mentioned you had a test bench. Do you mean a separate set of hardware (motherboard, power supply and so on), or a physical test bench you installed all your current components on? If it is the former and you are interested in figuring out which component is causing the instability, I would suggest testing each component individually on the test bench.

 

If you don't have spare components, I would start by removing the GPU and all but one stick of RAM, then set the stress test to 4GB and 30 minutes. At the end of each test, add a stick of RAM, assuming the test ended without issue. If an issue arises on the first stick, try a different stick; if the same issue appears again, there is a good chance the problem is coming from the motherboard or PSU. If you make it up to 4 sticks without an issue, test all 4 at 16GB for 1 hour (2 would be ideal, but we can try that later). If you pass 1 hour without issues, add the GPU and test everything again. If there are no issues, try 3 hours, and if that passes too, do one final test overnight for 8 hours. I know this sounds tedious, but if you are interested in knowing why the issue is happening, this is one of the few ways to go about it. I would also personally run these tests on a fresh install of Windows, since bad drivers or file corruption on the storage drive can also cause issues.
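If it helps to keep track, the sequence above can be written out as a checklist. This is only a rough sketch in Python, not anything RealBench provides: the names `Step` and `build_plan` are made up for illustration, and it assumes the stress test stays at 4GB while sticks are being added one at a time, which the steps above do not spell out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    sticks: int    # RAM sticks installed for this run
    gpu: bool      # is the discrete GPU installed?
    test_gb: int   # memory size to select in the stress test
    minutes: int   # how long to run the stress test


def build_plan(total_sticks: int = 4, gb_per_stick: int = 4) -> List[Step]:
    """Hypothetical helper: list the runs in the order described above."""
    plan = []
    # Phase 1: no GPU, add one stick per run, 30 minutes each.
    # Assumption: the test size stays at one stick's worth (4GB).
    for sticks in range(1, total_sticks + 1):
        plan.append(Step(sticks, False, gb_per_stick, 30))
    full_gb = total_sticks * gb_per_stick
    # Phase 2: all sticks at full memory for 1 hour, still without the GPU.
    plan.append(Step(total_sticks, False, full_gb, 60))
    # Phase 3: add the GPU, then run for 1 hour, 3 hours and 8 hours.
    for minutes in (60, 180, 480):
        plan.append(Step(total_sticks, True, full_gb, minutes))
    return plan


if __name__ == "__main__":
    for number, step in enumerate(build_plan(), start=1):
        gpu = "with GPU" if step.gpu else "no GPU"
        print(f"{number}. {step.sticks} stick(s), {gpu}, "
              f"{step.test_gb}GB for {step.minutes} min")
```

Stop at the first run that fails and note which stick or component was added last; that is the part to swap out and retest.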

By test bench I just meant an actual physical environment where I can work properly on my PC. I have a separate power supply and grabbed a graphics card from work today.

 

What you said about RealBench explains why it's the only test detecting instabilities, especially paired with the "CPU DXE Initialization" error code I was getting when the PC was not booting, since from what Google tells me that's the stage where system components are initialized.

 

I really just need to rule out my graphics card, as it's either replace the CPU/motherboard/RAM (parts availability sucks) or replace the graphics card. Will update the thread with progress.


So I did more reading on the error code, and one or two people had issues where too much USB power draw affected system startup. So when I unplugged everything to move my computer, it must not have been a fluke that it started working.

 

Tested with RealBench for 1 hour each with onboard graphics, a spare graphics card and my 280X; all passed, and everything is working fine now.

 

No idea what it was that was messing with the test originally.


Double posting because my previous post was incorrect.

Eventually my PC would not boot at all with the 280X plugged in, and neither would other computers; the graphics card had failed.

Am now running a Radeon 390 quite happily.
