
Computer Shuts Down, but only in game.

Wasn't sure what section to put this in, but here are my specs (anything in brackets is something I replaced):

 

HP Z800 workstation board

Xeon X5680 [2x Xeon E5520's]

GTX 970 (tried my GTX 760 as well)

CoolerMaster 212 Evo [2x Coolermaster 212 Evo]

40GB Samsung PC3-10600R- HP and Cisco Certified [16GB Hynix PC3-10600R - HP certified]

ADATA SP550 480GB SSD (on PCI RAID controller)

2x2TB Hitachi 7200rpm drives (in mirror)

500GB Seagate Barracuda (7200rpm)

5x140mm case fans

600w ThermalTake Dual 12v rail PSU [650w ThermalTake Single 12v rail]

Corsair K95 RGB

3xHPw19b (In game resolution: 4320x900)

 

So as you can see from my specs, I have been systematically replacing things, hoping that at some point I guess right. I have quite a bit of troubleshooting experience, I must say, and I've never run into a gremlin this confusing. I'll start from the top here...

 

I bought a Z800 motherboard on eBay, threw two Xeons on it, drilled my own standoffs into an HPTX case, and made a custom 24-pin to 18-pin motherboard power adapter (I have triple-checked my pinouts, this is not the issue) and a 10-pin memory power connector (ditto). Then I grabbed the PSU (that had always worked) and SSD out of my old rig, threw it all in, and booted up. It worked fine for months.

However, I ran into issues with my SSD. It turns out that HP only "officially" supports "certain" SATA III drives to boot from, and I was experiencing blue screens while I was away from the machine, but never while I was using it. I bought a PCI SATA III RAID controller and it seemed to fix the issue.

But then I started experiencing shutdowns every once in a while, completely at random. Any time this happened, I could check around in the BIOS logs, Event Viewer, anywhere... nothing. All it says is something to the effect of 'the previous shutdown was unexpected'. I got out a voltmeter and checked the voltages coming off my PSU, and my 12V rail was sitting at 8.5 volts. (How????? XD) So I took the 600W PSU out of my old rig, and the voltages remained around 11.7V. I continued to experience the shutdowns, and the 650W PSU works just fine in my old rig (ASUS P5LD2, Xeon X5470 (4.2GHz), 4x2GB PC2-8500, GTX 760, ADATA SP550 240GB), which consumes more power than my Z800 rig.

I then thought perhaps it was an issue with my dual CPU configuration, so I took out my second E5520. It ran fine for a week on one CPU, and in the meantime I had ordered my X5680 thinking I had fixed the issue (the X5680 scores better than both E5520s combined, even ignoring single-thread advantages). I happily put my X5680 in, and I was fine for a few more days, and boom, shutdown (s p o o k y).

I checked in my BIOS and looked at the memory self test on my 16GB kit (I was using slot 3 in each bank, then switched to 3 and 5 when I went down to 1 CPU, since slot 1 has never worked). It seemed fine, but as a troubleshooting step I opened up my dad's HP server (DL something with 2xE5504s and 144GB of RAM) and filled slots 2, 3, 4, 5, and 6. Suddenly slot 2 was erroring out as well. I lightly sanded the connector on the memory and reseated it, no luck. I am currently running off 3, 4, 5, and 6.

Some of my friends invited me to play Minecraft and I gave it a shot (I have Optifine maxed out, and KUDA cinematic shaders) and never experienced a shutdown. I then started playing BF4 and it shut down immediately. The only difference I could think of is that Minecraft is OpenGL and all my other games are some flavor of DirectX. But this whole time, my computer has never shut down during rendering in Premiere Pro, DirectX benchmarks in PassMark, etc.

I somewhat feel like I have it narrowed down to the motherboard at this point, and thankfully I am graced with enough spare parts that I can troubleshoot like this without spending more money (XD), but I have been hearing a lot of other people talk about how terrible ThermalTake is. I know I've sent them multiple emails a day for the last couple of weeks and haven't gotten so much as a word back... when I try to call them it says either their reps are busy or it's after hours, at noon? Some people talk about PSUs just like mine dying within the first week, catching fire, etc.

 

So all this in mind, what should I try next? Motherboard, power supply, or something else?

 

EDIT: I would like to add that I put my specs into multiple PSU calculators and only ever saw 450-500W required (>32A on 12V, if I recall).

 

IMG_6201.JPG

Edited by KRKATANAKID
Added detail.

You're not a man unless you lost your virginity to a 2x4.


Just now, Gravemind said:

XqBz9Hj.png

Yessir


Starting point: what are the CPU and GPU temperatures in all these scenarios?

 

After discarding thermal shutdown: what is the capacity on the 12v rail(s) of your PSU(s)?


12 minutes ago, KRKATANAKID said:

 

600w ThermalTake Dual 12v rail PSU [650w ThermalTake Single 12v rail]

There's your problem: PSU calculators are inaccurate, and 450-500W is close to the maximum output of your power supply.

If this only occurs during gaming, then power is your likely issue; either that or temperatures.

 


1 minute ago, SpaceGhostC2C said:

Starting point: what are the CPU and GPU temperatures in all these scenarios?

 

After discarding thermal shutdown: what is the capacity on the 12v rail(s) of your PSU(s)?

Right, forgot to mention. My CPU is at ~37C and GPU at 65C under load. The 650W is rated for 52A on the 12V rail, and the 600W is I believe 20A and 24A? It's in that ballpark.


1 minute ago, rattacko123 said:

There's your problem, PSU calculators are inaccurate and 450-500W is close to the maximum output of your power supply.

If this only occurs during gaming, then power is your likely issue, either that or temperatures

 

If that were the case, wouldn't I be able to replicate it in benchmarks? I ran PassMark, and I also ran Prime95 on half my cores while running Furmark at the same time.


Do you guys think it is the PSU, and that it stabilized on one E5520 but on no other configuration just because of lower power consumption? I am running off the 650W right now and it's reading good voltage. In game it never goes low... Could a PSU calculator really be off by over 20A at 12V? That's 240W...
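For reference, the headroom arithmetic in that question can be written out as a small sketch. It takes the figures quoted in this thread at face value (52A on the 650W unit's 12V rail, roughly 32A from the calculators); the function names are made up for illustration, and the result says nothing about what the PSU actually delivers under load.

```python
def rail_watts(amps, volts=12.0):
    """Power in watts for a given current on a rail (P = I * V)."""
    return amps * volts


def headroom_amps(rail_capacity_a, estimated_draw_a):
    """Spare current left on the rail after the estimated draw."""
    return rail_capacity_a - estimated_draw_a


# Figures as quoted in the thread, not measured values.
capacity_a = 52.0   # 12V rating recalled for the 650W unit
estimate_a = 32.0   # 12V draw recalled from the PSU calculators

spare_a = headroom_amps(capacity_a, estimate_a)
print(spare_a, rail_watts(spare_a))  # 20.0 240.0
```

So the calculators would have to underestimate the 12V draw by about 240W before the rated capacity of that rail is exceeded, which is the gap being asked about.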


2 minutes ago, KRKATANAKID said:

RIght- forgot to mention. My CPU is at ~37C and GPU at 65C under load. 650w is a 52A 12 and the 600w is I believe 20 and 24A? Its that ballpark.

OK, so the theoretical capacity is fine, and I have no information on Thermaltake PSUs, but if you did measure voltages that were too low, it could be the PSU failing to deliver the right voltages under heavy load (note that this would mean it is failing at loads it is supposedly able to handle).

 

3 minutes ago, KRKATANAKID said:

If that were the case wouldn't I be able to replicate in benchmarks? I ran Passmark and I also ran Prime95 on half my cores while running Furmark at the same time.

Well, it depends on how much of a simultaneous load on all 12v components you apply during benchmarks, as that's what games do. Although games will hardly stress all cores the way Prime95 does. You ran it on only half the cores, though: do you need more than one core free to run GPU benchmarks?

 

3 minutes ago, KRKATANAKID said:

Do you guys think it is PSU and it stabilized out on one E5520 but with no other configuration just because of lower power consumption?

It could be, although I don't think we are in a position to conclude yet. Can you measure voltages (however inaccurately) with some software and keep a log of them? If so, I would run that software when playing a game and see if something happens there.

 

It could also be the GPU, given that Minecraft works, but the fact that you tried two, and that you ran GPU benchmarks without issues, says otherwise. But maybe it's a PSU failing to deliver the right amount of power or stable enough voltages to the GPU? Or even the motherboard through the PCIe slot? Speaking of which, it could be the motherboard failing to supply enough power to the CPUs as well. I don't know if you can (it being an OEM board) check exactly which CPU combinations it can handle. Maybe the VRMs can't keep up with the dual CPU config, or with the single more powerful CPU. Can you find any compatibility list? Or maybe find HP computers based on this motherboard and check their CPU config (maybe they were running lower-TDP Xeons or something).


8 minutes ago, SpaceGhostC2C said:

OK, so the theoretical capacity is fine, and I have no information on Thermaltake PSUs, but if you did measure too low voltages, it could be the PSU failing to deliver the right voltages under heavy load (notice that it has to be the case that it fails to do so at loads that it is supposedly able to handle)

 

Well, it depends on how much of a simultaneous load on all 12v components you apply during benchmarks, as that's what games do. Although games will hardly stress all cores the way Prime95 does. You ran it on only half the cores, though: do you need more than one core free to run GPU benchmarks?

 

It could be, although I don't think we are in position to conclude yet. Can you measure voltages (however inaccurate) with some software and keep a log of it? If so, I would run that software when playing a game and see if something happens there.

 

It could also be the GPU due to Minecraft working, but the fact that you tried two, and that you run GPU benchmark without issues says otherwise. But maybe a PSU failing to deliver the right amount of power or stable enough voltages to the GPU? Or even the motherboard through the PCIe slot? Speaking of which, it could be the motherboard failing to supply enough power to the CPUs as well, I don't know if you can  (being an OEM board) check which CPU combination it can exactly handle. Maybe the VRMs can't keep up with the dual CPU config, or with the single more powerful CPU. Can you find any compatibility list? Or maybe HP computers based on this motherboard and check their CPU config (maybe they were running lower TDP Xeons or something).

Because of the way my power is rigged up, software can't monitor any voltage except the 3.3V rail (the conversion for which is done onboard), so I have to actually probe the connector with a voltmeter to know where I am at lol. I recorded the meter with my phone for hours while gaming and experiencing multiple shutdowns, and I found no link between voltage and the shutdowns. I only ran Prime95 on half the cores to leave the other CPU to Furmark. I checked Task Manager and I was maxed out the whole time, but I guess I could retry Prime95 with my new configuration. The Rev2 Z800 board will "officially" support 1x5600 or 2x5500 CPUs. A lot of people can make 2x5600 work without going to Rev3 though. This guy: http://andybrown.me.uk/2014/11/01/z800/ did something almost identical to me. Hopefully this can help you guys help me lol. He ran basically identical configurations to what I am running during different steps in his build before going to dual X5680s.
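For anyone wanting to do that voltage-versus-shutdown comparison more systematically, here is a minimal sketch. It assumes the meter readings have been transcribed from the video by hand into (time in seconds, volts) pairs and that shutdown times are noted on the same clock. The function names and the 10-second window are hypothetical choices, not anything used in this thread.

```python
def min_before_events(readings, events, window_s=10.0):
    """For each event time, find the lowest reading in the preceding window.

    readings: list of (time_s, volts) tuples, e.g. transcribed from a video.
    events:   list of shutdown times in seconds, on the same clock.
    Returns a list of (event_time, min_volts), with min_volts set to None
    when no reading falls inside the window.
    """
    results = []
    for t_event in events:
        in_window = [v for (t, v) in readings
                     if t_event - window_s <= t <= t_event]
        results.append((t_event, min(in_window) if in_window else None))
    return results


def overall_min(readings):
    """Lowest voltage seen anywhere in the recording."""
    return min(v for _, v in readings)
```

If the minimum just before each shutdown is no lower than dips seen elsewhere in the recording, that supports the "no link" conclusion. A handheld meter filmed with a phone will not catch millisecond-scale transients, so this can only rule out slow sag.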

 

I'll run prime95 overnight and see what I get.... I'm jumping off for the night. Thanks for the help so far!


I had something similar happen to me before. The issue was that my PSU was faulty, so while the theoretical load was fine, it couldn't actually handle it. Have you tried a different PSU?


10 hours ago, IceSentry said:

I had something similar happen to me before. The issue was that my PSU was faulty. So while the theoretical load was fine. It couldn't actually handle it. Have you tried a different PSU?

The different power supply I tried was from the same manufacturer, and was 600W instead of 650W. The theoretical load is under both, so I figured that if it was a PSU issue, the 600W would be ample for troubleshooting. Maybe it's just that ThermalTake sucks?


So I ran Prime95 overnight. I watched it for a while before I went to sleep and temps never went above 36C. The computer stayed running all night. Everything looks good, but I got one error reported in results.txt.

 

"FATAL ERROR: Rounding was 0.5, expected less than 0.4

Hardware failure detected, consult stress.txt file."

 

First things I notice: 1) stress.txt was never generated... 2) I looked up this error and everyone else running into it was overclocking. The answer is always either to reduce clock speed slightly or to increase voltage slightly. In my case (since I am not overclocking, wish I was) this could actually point to a voltage/PSU issue, in my opinion.

 

Everything else in Prime95 looks good. I would say the RAM (the part that actually works and is detected) and the CPU are not to blame. But I guess I knew this, because I have switched both out entirely. Could this also rule out an onboard issue with these components?

 

Is the next step to buy a slightly overbeefed (700-800W) Corsair PSU and see what happens?

 

One thing I could try is powering the computer with 2 power supplies to see what happens; can't say I haven't done that before. I have a 300W I could use to power the motherboard, CPU, and probably the drives (my 600W is in my old rig currently, and I took it to school to work on a video for the next couple of weeks). Then I'd power the GPU, fans, and RAM with my 650W.


Ideas?


I was doing some reading on the standard tolerances of an ATX PSU. The voltage can deviate from its target by up to 5% before there are any issues. That means 11.4V-12.6V for a 12V rail, keeping in mind that anything below 12V is not desired regardless. I finally have photo evidence of my multimeter reading 11.38 volts while having every USB port filled and running BF4 maxed out at 4320x900. Of course it didn't shut down this time; just my luck. I think a new PSU should fix this issue... I'll do some shopping and see what I come up with.
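The tolerance arithmetic above can be sketched as follows. This is only a minimal check using the ±5% figure stated in the post; the function names are made up for illustration.

```python
def tolerance_window(nominal_v, tolerance=0.05):
    """Return the (low, high) voltage limits for a rail at the given tolerance."""
    return nominal_v * (1 - tolerance), nominal_v * (1 + tolerance)


def in_spec(measured_v, nominal_v=12.0, tolerance=0.05):
    """True if the measured voltage falls inside the tolerance window."""
    low, high = tolerance_window(nominal_v, tolerance)
    return low <= measured_v <= high


low, high = tolerance_window(12.0)
print(round(low, 2), round(high, 2))  # 11.4 12.6
print(in_spec(11.38))                 # False: the multimeter reading in BF4
print(in_spec(11.41))                 # True
```

By this check the 11.38V reading sits just under the 11.4V floor, so it is marginally out of tolerance rather than dramatically low.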

 

I tried running Prime95 on all 12 cores instead of just 6 while running Furmark, and the Prime95 test bottlenecked my GPU to the point that it was only drawing 8% of TDP and doing about 5fps in Furmark. If I run 6 cores, I can get my GPU to 98.6% TDP and the 12V rail drops into the 11.41-11.45V range.
