I've been folding with this computer since, I think, March with few problems. I have a 1660S, later added a 2080 Ti, and fold with both, plus my 3900X.

A few weeks ago, first thing in the morning, my computer's fans were at 100%, which is very, very unusual. It turned out it had crashed sometime overnight, with no BSOD, and was hanging while booting. I reset it and things went on. I use my computer all day while teleworking and there are no problems: I run my work software in the foreground and fold with 20 virtual CPU cores plus my 2 GPUs. Then, like clockwork, it crashes overnight. I have only observed a crash twice (including while getting data for this post), and it just turns off.

I changed both power cables going to the 2080, looked at the 12V, 5V, and 3.3V bus voltages, removed the 1660S, and ran Superposition, FurMark, and AIDA64 stress tests, and everything is great (76C max on any of these tests, each of which I ran for between 8 and 18 h).

I thought that running all these tests and removing the 1660S might have solved it, but on my way to bed I could hear the fans, so it's a no-go. I figured this might be the place to ask for help since I think I've tried all the easy stuff (also, reinstalling F@H after it didn't like me removing the 1660S mid-fold).

 

Because I'm a data geek (the sole reason I bought an intelligent PSU), I've attached plots of the relevant data, as well as the raw data itself. Nothing jumps out at me. The data is from last night from 8:26 pm until I assume it died at 10:10 pm.
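For anyone wanting to go beyond eyeballing the plots, here is a minimal sketch of a scripted check on a log like the attached CSV. It flags any sample where a rail strays more than 5% from nominal (the usual ATX tolerance for the +12V, +5V, and +3.3V rails) and reports the last logged timestamp, which approximates when the machine died. The column names (`time`, `12V`, `5V`, `3.3V`) and the sample rows are assumptions for illustration, not taken from the attached file.

```python
import csv
import io

# Nominal ATX rail voltages; the positive rails are normally allowed +/-5%.
NOMINAL_RAILS = {"12V": 12.0, "5V": 5.0, "3.3V": 3.3}
TOLERANCE = 0.05

# Made-up rows for illustration only; NOT the poster's data.
SAMPLE_LOG = """time,12V,5V,3.3V
20:26,12.05,5.02,3.31
21:30,11.30,5.01,3.30
22:10,12.04,5.00,3.29
"""


def read_rows(csv_text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def out_of_tolerance(csv_text, rails=NOMINAL_RAILS, tolerance=TOLERANCE):
    """Return (time, rail, volts) for each sample outside nominal +/- tolerance.

    Assumes a 'time' column plus one column per rail name (hypothetical names).
    """
    bad = []
    for row in read_rows(csv_text):
        for rail, nominal in rails.items():
            volts = float(row[rail])
            if abs(volts - nominal) > nominal * tolerance:
                bad.append((row["time"], rail, volts))
    return bad


def last_sample_time(csv_text):
    """Return the timestamp of the final logged row, or None if the log is empty."""
    rows = read_rows(csv_text)
    return rows[-1]["time"] if rows else None
```

Calling `out_of_tolerance(SAMPLE_LOG)` on the made-up rows flags only the 21:30 sample on the 12V rail. A clean result on a real log would not rule out the PSU, since a logging interval of a second or more can miss short transients.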

 

Any ideas?

 

Update since I started writing this: when I removed my 1660S, FAHControl wouldn't connect, so I reinstalled and moved on. Now I want to try CPU-only folding and it's stuck on connecting. This is a new problem.

 

Complete-ish build list:

Ryzen 9 3900X @ 3.8 GHz, watercooled

32 GB DDR4 RAM

MSI Tomahawk B450 - BIOS 1.80 (had to try a bunch of ones until I found one that worked with this CPU, so it's old and I don't want to update it unless I have to)

EVGA GeForce RTX 2080 Ti XC Gaming 11G-P4-2382-RX

MSI Geforce GTX 1660 Super Ventus XS **currently sitting in a box, but has been installed most of this time**

M.2 SSD

Spinning rust HDD

Corsair AX1200i

 

crash data.csv Computer Info.pdf

Link to comment
https://linustechtips.com/topic/1258909-crashing-while-folding/

If your PSU was at fault, it would be evident within a few minutes of load, not after hours. What's more, the voltages look good. The GPU temps are a little warm, but within limits. When I get random crashes, I use a tool called BlueScreenView (https://www.nirsoft.net/utils/blue_screen_view.html) to check for any blue screens that might have occurred. You may also want to run a quick memtest. What speed is the RAM running at?

 

Removing a GPU without updating F@H's config will produce an error where FAHControl fails to connect. You can manually edit the config, found at:

%appdata%\FAHClient\config.xml
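As a rough sketch of what that manual edit could look like when scripted: the snippet below strips GPU slots from config text. It assumes the v7-style layout in which each folding slot is a `<slot id='…' type='CPU'/>` or `<slot id='…' type='GPU'/>` child of the root `<config>` element; the thread does not show the actual file contents, so treat both that layout and the sample text as assumptions. Back up the real file first, and it is probably safest to stop the client before editing so it does not overwrite the change.

```python
import xml.etree.ElementTree as ET

# Hypothetical config text in the assumed v7-style layout; not the poster's file.
SAMPLE_CONFIG = """<config>
  <user v='Anonymous'/>
  <slot id='0' type='CPU'/>
  <slot id='1' type='GPU'/>
  <slot id='2' type='GPU'/>
</config>"""


def drop_gpu_slots(config_xml):
    """Return the config XML text with every <slot type='GPU'> child removed.

    Note: ElementTree discards XML comments when parsing, so any comments
    in the original file will not appear in the returned text.
    """
    root = ET.fromstring(config_xml)
    # findall() returns a list, so removing children while looping is safe.
    for slot in root.findall("slot"):
        if slot.get("type", "").upper() == "GPU":
            root.remove(slot)
    return ET.tostring(root, encoding="unicode")


def slot_types(config_xml):
    """List the 'type' attribute of each remaining slot, in document order."""
    return [s.get("type") for s in ET.fromstring(config_xml).findall("slot")]
```

For the sample above, `slot_types(drop_gpu_slots(SAMPLE_CONFIG))` leaves just the CPU slot. If the removed card is later reinstalled, its slot would need to be added back, either by hand or through the client's own slot configuration.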

 


 

Link to comment
https://linustechtips.com/topic/1258909-crashing-while-folding/#findComment-14128121

Replying to LazyDev's post from 3 hours earlier:

So my config.xml is there but is (was) totally empty. Can't say I expected that. Why didn't uninstalling and reinstalling fix it, I wonder? Surely all the locally saved files were deleted and a clean install should reset everything, no? I also copied the sample config.xml file, making the necessary changes for my username, etc., and it's still stuck on connecting.

Update: Reinstalling and launching the web thingy worked, and now the whole thing connects again, so there's that.

 

And the GPU temp is from FurMark or AIDA64's stress tests. In normal use it runs cooler: for example, I'm BOINCing with Rosetta@Home on my CPU and Milkyway@Home on my GPU and it's sitting at 61C.

I ran that tool you recommended and my last BSOD was 5/31, so nothing there. When it crashed this morning it was just off, as if I had unplugged it, except the fans had ramped to 100% like they do in the first few seconds of booting, and they never spin down (until I hit the reset button).

 

I've uploaded my log from starting it up just now, since it shows the configuration data, in case that's helpful.

FAH Log.txt

Link to comment
https://linustechtips.com/topic/1258909-crashing-while-folding/#findComment-14128623

A crash where the only activity during boot-up is the fans at 100% would indicate a hardware failure. This could be a number of things. It seems that the system boots fine into Windows, but crashes under load after a few hours.

 

The best place to start is to check every cable in the system to ensure that they're all connected correctly (and look for any discolored cables). Check that the GPU, RAM, and CPU are seated correctly as well.

 

What are the temperatures like around the CPU MOSFETs? Are there any BIOS codes or LEDs lit when the system crashes to just 100% fans (they should be around the 24-pin connector)?

 

You could use HWMonitor to check the MOSFET and chipset temperatures.


 

Link to comment
https://linustechtips.com/topic/1258909-crashing-while-folding/#findComment-14130632

Replying to LazyDev's post from 9 hours earlier:

So I fiddled with a bunch of things, including the reinstallation of FAH. Nothing leapt out at me hardware-wise. I had already replaced the power cables going to the GPU, but I swapped them around, reseated the RAM, that sort of thing. It's now been >24 hours with no issues, so I don't know what it was, but it seems to be fine again. I wish I had some clue what it was, but I'll take it.


Thanks for the help. Now, back to folding.

Link to comment
https://linustechtips.com/topic/1258909-crashing-while-folding/#findComment-14132311