Folding@Home on AWS GPU instances

Has anybody actually tried running an AWS GPU instance to see how effective it is versus just buying a nice GPU yourself and paying for the electricity?  

 

I'm looking at one of their GPU instances -- p2.xlarge -- and the spot pricing is pretty consistently $0.27 per hour for a GPU that looks like it could manage about 50% of the throughput of my 2080 Ti (granted, I'm not fully sure what F@H primarily uses on the card; I assume the CUDA cores).  Given the fairly consistent spot pricing, I could theoretically run the instance for about a week for what it costs in electricity to run my home PC.

 

I'm interested in trying it, but the big question is whether their GPUs can actually perform F@H work efficiently.  Has anybody done this?  What were your results?  

#Muricaparrotgang

 

Folding@Home Stats | Current PC Loadout:

Small                     Bigger                  Biggerer                          Biggest
Fractal Design Focus G    NZXT H1                 Lian Li O11 Dynamic XL            Fractal Design Meshify C
FX-8320                   Ryzen 3 3200G           Ryzen 5 3600                      Ryzen 7 3700X
120mm AIO                 120mm AIO               Custom 280mm loop                 Noctua NH-D15
A motherboard             ASRock B450 mobo        MSI X570 mobo                     MSI X570 mobo
16GB DDR3                 16GB DDR4 @ 3200        16GB DDR4 @ 3200                  16GB DDR4 @ 3600
a melange of HDDs/SSDs    WD 1TB M.2              WD 500GB M.2                      WD 1TB M.2 / 2TB HDD
PNY GTX 1070 x2           GTX 1070                GTX 1070 FE, MSI RTX 2080 Super   MSI RTX 2080 Ti
some 650W PSU             650W SFX-L 80+ Gold     Corsair RM850x 80+ Gold           EVGA SuperNova 750W 80+ Gold

 


They should do the work fine; it's just a system with a GPU.

 

Why not try it?

 

If you want to help, you're probably better off donating money than paying for the AWS time, and letting them decide how best to use it. Also, there is a good amount of markup, which is why you're getting half the performance for the price.

 

Also, what are your power prices? You would have to have very high power prices for a 2080 Ti to cost $0.27 an hour to run.


Just now, Electronics Wizardy said:

They should do the work fine; it's just a system with a GPU.

 

Why not try it?

 

If you want to help, you're probably better off donating money than paying for the AWS time, and letting them decide how best to use it. Also, there is a good amount of markup, which is why you're getting half the performance for the price.

 

Also, what are your power prices? You would have to have very high power prices for a 2080 Ti to cost $0.27 an hour to run.

I don't have a power meter, so I'm estimating the cost of running my own GPU; I'm guessing my system pulls about 500 watts at full bore on the high end.  That's about 360 kWh a month to keep it going 24/7.  At my local rate ($0.12 per kWh), that's a bit over 40 bucks a month.  At that price, I could run an AWS instance at $0.27/hour for roughly 7 days.
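The break-even arithmetic above, sketched in Python. All numbers are the thread's own estimates (500 W draw, $0.12/kWh, $0.27/hr spot), not measurements:

```python
# Rough break-even: home 2080 Ti electricity cost vs. AWS spot time.

def monthly_home_cost(watts=500, price_per_kwh=0.12, hours=24 * 30):
    """Electricity cost of running the home rig 24/7 for a month."""
    return watts / 1000 * hours * price_per_kwh

def aws_days_for_budget(budget, spot_price_per_hour=0.27):
    """How many days of spot-instance time the same money buys."""
    return budget / (spot_price_per_hour * 24)

budget = monthly_home_cost()        # ~$43.20/month
days = aws_days_for_budget(budget)  # ~6.7 days
print(f"${budget:.2f}/month at home buys {days:.1f} days of spot time")
```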

 

That's a good point about just donating the money, though.  I only happened to look at AWS because their free tier offer got me to spin up an instance; then I got curious and looked into their GPU instance pricing.  It's just low enough to make me think it's worth doing every now and again when I want more PRECIOUS POINTS, but plenty high enough for me to figure I should just build another system: over the course of a year or so, the new system should be way cheaper.


 


On 5/20/2020 at 8:44 AM, Rybo said:

I'm interested in trying it, but the big question is whether their GPUs can actually perform F@H work efficiently.  Has anybody done this?  What were your results?  

Oh, and the results can be quite good. Here's the contest we had last month, which I ran entirely on AWS:

 

 


20 hours ago, Macaw2000 said:

Works quite well. My cluster runs on g4dn.xlarge instances when the spot pricing is under $0.15. 

 

Darn.  

I did end up trying it on p2.xlarge, and while it was working, it was taking something like 15 minutes per frame (i.e., not good).  It then started messing up and dumping work units, so I gave up before running up a bill (I installed it on Amazon's command-line Linux AMI, so I probably just needed to jump further down the "configure it properly" rabbit hole), but I wonder whether the T4 GPUs with tensor cores in the g4dn instances are better aligned to F@H workloads, or whether my instance simply wasn't running properly.

 

I'm looking forward to a long 4-day weekend, so I might take another stab at this.  I'd love to get it working, if only to sate my curiosity (and to claw my way to 100M points a bit quicker... and to be able to ramp up production very quickly when the next F@H event hits).


 


I've used AWS for folding and BOINC before... the one thing you need to consider is that your bill goes up fast. Honestly, I wouldn't recommend it. You can also look into P100 instances for $0.25/hr on Azure; one pulls about 1.4M to 1.6M PPD. If you can, I would just get a GPU and run it from home. It's a better investment in the long run.

 

Regarding the AMI, I would use the NVIDIA Deep Learning AMI. It's easier to set up because you don't have to deal with drivers.

https://aws.amazon.com/marketplace/pp/NVIDIA-NVIDIA-Deep-Learning-AMI/B076K31M1S

 

Also consult this guide on setting it up:

 


9 hours ago, Macaw2000 said:

If Docker is interesting... it's a far cleaner solution than an AMI.  There's a nice example of the Docker config here: https://github.com/raykrueger/FoldingAtHomeContainer

 

Edit: Docker is cleaner, but the AMI is a much lower barrier to entry and works well!

 

About that... Docker seems to only work in the AWS cloud project (not plain old manually created EC2 instances), which seems to be trivially accessible to the outside world only via a load balancer.  I'm sure I could eventually learn how to set up access tokens to get my local PC into the VPC and hook directly up to each instance in the cluster (I'd like to have all of them linked up to FAHControl on my PC for monitoring), but bleh.  Learning curve.

What I'll probably end up doing is harvesting the initialization commands and just executing them by hand on manually created instances.  It looks like you're using Amazon's AMI anyway, so presumably that closes the one gap from my first attempt: configuring the instance with the proper drivers.
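For the "execute the initialization commands manually" route, a rough sketch of what the launch parameters could look like. Everything here is a placeholder (the AMI ID, the download URL in the user-data script, the price cap), not taken from the projects mentioned in this thread; the dict follows the parameter shape that boto3's `ec2.run_instances(**params)` expects for a one-off spot instance.

```python
# Hypothetical sketch: build run_instances arguments for a single spot
# instance that installs FAHClient at boot via a user-data script.
# Pass the result to boto3: ec2.run_instances(**spot_request_params("ami-...")).

# Placeholder user-data; substitute the real install commands harvested
# from the FoldingOnEcs project or FAHClient's install docs.
USER_DATA = """#!/bin/bash
wget -O /tmp/fahclient.deb https://example.com/fahclient_amd64.deb  # placeholder URL
dpkg -i /tmp/fahclient.deb
"""

def spot_request_params(ami_id, instance_type="g4dn.xlarge", max_price="0.27"):
    """Build the run_instances arguments for one spot instance."""
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        "UserData": USER_DATA,
        "InstanceMarketOptions": {
            "MarketType": "spot",
            "SpotOptions": {"MaxPrice": max_price},
        },
    }
```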

 

I did spend about three hours yesterday adding configuration to the FoldingOnEcs project so I could customize the spot pricing and cluster size via a config file, before realizing how profoundly... enterprisey the AWS cloud solution is.  Not sure I'm up for the learning curve just for the sake of this project ;)  But I finally got to use TypeScript without React, so I consider it time well spent anyway.


 


19 hours ago, ConceptOW said:

Regarding the AMI, I would use the NVIDIA Deep Learning AMI. It's easier to set up because you don't have to deal with drivers.

Good call.  I gave it another stab on a g4dn.xlarge instance with Amazon's base Deep Learning Linux AMI.  After a few minutes of ramping up, it seems to be matching my 2080 Ti in TPF: about 1 minute 10 seconds.  No unpleasant folding errors yet at 7% progress.

 

I'll let it run for a few hours and see if it actually works this time :)


 


When I had my network at full power on Folding@home, it really used a lot of power, though I did go from just starting out to about 550th place in the LTT group in about a month and a half. Then it got warm out, to where I would have had to run the AC to stay comfortable, so I turned most of the computers off except one machine, and I sometimes run the media center computer alongside it. I would guess it's cheaper just to run your own system. Also, running my own computers maxed out gives me a little sense of pride, since most of the systems I ran were overclocked in one way or another. I did let some of my ESXi servers run with their extra CPU cycles; other than enabling XMP, they're at stock settings.

 


17 hours ago, Rybo said:

About that... Docker seems to only work in the AWS cloud project (not plain old manually created EC2 instances), which seems to be trivially accessible to the outside world only via a load balancer.  I'm sure I could eventually learn how to set up access tokens to get my local PC into the VPC and hook directly up to each instance in the cluster (I'd like to have all of them linked up to FAHControl on my PC for monitoring), but bleh.  Learning curve.

What I'll probably end up doing is harvesting the initialization commands and just executing them by hand on manually created instances.  It looks like you're using Amazon's AMI anyway, so presumably that closes the one gap from my first attempt: configuring the instance with the proper drivers.

 

I did spend about three hours yesterday adding configuration to the FoldingOnEcs project so I could customize the spot pricing and cluster size via a config file, before realizing how profoundly... enterprisey the AWS cloud solution is.  Not sure I'm up for the learning curve just for the sake of this project ;)  But I finally got to use TypeScript without React, so I consider it time well spent anyway.

An easier way to do it: create a snapshot of your instance, then create an AMI from it, then create a launch template for your instances using that AMI. They'll boot up with everything already configured.
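That snapshot-to-AMI-to-launch-template flow could be scripted with boto3. A minimal sketch, with hypothetical names and placeholder instance/AMI IDs; the dicts follow the parameter shapes of boto3's `create_image` and `create_launch_template`:

```python
# Sketch of the AMI -> launch template flow. Names, prices, and IDs are
# placeholders. Usage (with a real ec2 = boto3.client("ec2")):
#   ami = ec2.create_image(**image_params("i-0123456789abcdef0"))
#   ec2.create_launch_template(**launch_template_params(ami["ImageId"]))

def image_params(instance_id, name="fah-worker"):
    """Arguments for create_image: bake the configured instance into an AMI."""
    return {"InstanceId": instance_id, "Name": name}

def launch_template_params(ami_id, instance_type="g4dn.xlarge", max_price="0.15"):
    """Arguments for create_launch_template: spot instances boot preconfigured."""
    return {
        "LaunchTemplateName": "fah-spot",
        "LaunchTemplateData": {
            "ImageId": ami_id,
            "InstanceType": instance_type,
            "InstanceMarketOptions": {
                "MarketType": "spot",
                "SpotOptions": {"MaxPrice": max_price},
            },
        },
    }
```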

