n0xlf
Member · 41 posts
Everything posted by n0xlf

  1. I moved 90C over to this last night, but the CPU tasks are pretty slow.
  2. Did someone say "if you have the cooling, you can do both"? One of the few times I get to use 18 cores @ 4.3GHz, 128 GB of memory, and a 2080Ti @ 2070MHz. All on liquid! The sad part is that a more modern air-cooled system would stomp on it, but such is life in the computing world...
  3. I threw 82C at this. My 9980XE cores are all at 4.3GHz, so it's doing pretty well.
  4. Wonder if I should force it to only use ECM P2 to give my memory something to do.
  5. @Macaw2000 The only thing you missed out on was the heat and fan noise. I'm very much looking forward to some peace and quiet in the house once all of my fans spin down. Not done folding by any means, but a little break will be nice! Many thousands of posts in such a short time provided a wonderful geeking-out experience. For the younger ones in the crowd, this is something you'll remember decades from now as you reminisce about your "slow" 2020 gig internet, power-hungry hardware, and the mere exaflop that we were excited about. From the top of the list all the way to the bottom, the individual effort to help humanity was the same. Congrats to all!!
  6. I like this stat of mine (thanks to AWS): Active clients (within 50 days): 102
  7. Local news did a story on NCAR in Colorado using their supercomputer for COVID-19: https://www.9news.com/video/tech/science/supercomputer-called-cheyenne-is-helping-crunch-covid-19-numbers/73-653e6803-ad6d-4d7b-a449-13d6f4e4b872 5 petaflops is cool and all (and way more power-efficient than us), but it doesn't touch what we're doing!
  8. Damn, you really ramped up your resources there!! Cool to see it presented that way...
  9. Both options are available (destroy entirely or keep). Keeping the instance (storage) is new for 2020, so it wasn't that way before. There are multiple ways to bring the same instance back, so you could do it without losing the WU progress.
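For anyone who wants to script those two options, here's a minimal boto3 sketch (the instance ID and region are placeholders, not from the posts above). Stopping keeps the EBS root volume, which is what preserves the WU progress; terminating deletes it:

```python
# Hedged sketch using boto3 (pip install boto3); the ID and region are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Option 1: keep the instance. A stopped instance retains its EBS root volume,
# so the FAHClient work directory (and WU progress) survives a later start.
ec2.stop_instances(InstanceIds=["i-0123456789abcdef0"])
# ...later, resume right where it left off:
ec2.start_instances(InstanceIds=["i-0123456789abcdef0"])

# Option 2: destroy entirely. Terminating deletes the instance and, by default,
# its root volume, so any in-flight WU is lost.
# ec2.terminate_instances(InstanceIds=["i-0123456789abcdef0"])
```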
  10. I'm on day 17 and still have a bit of a cough. The bad stuff stopped about 5 days ago.
  11. It's also worth saying that although the cloud stuff is fun in its own right, nothing beats listening to the fans and water pump all ramped up as you sweat it out in your computer room watching the power meter spin! Having suffered through this virus, I have a personal interest in this project. For me, it was like breathing through a small straw for days on end. The mental part was horrible.
  12. https://aws.amazon.com/ec2/spot/pricing/ Spot instances are subject to being shut down at any time, but they are far, far cheaper than dedicated. You want the "GPU Instances - Current Generation", Linux if you can. https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/accelerated-computing-instances.html You have to be really careful with AWS, though. If you don't set certain limits on spot pricing, it can get pricey quickly. If you are going to fire up at mass scale, that's a whole different discussion, as there are a lot of different ways to do it. Beyond that, I wouldn't do anything at the moment until more of the WU servers come online. I ran my 32 GPUs and ended up with a lot of them idle, which is pointless considering that the cost is the same.
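If you do play with spot, here's a hedged boto3 sketch of the kind of price limit meant above; the AMI ID, key pair, and the $2.50/hr ceiling are all placeholders, not recommendations:

```python
# Hedged sketch using boto3: request a spot instance with a hard price ceiling.
# The AMI ID, key pair, and the $2.50/hr cap are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder: a Linux AMI with NVIDIA drivers
    InstanceType="p3.2xlarge",         # single V100; see the pricing page above
    MinCount=1,
    MaxCount=1,
    KeyName="my-keypair",              # placeholder
    InstanceMarketOptions={
        "MarketType": "spot",
        "SpotOptions": {
            "MaxPrice": "2.50",        # you pay the market rate, never above this cap
            "SpotInstanceType": "one-time",
        },
    },
)
```

A per-request cap doesn't stop you from launching lots of instances, so an AWS Budgets alert on top of it is also worth setting.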
  13. It's pulling it from nvidia-ml-py, a Python lib that wraps NVML, which is the same library nvidia-smi reads from.
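If anyone wants to poll the same numbers themselves, here's a minimal sketch against nvidia-ml-py; NVML reports milliwatts, and the decode handles older releases that return bytes:

```python
# Minimal sketch using nvidia-ml-py (pip install nvidia-ml-py).
from pynvml import (nvmlInit, nvmlShutdown, nvmlDeviceGetCount,
                    nvmlDeviceGetHandleByIndex, nvmlDeviceGetName,
                    nvmlDeviceGetPowerUsage)

nvmlInit()
try:
    for i in range(nvmlDeviceGetCount()):
        handle = nvmlDeviceGetHandleByIndex(i)
        name = nvmlDeviceGetName(handle)
        if isinstance(name, bytes):  # older releases return bytes
            name = name.decode()
        watts = nvmlDeviceGetPowerUsage(handle) / 1000.0  # NVML gives milliwatts
        print(f"GPU {i} ({name}): {watts:.1f} W")
finally:
    nvmlShutdown()
```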
  14. Here are power stats for the GPUs - 5 don't have WUs at the moment:
  15. @Macaw2000 is already doing this - maybe he can chime in with the PPD on those. I think it's about half that of the V100, based on CUDA cores at least.
  16. A friend and I are experimenting in AWS - Don't worry, this isn't going to be running the entire event (maybe).
  17. Yeah, it seems to average about 4M PPD per card x 32 - so that's the 128M, not including the 256 CPU cores, which aren't worth all that much in reality.
  18. Someone please make me stop - this is getting out of control!! 256 CPU cores, 2TB of memory, and 32 NVIDIA V100s with 163,840 CUDA cores. About 128 million PPD.
  19. I kept it simple and just did an SSH forward to localhost:7396. It was hard enough configuring the entire thing via CLI anyway. The config.xml parameters don't seem to be entirely documented anywhere that I could find.
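For anyone repeating this: the forward is just ssh -L 7396:localhost:7396 user@remote-host, after which the Web Control UI answers at http://localhost:7396 in a local browser. And since the config.xml options are thinly documented, here is a hedged sketch of the kind of file involved; all values are placeholders, the slot layout is just an example, and exact option spellings may vary by FAHClient version:

```xml
<config>
  <!-- identity: all placeholders -->
  <user value="YourName"/>
  <team value="0"/>
  <passkey value="your-passkey-here"/>

  <!-- keep the client API and web UI bound to localhost; the SSH tunnel does the rest -->
  <allow>127.0.0.1</allow>
  <web-allow>127.0.0.1</web-allow>

  <power value="full"/>

  <!-- example slot layout; IDs and counts depend on the hardware -->
  <slot id="0" type="CPU"/>
  <slot id="1" type="GPU"/>
</config>
```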
  20. A.) This one is a p3.16xlarge @ $7.344/hr. as a spot instance. It's more cost effective to do 16 g4dn.xlarge with the T4 GPUs, but you have to deploy that with something like https://github.com/raykrueger/FoldingOnECS. B.) Intel(R) Xeon(R) CPU E5-2686 v4 / Tesla V100. C.) Based on this, yes: https://lambdalabs.com/blog/best-gpu-tensorflow-2080-ti-vs-v100-vs-titan-v-vs-1080-ti-benchmark/ - The Titan V and the V100 are both Volta, but the V100 is the Tesla datacenter line, so there are other differences I believe.
  21. I couldn't wait, so here's some AWS folding pr0n (depending on the exact project, the GPUs are doing about 32 million PPD, although it floats between 2M-4M per GPU, so it's hard to say - in any case, it's 40,960 CUDA cores)
  22. @Macaw2000 pointed out that the g4dn.xlarge is significantly less expensive if anyone is going to play with this. It looks like the T4 has half the CUDA cores of a V100, so even if you ran 16 of them to match a p3.16xlarge's core count, it would only be $2.6208/hr. The spot price on the g4dn.xlarge is dirt cheap. He also shared this for scaling your deployment: https://github.com/raykrueger/FoldingOnECS
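To make the math in that comparison explicit, here's a tiny sketch using only the figures quoted in these posts (spot prices are a snapshot and will drift):

```python
# Cores-per-dollar comparison from the numbers quoted above.
V100_CORES, T4_CORES = 5_120, 2_560       # T4 has half the cores of a V100

p3_price = 7.344                          # p3.16xlarge spot, USD/hr (8x V100)
p3_cores = 8 * V100_CORES                 # 40,960 CUDA cores

g4_fleet_price = 2.6208                   # 16x g4dn.xlarge spot, USD/hr
g4_fleet_cores = 16 * T4_CORES            # also 40,960 CUDA cores

print(f"p3.16xlarge:     {p3_cores} cores @ ${p3_price}/hr")
print(f"16x g4dn.xlarge: {g4_fleet_cores} cores @ ${g4_fleet_price}/hr")
print(f"Same core count for {p3_price / g4_fleet_price:.1f}x less on T4s")
```

Per-core throughput differs between Turing (T4) and Volta (V100), so cores-per-dollar is only a rough proxy for PPD-per-dollar.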