Tomatodude95

Posts posted by Tomatodude95

  1. 32 minutes ago, Zberg said:

    Honest question(for the mercenary kings) how much does it end up costing per day to run those high point totals.  Is it reasonable or expensive?

    I looked into it during the competition and tried it out a bit, but I'm no expert. I also only tested Google Cloud, so this may differ significantly on other providers.

    If you are willing to put in the work to deal with preemptible instances on Google Cloud, you can set up a node with 8 vCores and 4 Nvidia T4s for about 0.50 USD an hour, which comes to about 12 USD a day. Going non-preemptible is quite a bit more expensive: roughly 1.18 to 1.70 USD per hour depending on how long you run it (the longer it runs, the bigger the discount, i.e. the closer you get to 1.18 USD). You might get away with a smaller CPU config if you aren't folding on the CPU. I tried the 4 vCore version as well, but when I also folded on the CPU the system ran out of RAM (you get about 1 GB of RAM per core). However, the difference between 4 and 8 vCores is less than 3 cents per hour (about 22 USD per month for 24/7 preemptible usage).
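
A quick way to check these figures is a tiny shell helper. This is only a sketch: the function name `cost_for_hours` is made up for illustration, and the rates are the approximate ones quoted above, not current Google Cloud pricing.

```shell
#!/bin/sh
# Rough cost arithmetic using the approximate hourly rates from the post.
# cost_for_hours is a hypothetical helper name, not part of any cloud tool.

# cost_for_hours RATE_USD_PER_HOUR HOURS -> total cost in USD, two decimals
cost_for_hours() {
  awk -v rate="$1" -v hours="$2" 'BEGIN { printf "%.2f\n", rate * hours }'
}

cost_for_hours 0.50 24            # preemptible node, one day      -> 12.00
cost_for_hours 0.50 $((24 * 30))  # preemptible node, 30-day month -> 360.00
cost_for_hours 1.18 24            # non-preemptible, best case/day -> 28.32
cost_for_hours 1.70 24            # non-preemptible, worst case/day -> 40.80
cost_for_hours 0.03 $((24 * 30))  # 4 vs 8 vCores, upper bound/month -> 21.60
```

The last line shows where the "about 22 USD per month" comes from: 3 cents per hour over a 30-day month.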

    AFAIK on AWS it's a bit different: you can more or less decide the maximum price you want your instances to run at, and they will run as long as the current price is below your predefined price and AWS doesn't need the machines for something else. Anyone who knows more about Spot Instances on AWS, please correct me if I'm wrong!

     

    Azure has similar offers and options, but I'm not familiar with those.

     

    So it really depends on your definition of reasonable and expensive. It might make sense if you need some power in the short term or have some free credit to try it out, but it's still quite a chunk of money: the cheapest option I described costs 12 USD/day, or about 360 USD/month.

    For me it was fun to play with and good experience, but I wouldn't consider running it long term. If you want to run hardware over a longer period, it is usually cheaper to buy the hardware and run it at home/locally. (Yes, I know, not always; there are benefits to having stuff in the cloud.)

     

    Additional Info about Performance:

    In case you don't know how an Nvidia T4 performs: it's about 800k PPD, so one of the machines with 8 cores and 4x T4s gets about 3.2M PPD. (I measured values from 2.8M to 3.8M PPD depending on the units and specific workload.)
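
To put the performance and the price side by side, here is a small sketch that turns them into points per dollar. The helper name `ppd_per_usd` is hypothetical, and the inputs are the rough numbers from this post (800k PPD per T4, 12 USD/day for the preemptible node).

```shell
#!/bin/sh
# Points per dollar for the 4x T4 node, using the post's rough numbers.
# ppd_per_usd is a hypothetical helper name used only for this illustration.

# ppd_per_usd POINTS_PER_DAY USD_PER_DAY -> points per USD, rounded
ppd_per_usd() {
  awk -v ppd="$1" -v usd="$2" 'BEGIN { printf "%.0f\n", ppd / usd }'
}

node_ppd=$((4 * 800000))     # four T4s at about 800k PPD each = 3200000
echo "$node_ppd"

ppd_per_usd "$node_ppd" 12   # typical case         -> 266667
ppd_per_usd 2800000 12       # low end of my range  -> 233333
ppd_per_usd 3800000 12       # high end of my range -> 316667
```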

     

    Hope that answers your question. Don't hesitate to ask if you want more details on something.

  2. 1 minute ago, Bitter said:

    I got the drivers sorted, still seems low, have a proper temp reading now. I wonder if it's got a modded BIOS on it since it's a used mining card. I'll have to figure that out later, just let it do it's 24hr work unit then mess with it some. Also get some slim fans so it's not a 3 slot card!

    Yeah, that's more or less what I meant. Since you're using a mining card, I don't know what the drivers look like on Linux (I've never had a mining card), but if you're using the standard drivers (the GTX version, I mean) for the mining card, there's probably some mismatch or inefficiency somewhere. Nvidia can always be tricky to get right under Linux.
    Do you mean a modded BIOS as in one that differs from the regular GPUs using the GP106 die? (So that would be a 1060, if I'm not mistaken?)

  3. Just now, Bitter said:

    Screwed some AMD fans to the Nvidia P106, popped it in a slot, and it's up and running. It's WAY down on PPD but I think that's cause it's in a PCIe 2.0 x8 slot possibly and also I think the wrong driver version on Ubuntu possibly. Trying to install the more correct driver I'll probably break the GTX 650 driver doing so. But it does work! Not getting temps through sensors for some reason, likely drivers? It feels cool to the touch which makes me think it's not doing nearly as much work as it should be doing.

    I'm guessing it's the drivers. PCIe bandwidth isn't really an issue for folding. I have an old GTX 770 running in a probably broken slot that only gets 3.0 x1 bandwidth, and it performs as it should, the same as it did before it was bumped by the upgrade.

  4. 2 minutes ago, Boyce1 said:

    Hopefully an easy question...

    I just got a video card and installed it.  It's still only picking up CPU WUs after about 24ish hours.  Is there anything I need to do to let them know they can use video too.  Is there still somewhat of a shortage of GPU WUs or do they just like my CPU better than my video card that is only an RTX 2060 Super with all the much faster video cards out there?  I don't care which type of WUs I get just want to make sure they can use whichever they want.

    Thanks!

    First of all, yes, there probably still is a GPU WU shortage out there. I'm guessing your GPU shows up in either the web interface or the advanced client, right? If not, you need to add another slot for your GPU in the advanced config. You should be able to run CPU and GPU WUs at the same time, so either be patient and wait for a WU, or set the client-type to advanced or even beta in the advanced client under the Expert tab. Does that make sense? If not, just tell me what you need clarification on.

  5. 43 minutes ago, GOTSpectrum said:

    I will be writing the awards ceremony this afternoon, it is currently 1030 here so expect it in the next 12 hours or so. 

     

    I'm still trying to work out the best way to deal with 700+ prizes, as in distribution. I will probably do what I always do and ask people to DM me. 

    Great, but no rush. Enjoy your weekend a bit as well 😉

  6. 1 minute ago, Dutch_Master said:

     

    I tried this and it appears to have connected via Telnet (port 36330). Not sure it'll work, for most boxes now have WUs for more then 8 hrs!

    I'll keep that in mind if the other one fails. (not that I'm using sudo, but I'll get around that ;) )

     

    Thx guys, much appreciated!

    Yep, that's how all the communication with the FAH client works: through telnet on whatever port you have configured, 36330 by default. That really should work, otherwise you have some other problem. It has never failed me, and I've used it tens to hundreds of times.
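
If you'd rather not type into an interactive telnet session each time, the same commands can be piped in. This is a minimal sketch, assuming `nc` (netcat) is installed; the helper names `fah_script` and `fah_send` are made up for illustration. The commands themselves (`slot-info`, `exit`) are the ones already used in telnet sessions in this thread.

```shell
#!/bin/sh
# Send commands to a FAH client's command port without an interactive session.
# Assumptions: nc (netcat) is installed; fah_script and fah_send are
# hypothetical helper names, not part of the FAH client.

# fah_script CMD...: print each command on its own line, followed by "exit"
# so the client closes the connection once it has answered.
fah_script() {
  printf '%s\n' "$@" "exit"
}

# fah_send HOST PORT CMD...: pipe the command script to the client.
fah_send() {
  host="$1"
  port="$2"
  shift 2
  fah_script "$@" | nc "$host" "$port"
}

# Example, only meaningful with a client actually listening:
#   fah_send localhost 36330 slot-info
```

Appending `exit` matters because some netcat variants otherwise keep the connection open after the input ends.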

  7. 1 minute ago, jctappel67 said:

    It's the whole 'Free Tier' part. I looked it up, and I'm pretty sure that once you've upgraded to 'Paid Tier' that limitation is removed.

     

    Here's a link to Google's page on the 'Free Tier': https://cloud.google.com/free/docs/gcp-free-tier#how-to-upgrade

    OK, seems like you're probably right. Are you going to try it, say in the next couple of days? I don't want to take the risk, as everything is working right now and I don't want to waste PPD trying to convert when I don't know whether it will work the way I want it to. But I'd be very interested in hearing from you once you've tried it.

  8. 5 hours ago, Alex Atkin UK said:

    Seems a moot point, I'm getting 0 GPU availability at Google Cloud. :(

     

    Its annoying you can't just say "I want this hardware config, I don't care where".

     

    5 hours ago, Alex Atkin UK said:

    Yeah I was struggling to get CPUs too, it said 0 availability even when I selected 16.  They make it WAY too hard to just setup a VM when it honestly doesn't matter WHERE its located.

    Interesting. I haven't had any problems getting either 2x P100 or 2x (4x T4), which I'm switching to at the moment. I've never had a single request denied because of capacity; I only had to make sure I had the correct quotas in the correct region. I got all of mine in us-west1, so Oregon.

  9. 2 hours ago, jctappel67 said:

    I figured it out! Thanks to @Macaw2000's advice. I am deleting the old 4 VMs as they finish the current WUs and replacing them with 6 preemptible VMs! 

     

    Gonna be a pain in the A$$ to manage for these last 2 days, but it'll be worth it!

     

    Hopefully what remains of my credit can last that long as well

     

    1 hour ago, Macaw2000 said:

    Nice! Folding is an embarrassingly good application for this. You have a non-mission critical workload that is compute intensive. AWS calls it spot, Google calls it preemptible, and Azure calls it Low-priority but all essentially the same thing. Yeah it can be a pain to manage if you don't build some automation but you are on the right track.

    Yes, it would be. Sadly, you can't use your free credits here. It's one of the limitations; they don't want you to use your credits too efficiently. 😏
    https://cloud.google.com/compute/docs/instances/preemptible

  10. 1 hour ago, bafo_ah said:

    Hope you using centos, paste line by line, its all setup, or you can use this https://cloud.google.com/compute/docs/gpus/install-drivers-gpu

     


    sudo dnf install -y epel-release
    sudo dnf install -y nano wget telnet fail2ban fail2ban-systemd
    # install fail2ban for security? ddos prevention? just to lazy to change ssh port?
    sudo systemctl enable fail2ban
    sudo nano /etc/fail2ban/jail.local
    # paste this line :
    [DEFAULT]
    # Ban hosts for one day!:
    bantime = 86400

    # Override /etc/fail2ban/jail.d/00-firewalld.conf:
    banaction = iptables-multiport

    [sshd]
    enabled = true

    # ctrl+x for exit and then y then enter

    sudo systemctl restart fail2ban
    sudo fail2ban-client status sshd

    sudo dnf install -y kernel
    sudo dnf install -y kernel-devel-$(uname -r) kernel-headers-$(uname -r)
    sudo dnf install -y http://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-repo-rhel8-10.1.243-1.x86_64.rpm
    sudo dnf install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
    sudo dnf clean all
    sudo dnf install -y cuda
    nvidia-smi # you should see your GPU here, if not, you screwed

    wget https://download.foldingathome.org/releases/public/release/fahclient/centos-5.3-64bit/v7.4/fahclient-7.4.4-1.x86_64.rpm
    sudo rpm -i --nodeps fahclient-7.4.4-1.x86_64.rpm

    telnet localhost 36330
    slot-info
    options user=usernamehere passkey=yourpasskeyhere team=223518 client-type=beta max-packet-size=big power=full gpu=true
    exit

    sudo /etc/init.d/FAHClient stop
    # wait a moment, cross your finger... then proceed
    sudo /etc/init.d/FAHClient start

    telnet localhost 36330
    slot-info
    # you should see your GPU here, and we can delete CPU slot
    slot-delete 0
    always_on
    unpause
    exit

    tail -f /var/lib/fahclient/log.txt

     

    If you just use one of their Deep Learning Linux images, you will be asked on first boot whether you want to install the Nvidia drivers, and it does everything automatically. Then you just need to download the .deb file and install it. I think this is much simpler.

  11. 13 minutes ago, DonGuano said:

    What is this client type thing? And how to set it up?

    You can set your client to advanced or beta mode to get WUs that aren't (quite) validated yet and have a higher chance of failing. There is more work available that way, so you will most likely see a more consistent stream of WUs.

    You enable it by opening the advanced control application and going to the settings of your client. Under the Expert tab you can add an extra option called client-type with a value of advanced. If you only want it for a certain slot, you can instead add it in that slot's configuration, at the bottom under extra options.
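
On a headless box without the control application, the same option can presumably be set through the client's command port, the way the setup script earlier in this list does with `options ... client-type=beta`. The sketch below only builds that command line; `client_type_cmd` is a hypothetical helper name, and it deliberately accepts only the two values mentioned here. Whether the setting persists across restarts when set this way is not something I have verified.

```shell
#!/bin/sh
# Build the "options" line for a client type, as an illustration only.
# client_type_cmd is a hypothetical helper name; only the two values named
# in this post are accepted.

# client_type_cmd TYPE: print the command line, or fail for unknown types.
client_type_cmd() {
  case "$1" in
    advanced|beta)
      printf 'options client-type=%s\n' "$1"
      ;;
    *)
      echo "unsupported client type: $1" >&2
      return 1
      ;;
  esac
}

client_type_cmd advanced   # prints: options client-type=advanced
```

The printed line is what you would type into the telnet session on port 36330.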

  12. 11 hours ago, Plexas said:

    What else would you use to search? It probably bugged out, doesnt matter its fine :) 

    Make sure you wait for the sheet to load completely and then click into the document before searching with Ctrl + F. Otherwise the browser's search is used, and the browser can only search the beginning of the document that has been loaded, whereas the Google Sheets search can search the whole document.

  13. 1 hour ago, damnfinecoffee said:

    Sorry, I posted the wrong IP's. These are the one's being blocked:

    
    185.216.140.34
    185.175.93.11

     

    Yeah, as far as I can tell these aren't used by FAH, though I can't be 100% certain. At least one of them seems to be running some sort of completely unrelated website, and the other one has nothing on it to suggest that it's part of the FAH network. To me these look like unrelated IP blocks. Neither of them is mentioned on the FAH server list or among their collection servers.

     

    In conclusion, it's highly unlikely that these have anything to do with FAH. I would guess that the problem is somewhere else.
