
Advice on building and setting up a server for machine learning.

Budget (including currency): $1,300

Country: USA

Games, programs or workloads that it will be used for: AI-Machine learning

Other details: Build a solid, reliable platform to learn machine learning: LLMs, image and video generators, etc.

 

I am building a budget server to run AI, and I have no experience running AI software. I'm thinking of starting with the Llama LLM, but I would like to get into making AI pictures and videos as well, plus who knows what else once I learn more about this. I am just getting into this and have not received the hardware yet, but it is ordered. I'm just gathering information so I know how to get started when it gets here.

 

System specs:

 

Dual E5-2686 v4 (36 cores, 72 threads total)

 

128GB ECC RAM

 

2TB Gen 4 NVMe SSD (didn't order, already on hand. Not included in budget)

 

(4) 1TB SATA SSDs in RAID 0 (didn't order, already on hand. Not included in budget)

 

(4) Tesla P40 24GB cards (uses the GP102 chip, same as the Titan Xp and 1080 Ti)

 

I'm planning to run this headless and remote into it. This is just for tinkering at home and I'm not worried if it isn't the fastest system in the world.

 

What would be the best OS?

 

What drivers are the best to use with the Tesla P40 cards?

 

Any other thoughts on this setup, or suggestions?

 

Do I need to use NVLink on the cards in order to use all the VRAM?

 

I am thinking of using bifurcation and running each card on 8 PCIe Gen 3 lanes. Do you think that would cause a bottleneck?


It might have been a good idea to start a bit smaller, and also figure out what you are going to do with the hardware before buying it... 

 

For example putting a 3060 inside your current PC would have been a lot cheaper and easier.


I'm no expert, I just dabble with my gaming PC, but Tesla cards come up a lot. They are cheap for the specs, but you do get what you pay for; there are various compatibility issues. When they work, they work.

IMO there's no particular reason to RAID 0 the SSDs; your bottleneck is in generation, not writing.


3 minutes ago, LAwLz said:

It might have been a good idea to start a bit smaller, and also figure out what you are going to do with the hardware before buying it... 

 

For example putting a 3060 inside your current PC would have been a lot cheaper and easier.

I tend to overdo any new hobby. 😆

 

From what I have read, Llama 2 requires 48GB of VRAM. Also, I don't really know what I want to do until I learn what it can do, and I didn't want to be limited by hardware.


4 minutes ago, thevictor390 said:

I'm no expert, I just dabble with my gaming PC, but Tesla cards come up a lot. They are cheap for the specs, but you do get what you pay for; there are various compatibility issues. When they work, they work.

IMO there's no particular reason to RAID 0 the SSDs; your bottleneck is in generation, not writing.

The main reason I'm going to RAID 0 them is to have them show up as a single 4TB drive rather than needing to split up my files and save them to individual drives.


3 minutes ago, Jeeperforlife said:

The main reason I'm going to RAID 0 them is to have them show up as a single 4TB drive rather than needing to split up my files and save them to individual drives.

There are a bunch of ways to do that without RAID. The problem with RAID 0 is that if you lose one drive, you lose them all.


12 minutes ago, thevictor390 said:

There are a bunch of ways to do that without RAID. The problem with RAID 0 is that if you lose one drive, you lose them all.

True, but I'm not that concerned about data loss. I can do a nightly backup to my server.


19 hours ago, Jeeperforlife said:

Dual E5-2686 v4 (36 cores, 72 threads total)

Lots of slow, power-hungry cores. If your focus is solely ML, you'd be better off with a newer single CPU that has enough memory channels and isn't as power hungry.

Not a problem if you got it for cheap, though.

19 hours ago, Jeeperforlife said:

128GB ECC RAM

 

That seems on the low end given that you'll have 96GB of VRAM. I'd try to at least double it.

19 hours ago, Jeeperforlife said:

(4) Tesla P40 24GB cards (uses the GP102 chip, same as the Titan Xp and 1080 Ti)

That's going to be slow due to the lack of tensor cores, and there's also no proper FP16 speedup. A 3060 would be able to run models as big as this GPU can (with mixed precision), while being way faster. But again, if you got it for cheap, that's not a problem.

19 hours ago, Jeeperforlife said:

What would be the best OS?

 

Any Linux distro of your liking. Maybe just go with Ubuntu, since there are tons of tutorials on how to set up CUDA with it.

19 hours ago, Jeeperforlife said:

What drivers are the best to use with the Tesla P40 cards?

 

If you're using Linux, there's no distinction between the drivers; just install the NVIDIA proprietary one and off you go.
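Once the driver and CUDA toolkit are in, it's worth a quick sanity check that all four P40s show up. A minimal sketch, assuming you end up using PyTorch (any CUDA-enabled framework has an equivalent):

```python
# Sanity check after installing the NVIDIA driver + CUDA:
# lists every GPU the framework can see, with its VRAM.
import torch

print("CUDA available:", torch.cuda.is_available())
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"GPU {i}: {props.name}, {props.total_memory / 1024**3:.1f} GB")
```

If it doesn't report four 24GB devices, nvidia-smi is the first place to look.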

19 hours ago, Jeeperforlife said:

Do I need to use NVLink on the cards in order to use all the VRAM?

No, your ML frameworks will handle that (and the P40 doesn't have an NVLink connector anyway). For inference, NVLink wouldn't bring much of a benefit, but you may notice a minor speedup if you plan to do training/fine-tuning.
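For example, Hugging Face transformers with accelerate can shard a single model across all four cards purely over PCIe. A rough sketch; the model ID below is just a placeholder for whatever you actually download:

```python
# Sketch: splitting one large model across all four P40s, no NVLink involved.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-13b-hf"  # placeholder, not a recommendation
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",          # accelerate spreads the layers across GPUs 0-3
    torch_dtype=torch.float16,  # halves VRAM; the P40 stores FP16 fine, it just doesn't compute it faster
)

inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```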

19 hours ago, Jeeperforlife said:

I am thinking of using bifurcation and running each card on 8 PCIe Gen 3 lanes. Do you think that would cause a bottleneck?

Yes, but nothing major; having twice the VRAM and GPU count will offset it (as long as you work with large enough models).

19 hours ago, Jeeperforlife said:

Llama 2 requires 48GB of VRAM

Depends on the model size and whether you're using quantization or not. You can even run the 70B model with a 4-bit quant. I mostly use 30-35B models for fine-tuning on 2x 3090s without problems.
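For a sense of scale, a 4-bit (Q4_K_M) GGUF of the 70B model is roughly 40GB, so it fits across the four P40s with room for context. llama.cpp is a common route on Pascal cards; a minimal sketch with the llama-cpp-python bindings, where the file path is a placeholder:

```python
# Minimal sketch: running a 4-bit quantized GGUF model via llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="models/llama-2-70b.Q4_K_M.gguf",  # placeholder; point it at your GGUF file
    n_gpu_layers=-1,  # offload every layer to the GPUs
    n_ctx=4096,       # context window
)

result = llm("Q: What is quantization? A:", max_tokens=64)
print(result["choices"][0]["text"])
```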

