
Budget (including currency): 

Country: 

Games, programs or workloads that it will be used for: 

Other details (existing parts lists, whether any peripherals are needed, what you're upgrading from, when you're going to buy, what resolution and refresh rate you want to play at, etc): 

USD. I'm looking for a cost-efficient way to run multiple GPUs, with performance per dollar as the priority. My max can range from $3000 to $5000 for a dual-GPU config with some upgradability, similar to a lower-end TinyBox.

I already have some decent computers: an M1 MacBook and a 5800X/3060 Ti build. I could just SSH into this new build, so it doesn't have to be a desktop PC. That said, I would prefer to still be able to play FPS games at 1080p low settings at more than 240 fps, which really only excludes old Xeon processors.

 

The speed of the GPUs is not as important as the VRAM. I'll only be fine-tuning and running inference on the GPUs; all real training will be on servers.

My main concern is the number of PCIe lanes supported on most desktop CPUs. Using some PCIe risers, I can fit 4 GPUs on an AM5 motherboard. The most a desktop CPU (the 7950X or 7950X3D) can support is 28 PCIe lanes, so I would need x8 to x16. I know absolutely nothing about how this affects performance, power consumption, etc. Beyond this price range are Threadripper and Xeon, and I'd prefer not to go there if I don't have to. CUDA is not an issue, since most of my work will be in PyTorch, and any lower-level CUDA work can be done in a separate environment.

 

Here are the 2 builds I have right now. RAM is ridiculously priced. My reasoning for the GPUs: gamers' hatred of the 4060 Ti 16GB drove its price down so much that the VRAM and TDP are good for the price, and the 7900 XTX is just a great card.

If anyone has a similar build, what are your thoughts on it, and how would you optimize it if you could redo it? Would you use a consumer, workstation, or server CPU?

 

[Attached: two screenshots of the parts lists for the two builds]

https://linustechtips.com/topic/1562421-multi-gpu-build-for-nlpllm-development/

If you are going to do fine-tuning, then going for x8 instead of x4 is likely going to be better, especially if you're working with larger models that will require those 72 GB of VRAM.
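To put rough numbers on the x8 versus x4 difference, here is a minimal sketch of theoretical one-direction PCIe link bandwidth. It only accounts for the per-lane transfer rate and 128b/130b line encoding, so real throughput will be lower. The 14 GB payload is a hypothetical figure (roughly a 7B-parameter model in 16-bit), not something taken from the builds above.

```python
# Theoretical one-direction PCIe bandwidth, counting only line encoding.
# Real-world throughput is lower because of protocol overhead.
GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0}  # giga-transfers per second, per lane
ENCODING = 128 / 130                    # 128b/130b encoding (PCIe 3.0 and later)


def link_gb_per_s(gen: int, lanes: int) -> float:
    """Theoretical GB/s for one direction of a PCIe link."""
    return GT_PER_S[gen] * ENCODING / 8 * lanes


def transfer_seconds(gigabytes: float, gen: int, lanes: int) -> float:
    """Best-case time to move `gigabytes` of data across the link."""
    return gigabytes / link_gb_per_s(gen, lanes)


payload_gb = 14.0  # hypothetical payload: about a 7B model in 16-bit
for lanes in (4, 8, 16):
    bw = link_gb_per_s(4, lanes)
    t = transfer_seconds(payload_gb, 4, lanes)
    print(f"PCIe 4.0 x{lanes}: {bw:.1f} GB/s, {payload_gb:.0f} GB in {t:.2f} s")
```

Halving the link width halves the theoretical bandwidth, which matters most for workloads that move data between cards or between host and card on every step, and much less for a model that is loaded once and then stays resident.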

Aren't used 3090s an option? Two of those could do you good.

 

Also, be aware that only a few AM5 motherboards allow x8/x8 across their slots. With the one you picked, you'd need tons of hacks with risers on top of risers to split those lanes.

 

I'd avoid the 7900 XTX for your use case; ROCm is still a pain for some stuff.

FX6300 @ 4.2GHz | Gigabyte GA-78LMT-USB3 R2 | Hyper 212x | 3x 8GB + 1x 4GB @ 1600MHz | Gigabyte 2060 Super | Corsair CX650M | LG 43UK6520PSA
ASUS X550LN | i5 4210u | 12GB
Lenovo N23 Yoga


16 hours ago, igormp said:

If you are going to do fine-tuning, then going for x8 instead of x4 is likely going to be better, especially if you're working with larger models that will require those 72 GB of VRAM.

Aren't used 3090s an option? Two of those could do you good.

 

Also, be aware that only a few AM5 motherboards allow x8/x8 across their slots. With the one you picked, you'd need tons of hacks with risers on top of risers to split those lanes.

 

I'd avoid the 7900 XTX for your use case; ROCm is still a pain for some stuff.

Thank you for pointing out the motherboard issue; that saved me from a potential disaster. The more I get into this, the more it looks like a Threadripper is the best long-term option. It's just so expensive, with no real use case other than tinkering/research. I'm still a student/intern; if I can get some help with funding, maybe by promising a useful model, program, or paper to the school, I'll go for it.

 

I've read a lot of good things about AMD's progress in ML and see it mostly supported in a lot of libraries and those inference repos (llama.cpp, etc.), but it could be a small-dog-loud-bark thing going on. Idk, I've never tested it.


5 minutes ago, carter_ said:

I'm still a student/intern; if I can get some help with funding, maybe by promising a useful model, program, or paper to the school, I'll go for it.

Honest question here. How much CPU power do you actually need, or do you just need the extra PCIe lanes?

I'm not actually trying to be as grumpy as it seems.

I will find your mentions of Ikea or Gnome and I will /s post. 

Project Hot Box

CPU 13900k, Motherboard Gigabyte Aorus Elite AX, RAM CORSAIR Vengeance 4x16gb 5200 MHZ, GPU Zotac RTX 4090 Trinity OC, Case Fractal Pop Air XL, Storage Sabrent Rocket Q4 2tbCORSAIR Force Series MP510 1920GB NVMe, CORSAIR FORCE Series MP510 960GB NVMe, PSU CORSAIR HX1000i, Cooling Corsair XC8 CPU block, Bykski GPU block, 360mm and 280mm radiator, Displays Odyssey G9, LG 34UC98-W 34-Inch,Keyboard Mountain Everest Max, Mouse Mountain Makalu 67, Sound AT2035, Massdrop 6xx headphones, Go XLR 

Oppbevaring

CPU i9-9900k, Motherboard, ASUS Rog Maximus Code XI, RAM, 48GB Corsair Vengeance LPX 32GB 3200 mhz (2x16)+(2x8) GPUs Asus ROG Strix 2070 8gb, PNY 1080, Nvidia 1080, Case Mining Frame, 2x Storage Samsung 860 Evo 500 GB, PSU Corsair RM1000x and RM850x, Cooling Asus Rog Ryuo 240 with Noctua NF-12 fans

 

Why is the 5800x so hot?

 

 


2 hours ago, carter_ said:

Thank you for pointing out the motherboard issue; that saved me from a potential disaster. The more I get into this, the more it looks like a Threadripper is the best long-term option. It's just so expensive, with no real use case other than tinkering/research. I'm still a student/intern; if I can get some help with funding, maybe by promising a useful model, program, or paper to the school, I'll go for it.

 

I've read a lot of good things about AMD's progress in ML and see it mostly supported in a lot of libraries and those inference repos (llama.cpp, etc.), but it could be a small-dog-loud-bark thing going on. Idk, I've never tested it.

If you want to tinker more with getting stuff to work than actually getting stuff done, then go with AMD. Otherwise, for an (almost) out-of-the-box experience you'd be better off with NVIDIA.

 

What models are you planning to work with? I personally have an AM4 setup with 2x3090s and it serves me more than fine for local training/inference, and I can always jump into a proper A100 cluster for anything larger.


9 hours ago, IkeaGnome said:

Honest question here. How much CPU power do you actually need, or do you just need the extra PCIe lanes?

I don't think much. The CPU should have little impact: once data is copied to the GPU, it doesn't leave memory until it's freed or the process ends. This response led me to a Google search where I found the AMD EPYC 7203 and 7303 chips; a mobo+CPU combo for under $1000 looks pretty good. Intel product labeling is foreign to me, so if there's a better solution, lmk.


7 hours ago, igormp said:

If you want to tinker more with getting stuff to work than actually getting stuff done, then go with AMD. Otherwise, for an (almost) out-of-the-box experience you'd be better off with NVIDIA.

 

What models are you planning to work with? I personally have an AM4 setup with 2x3090s and it serves me more than fine for local training/inference, and I can always jump into a proper A100 cluster for anything larger.

I've been working with SWE-Llama-7b, Mistral-7B-Instruct-v0.2, and Mac bounties for tinygrad. This is manageable on my current machines or Colab, but I've accumulated a good amount of quality data (and funding for this project) from doing one of those online RLHF farms for code-based LLMs, and I would like to start building towards an 8x7B architecture model pretty soon. To start making real progress I need at least full precision. Your setup of dual 3090s is my best bet: I can just plug them right in with a new PSU and slot them into a new system when needed. Thanks for your advice btw, it brought my plans back down to earth.
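As a rough sanity check on how far two 24 GB 3090s stretch, here is a minimal sketch that estimates the memory needed for model weights alone at a few precisions. The 46.7B total parameter count for an 8x7B mixture-of-experts model is an assumption borrowed from Mixtral-style models, not a figure from this thread, and the estimate ignores activations, gradients, optimizer state, and KV cache, all of which add to the real requirement when fine-tuning.

```python
# Weights-only memory estimate. Fine-tuning needs more than this:
# gradients, optimizer state, and activations are not counted here.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16/bf16": 2.0, "int8": 1.0, "int4": 0.5}

# Parameter counts in billions. The 8x7B total is an ASSUMPTION
# (Mixtral-style MoE, roughly 46.7B total parameters).
MODELS = {"7B dense": 7.0, "8x7B MoE (assumed 46.7B)": 46.7}

VRAM_GB = 2 * 24.0  # two RTX 3090s at 24 GB each


def weight_gb(params_billion: float, dtype: str) -> float:
    """GB needed for weights: 1e9 params * bytes per param / 1e9 bytes per GB."""
    return params_billion * BYTES_PER_PARAM[dtype]


def fits(params_billion: float, dtype: str, vram_gb: float = VRAM_GB) -> bool:
    """True if the weights alone fit in the given amount of VRAM."""
    return weight_gb(params_billion, dtype) <= vram_gb


for name, size in MODELS.items():
    for dtype in BYTES_PER_PARAM:
        verdict = "fits" if fits(size, dtype) else "does not fit"
        print(f"{name:26s} {dtype:10s} {weight_gb(size, dtype):6.1f} GB  {verdict}")
```

Under these assumptions a 7B model fits in full precision with room to spare, while an 8x7B model's weights only fit once quantized, so full-precision work at that size would still need the servers mentioned earlier.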


12 hours ago, carter_ said:

I don't think much. The CPU should have little impact: once data is copied to the GPU, it doesn't leave memory until it's freed or the process ends. This response led me to a Google search where I found the AMD EPYC 7203 and 7303 chips; a mobo+CPU combo for under $1000 looks pretty good. Intel product labeling is foreign to me, so if there's a better solution, lmk.

What specific programs are you using?

If the CPU has little to no use but you need quite a bit of RAM, then Intel's X299 platform might be a happy medium on price if you're willing to go used.

ASRock Taichi X299 Motherboard LGA 2066 With intel Core i9-10940X CPU Combo | eBay

Something like that would still give you a 14-core, 28-thread CPU. Yes, there are faster CPUs out there, but the 10940X isn't that bad.

Intel® Core™ i9-10940X X-series Processor 

Intel Core i9-10940X Review | bit-tech.net

It would also be a non-server platform, unlike EPYC. Once you start looking at EPYC CPUs, your cooler choice gets a bit slim and tends toward coolers designed for server cases: smaller fans, louder fans, etc. If the computer is going to be close to you, this could be annoying.

 

With that CPU and that motherboard, you wouldn't get a full x16 link to all of the GPUs if you went with four: three would get x8 and one would get x16.
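The lane arithmetic behind that layout can be sketched as below. The 48-lane figure for the i9-10940X is an assumption taken from Intel's published spec rather than from this thread, and the 28-lane AM5 figure is the one quoted in the opening post. How lanes are actually wired to slots is decided by the motherboard, and some CPU lanes are typically reserved for the chipset and NVMe, so treat this as an upper bound only.

```python
# Simple lane-budget check: do the requested GPU link widths fit within
# the CPU's PCIe lanes? This is an upper bound; real slot wiring is set
# by the motherboard, and some lanes usually go to chipset and storage.
X299_LANES = 48  # ASSUMPTION: CPU lanes on the i9-10940X (Intel spec sheet)
AM5_LANES = 28   # figure quoted in the opening post for the 7950X


def lanes_used(widths):
    """Total lanes consumed by a list of per-GPU link widths."""
    return sum(widths)


def fits(widths, cpu_lanes):
    """True if the layout does not exceed the CPU's lane count."""
    return lanes_used(widths) <= cpu_lanes


layouts = {
    "x16/x8/x8/x8": [16, 8, 8, 8],
    "x16 to all four": [16, 16, 16, 16],
    "x8 to all four": [8, 8, 8, 8],
    "x8/x8": [8, 8],
}

for name, widths in layouts.items():
    for platform, lanes in (("X299", X299_LANES), ("AM5", AM5_LANES)):
        verdict = "ok" if fits(widths, lanes) else "over budget"
        print(f"{name:16s} on {platform}: {lanes_used(widths)}/{lanes} lanes, {verdict}")
```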


That CPU tops out at 256 GB of RAM, with quad-channel support and 8 DIMM slots to work with. It opens up a bit more flexibility.

 

The only thing that worries me about that listing is the discrepancy. It says 10940X in the title, but the description lists the 9960X as the CPU. Both are supported by the motherboard, but I'd rather have the 10940X. The 9960X would allow the same GPU configuration but tops out at 128 GB of RAM.

The 9960X goes for ~$300 used, so if it is the 9960X in that combo it's still not a bad deal. I'd probably just keep looking, though, or see if the seller would accept a lower offer.

 

The big thing to me is cooler compatibility. Almost any cooler that works on AM4, LGA 115x, LGA 1200, and LGA 1700 will work on LGA 2066.
