Has anyone tried training a model on two RTX 5060 Ti 16GB cards? I plan to use them to train models with DDP and to self-host LLMs.

 

I've never tried that before; is it possible?

 

If so, I've put together a parts list:

https://pcpartpicker.com/user/kcheeseng/saved/Z76Jt6

 

Any recommendations, or are there any performance issues I should expect?

https://linustechtips.com/topic/1615176-two-rtx-5060ti-16gb-for-ai-model-training/

Given the number of Tensor cores, I wonder if you might be better off just getting a single RTX 5070 Ti... almost as many Tensor cores, but it should be cheaper.

I might be experienced, but I'm human and I do make mistakes. Trust but Verify! I edit my messages after sending them a lot, please refresh before posting your reply. Please try to be clear and specific, you'll get a better answer. Please remember to mark solutions once you have the information you need. Expand this signature for common PC building advice, a short bio and a list of my components.

 

Common build advice:

1) Buy the cheapest (well reviewed) motherboard that has the features you need. Paying more typically only gets you features you won’t use. 2) Only get as much RAM as you need; getting more won’t (typically) make your PC faster. 3) While I recommend getting an NVMe drive, you don’t need to splurge on an expensive drive with DRAM cache; DRAM-less drives are fine for gamers. 4) Paying for looks is fine, just don’t break the bank. 5) Tower coolers are usually good enough, unless you go top-tier Intel or plan on OCing. 6) OCing is a dead meme; you probably shouldn’t bother. 7) "Bottlenecks" rarely matter and "future-proofing" is a myth. 8) AIOs don't noticeably improve performance past 240mm.

 

Useful Websites:

https://www.productchart.com - helps compare monitors, https://uk.pcpartpicker.com - makes designing a PC easier.

 

Bio:

He/Him - I'm a PhD student working in the fields of reinforcement learning and traffic control. PCs are one of my hobbies and I've built many PCs and performed upgrades on a few laptops (for myself, friends and family). My personal computers include four Windows (10/11) machines and a TrueNAS server (and I'm looking to move to dual-booting Linux Mint on my main machine in future). Aside from computers, I also dabble in modding/homebrew retro consoles, support Southampton FC, and enjoy Scuba Diving and Skiing.

Fun Facts

1) When I was 3 years old my favourite toy was a scientific calculator. 2) My father is a British Champion ploughman in the Vintage Hydraulic Class. 3) On Speedrun.com, I'm the world record holder for the Dream Bobsleigh event on Mario & Sonic at the Olympic Winter Games 2010.

 

My Favourite Games: World of Tanks, Runescape, Subnautica, Metroid (Fusion and Dread), Spyro: Year of the Dragon (Original and Reignited Trilogy), Crash Bash, Mario Kart Wii, Balatro

 

My Computers: Primary: My main gaming rig - https://uk.pcpartpicker.com/user/will0hlep/saved/NByp3C Second: Hosts Discord bots as well as a Minecraft and Ark server, and also serves as a reinforcement learning sandbox - https://uk.pcpartpicker.com/user/will0hlep/saved/cc9K7P NAS: TrueNAS Scale NAS hosting SMB shares, DDNS updater, pi-hole, and a Jellyfin server - https://uk.pcpartpicker.com/user/will0hlep/saved/m37w3C Foldatron: My folding@home and BOINC rig (partially donated to me by Folding Team Leader GOTSpectrum) - Mobile: Mini-ITX gaming rig for when I'm away from home -


Is it possible? Depends on the program you use and whether it can properly utilize both GPUs. Is it worth it? Absolutely not. You are far better off with one much better GPU than two mediocre ones.

 

A 5070 Ti should be cheaper to buy than two 5060 Tis. But the pricing is pretty abysmal for them.


Also, just noticed: why are you going with an Intel 12600KF when you're already paying for DDR5? Surely you should be looking at AM5?

Yup, compared on cores it's crap. But I'm thinking of getting 32 GB of VRAM to run or train a larger model. I was previously planning to get a used GPU, like a 3090, but the used marketplace in my country doesn't seem trustworthy.

4 minutes ago, will0hlep said:

Given the number of Tensor cores, I wonder if you might be better off just getting a single RTX 5070 Ti... almost as many Tensor cores, but it should be cheaper.

 

Link to post
Share on other sites

2 minutes ago, CheeseNg said:

Yup, compared on cores it's crap. But I'm thinking of getting 32 GB of VRAM to run or train a larger model. I was previously planning to get a used GPU, like a 3090, but the used marketplace in my country doesn't seem trustworthy.

From my experience, it would be better to just let the model overrun into RAM at that point. Moving data between GPUs is not very efficient.
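For a sense of scale, here's a rough back-of-envelope comparison in Python. The bandwidth figures are assumed round numbers (roughly the 5060 Ti's GDDR7 bandwidth and a theoretical PCIe 5.0 x8 link), not measurements, and the chunk size is hypothetical:

```python
# Rough back-of-envelope: cost of data living outside local VRAM.
# Bandwidth figures are assumed approximations, not measurements.
GDDR7_VRAM_GBPS = 448.0   # ~RTX 5060 Ti VRAM bandwidth (assumed)
PCIE5_X8_GBPS = 32.0      # ~theoretical PCIe 5.0 x8 link (assumed)

def transfer_ms(gigabytes: float, bandwidth_gbps: float) -> float:
    """Milliseconds to move `gigabytes` at `bandwidth_gbps` GB/s."""
    return gigabytes / bandwidth_gbps * 1000.0

chunk_gb = 1.0  # hypothetical slice of weights/activations
vram_ms = transfer_ms(chunk_gb, GDDR7_VRAM_GBPS)
pcie_ms = transfer_ms(chunk_gb, PCIE5_X8_GBPS)
print(f"from VRAM: {vram_ms:.2f} ms, over PCIe: {pcie_ms:.2f} ms "
      f"(~{pcie_ms / vram_ms:.0f}x slower)")
```

Under these assumptions the host link is over an order of magnitude slower than local VRAM, which is why anything that has to cross it every step (RAM spill or inter-GPU traffic) hurts.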

5 minutes ago, will0hlep said:

Also, just noticed: why are you going with an Intel 12600KF when you're already paying for DDR5? Surely you should be looking at AM5?

I hadn’t considered AMD. Does it work better than Intel with DDR5? The reason I’m leaning toward Intel is because I want to benefit from certain libraries, like scikit-learn.

1 minute ago, CheeseNg said:

I hadn’t considered AMD. Does it work better than Intel with DDR5? The reason I’m leaning toward Intel is because I want to benefit from certain libraries, like scikit-learn.

There is definitely stuff that doesn't work on AMD GPUs, but I'm not actually aware of any common libraries in this space that don't work on AMD CPUs.

 

From my experience with this stuff, if you want performance, you want an AMD CPU.

7 minutes ago, will0hlep said:

From my experience, it would be better to just let the model overrun into RAM at that point. Moving data between GPUs is not very efficient.

I've never heard of that. How does it work during model training? Does it automatically decide what to offload, or do we need to configure it manually?

1 minute ago, CheeseNg said:

I've never heard of that. How does it work during model training? Does it automatically decide what to offload, or do we need to configure it manually?

For any work I've done, it is automatic.

 

It does add latency because RAM is slower than VRAM, but you'll have similar issues storing the model across multiple GPUs, unless the model is specifically parallelised to work across multiple GPUs.

9 hours ago, CheeseNg said:

Has anyone tried training a model on two RTX 5060 Ti 16GB cards? I plan to use them to train models with DDP and to self-host LLMs.

 

Yeah, it'd work just fine. Not the fastest, but it'd still work nonetheless.

9 hours ago, will0hlep said:

Given the number of Tensor cores, I wonder if you might be better off just getting a single RTX 5070 Ti... almost as many Tensor cores, but it should be cheaper.

Half the VRAM in total limits you quite a lot, depending on what you're doing.

9 hours ago, Shimejii said:

Is it possible? Depends on the program you use and whether it can properly utilize both GPUs. Is it worth it? Absolutely not. You are far better off with one much better GPU than two mediocre ones.

 

A 5070 Ti should be cheaper to buy than two 5060 Tis. But the pricing is pretty abysmal for them.

See above.

8 hours ago, will0hlep said:

From my experience, it would be better to just let the model overrun into RAM at that point. Moving data between GPUs is not very efficient.

No way, it's far better to use 4x 3060s than a single 16 GB GPU that won't be able to fit your entire model. For example, a 3060 with its 12 GB was often better than a 3080 simply because it had an extra 2 GB of VRAM and could run larger models.

If you don't have enough VRAM to begin with, you're SOL. Get a reasonable amount of VRAM that fits your model, then look into the fastest solution you can afford with that amount of memory (or more).
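As a rough way to size that, here's a quick sketch for estimating how much VRAM a model needs. The multipliers are assumed rules of thumb (the ~8x figure for mixed-precision Adam training state is a common ballpark, not an exact number):

```python
def vram_gb(params_billion: float, bytes_per_param: float = 2.0,
            overhead: float = 1.0) -> float:
    """Very rough VRAM estimate in GB.

    bytes_per_param: 2 for fp16/bf16 weights, ~0.5 for 4-bit quant.
    overhead: multiplier for training state; mixed-precision Adam is
    often ballparked around ~8x the fp16 weights (assumed figure).
    """
    return params_billion * bytes_per_param * overhead

print(vram_gb(7))        # 7B model, fp16 inference weights: 14.0 GB
print(vram_gb(7, 2, 8))  # 7B Adam training ballpark: 112.0 GB
print(vram_gb(13, 0.5))  # 13B at 4-bit: 6.5 GB, fits one 16 GB card
```

By this estimate, fp16 inference on a 7B model just fits in 16 GB once you add activations and KV cache headroom, which is exactly where the extra VRAM of a second card (or a 24 GB card) starts to matter.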

 

8 hours ago, CheeseNg said:

I hadn’t considered AMD. Does it work better than Intel with DDR5? The reason I’m leaning toward Intel is because I want to benefit from certain libraries, like scikit-learn.

There are no benefits to using Intel with scikit-learn. Zen 5 also has a pretty amazing AVX-512 unit, which consumer Intel CPUs lack, so it's much faster at ML-related tasks.

8 hours ago, CheeseNg said:

I've never heard of that. How does it work during model training? Does it automatically decide what to offload, or do we need to configure it manually?

No, it's crap, don't do that.

 

 

8 hours ago, will0hlep said:

For any work I've done, it is automatic.

 

It does add latency because RAM is slower than VRAM, but you'll have similar issues storing the model across multiple GPUs, unless the model is specifically parallelised to work across multiple GPUs.

I know you're trying to help, but this is actual misinformation.

Spilling your model into RAM is AWFUL, and it's not how training is done.

All models can be made to run across multiple GPUs, so that's the norm, both for inference and training. The impact of splitting a model across GPUs is FAR smaller than having things spill into RAM, since you're just doing gradient accumulation (or something of the sort) across the PCIe bus, not full-blown weight swapping or running the feed-forward on the CPU.
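The gradient-averaging idea behind DDP can be sketched in plain Python. This is a toy single-weight model with two lists standing in for two GPUs, purely illustrative, not real multi-GPU code:

```python
def grad(w, shard):
    """d/dw of mean((w*x - y)^2) over one data shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)

def ddp_step(w, shards, lr=0.01):
    local = [grad(w, s) for s in shards]  # each "GPU" computes on its own shard
    g = sum(local) / len(local)           # all-reduce: only gradients cross the bus
    return w - lr * g                     # every replica applies the same update

data = [(x, 3.0 * x) for x in range(1, 9)]  # ground truth: w = 3
shards = [data[:4], data[4:]]               # batch split across two "GPUs"
w = 0.0
for _ in range(200):
    w = ddp_step(w, shards)
print(round(w, 3))  # converges to 3.0
```

The point is the traffic pattern: per step, only one gradient value per parameter crosses between replicas, whereas spilling to RAM streams the weight tensors themselves back and forth every pass.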

FX6300 @ 4.2GHz | Gigabyte GA-78LMT-USB3 R2 | Hyper 212x | 3x 8GB + 1x 4GB @ 1600MHz | Gigabyte 2060 Super | Corsair CX650M | LG 43UK6520PSA
ASUS X550LN | i5 4210u | 12GB
Lenovo N23 Yoga
