Looking at picking up a Tesla card for my PowerEdge R720 and had a couple of questions.

  1. Do I need dual 1100W PSUs to run the Tesla cards?
  2. Can I get the Teslas to offer GPU acceleration to a VM when using something like RDP? (I plan to drive multiple monitors, and I'm not super familiar with how RDP handles that.)
  3. Which Tesla card offers the best price to performance? I'm looking at the K80/M40/M60/P40/P100 cards.
  4. Is it worth upgrading to a Tesla card over a 1050 Ti? I've heard they play nicer with VMs. I currently have my 1050 Ti in passthrough to one of my Windows VMs.

Here are the current specs: 

  • 2x Xeon E5-2660 v2 - 20c / 40t combined
  • 128GB RAM
  • GTX 1050 Ti
  • 2x 750W PSUs
  • VMware ESXi 6.7u3

Thoughts? 

I'm just thinking about it for now. I'm not committed to it yet.

 

 

Breaking things 1 day at a time

https://linustechtips.com/topic/1586574-r720-tesla-card/

If you just want to play games on one VM, stick with the card you already have. Those cards are all 8 to 10 years old, and they're not even good for doing any AI stuff because they all lack Tensor cores.

 

Jeff from Craft Computing has a bunch of videos about all the hoops he has to jump through to make these cards game.

 

 

 

6 minutes ago, TubsAlwaysWins said:

Do I need dual 1100W PSUs to run the Tesla Cards? 

You'll need at least one. You're looking at 250 watt GPUs on top of your dual 95 watt processors.
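For anyone who wants to redo that arithmetic for their own build, here is a minimal sketch of the power budget. Only the 250 W GPU and 95 W CPU figures come from the post above; `OTHER_WATTS` (RAM, drives, fans, riser) and the 80% `HEADROOM` rule are placeholder assumptions, so treat the verdicts as illustrative and swap in measured numbers.

```python
# Rough power-budget check for a single PSU carrying the whole load.
GPU_WATTS = 250      # per Tesla card (from the post)
CPU_WATTS = 95       # per Xeon (from the post)
OTHER_WATTS = 180    # ASSUMPTION: RAM, drives, fans, riser, NICs
HEADROOM = 0.80      # ASSUMPTION: keep sustained load under 80% of the rating


def estimated_load(gpus: int, cpus: int = 2, other: int = OTHER_WATTS) -> int:
    """Sum the worst-case component draw in watts."""
    return gpus * GPU_WATTS + cpus * CPU_WATTS + other


def fits(psu_watts: int, load_watts: int, headroom: float = HEADROOM) -> bool:
    """True if the load stays within the derated capacity of one PSU."""
    return load_watts <= psu_watts * headroom


for gpus in (1, 2):
    load = estimated_load(gpus)
    for psu in (750, 1100):
        verdict = "ok" if fits(psu, load) else "too tight"
        print(f"{gpus} GPU(s): ~{load} W on a {psu} W PSU -> {verdict}")
```

The check is done against one PSU on purpose: in a redundant pair, either supply has to be able to carry the full load on its own.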

 

You will also need to get the right PCIe riser board with a single x16 slot and a power connector. I believe the part number is CPVNF.

 

I've fit dual OEM RTX 3060s into an R730, which is a very similar chassis. They got a little warm because they're not designed for flow-through cooling, but they work. Their only shortcoming is that they don't officially support chopping them up into slices for multiple VMs.

I sold my soul for ProSupport.

https://linustechtips.com/topic/1586574-r720-tesla-card/#findComment-16553674

38 minutes ago, Needfuldoer said:

for doing any AI stuff because they all lack Tensor cores.

That, and their gaming performance is low. The P100 is basically an RX 580 here, when it works, which is not always.

It also doesn't even support all the current CUDA features anymore due to its age.

They seem to go for $200-$250, and you can get MUCH better used cards for that. Hell, a 1080 Ti is achievable. Or, if you REALLY need AI acceleration, the 2060 12GB is available for that price, and the 3060 12GB can also be found used for that.

 

https://linustechtips.com/topic/1586574-r720-tesla-card/#findComment-16553721

On 10/23/2024 at 9:24 AM, Needfuldoer said:

If you just want to play games on one VM, stick with the card you already have. Those cards are all 8 to 10 years old, and they're not even good for doing any AI stuff because they all lack Tensor cores.

 

Jeff from Craft Computing has a bunch of videos about all the hoops he has to jump through to make these cards game.

 

 

 

You'll need at least one. You're looking at 250 watt GPUs on top of your dual 95 watt processors.

 

You will also need to get the right PCIe riser board with a single x16 slot and a power connector. I believe the part number is CPVNF.

 

I've fit dual OEM RTX 3060s into an R730, which is a very similar chassis. They got a little warm because they're not designed for flow-through cooling, but they work. Their only shortcoming is that they don't officially support chopping them up into slices for multiple VMs.

I'm not looking for gaming, just a good workstation card. I believe I already have the PCIe risers I need; I just need to purchase the power cables.

 

https://linustechtips.com/topic/1586574-r720-tesla-card/#findComment-16554640

On 10/23/2024 at 10:07 AM, jaslion said:

That, and their gaming performance is low. The P100 is basically an RX 580 here, when it works, which is not always.

It also doesn't even support all the current CUDA features anymore due to its age.

They seem to go for $200-$250, and you can get MUCH better used cards for that. Hell, a 1080 Ti is achievable. Or, if you REALLY need AI acceleration, the 2060 12GB is available for that price, and the 3060 12GB can also be found used for that.

 

Alright, I'll look into maybe a 1080 card or something. I think a roommate has a 2060 I can use.

 

https://linustechtips.com/topic/1586574-r720-tesla-card/#findComment-16554641

Just now, TubsAlwaysWins said:

Alright, I'll look into maybe a 1080 card or something. I think a roommate has a 2060 I can use.

Your biggest limitation will be cooler size. You can't go much bigger than a two-slot reference card because the heatsink faces "down" toward the motherboard, and you're height-limited by the two cards being next to each other. (And if you have an R720XD with rear 2.5" bays, they'll eat into one slot's available space.) That's the reason I went with OEM 3060s (that and the heatsink fins point front-to-back instead of up-and-down).

 

You should just need the GPU power cable, part number 9H6FV, to feed any consumer video card that takes up to 6 + (6+2) PCIe power.

https://linustechtips.com/topic/1586574-r720-tesla-card/#findComment-16554646

16 minutes ago, Needfuldoer said:

Your biggest limitation will be cooler size. You can't go much bigger than a two-slot reference card because the heatsink faces "down" toward the motherboard, and you're height-limited by the two cards being next to each other. (And if you have an R720XD with rear 2.5" bays, they'll eat into one slot's available space.) That's the reason I went with OEM 3060s (that and the heatsink fins point front-to-back instead of up-and-down).

 

You should just need the GPU power cable, part number 9H6FV, to feed any consumer video card that takes up to 6 + (6+2) PCIe power.

Yeah, no chance a 3-slot card is fitting. I don't have the XD variant, so I could slap two 2-slot GPUs in if I wanted.

The reason I was looking at Teslas is mainly because their cooling setup works well in this chassis. 

 

Thanks for the part number!

 

https://linustechtips.com/topic/1586574-r720-tesla-card/#findComment-16554664

On 10/23/2024 at 4:07 PM, jaslion said:

That, and their gaming performance is low. The P100 is basically an RX 580 here, when it works, which is not always.

It also doesn't even support all the current CUDA features anymore due to its age.

They seem to go for $200-$250, and you can get MUCH better used cards for that. Hell, a 1080 Ti is achievable. Or, if you REALLY need AI acceleration, the 2060 12GB is available for that price, and the 3060 12GB can also be found used for that.

 

Well, the P100 is effectively a 1080 with much faster HBM2 RAM, and 12GB or 16GB of it. I have two of them here, and they're actually significantly faster than a 1080 when running LLMs - mainly because memory bandwidth is one of the biggest indicators of LLM performance.

 

Obviously you're never going to get H100-like performance out of them, but I've had 30+ tokens/s when running 20GB GGUFs across both cards - that's totally usable, and something you couldn't do with anything short of a 3090.
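The bandwidth argument can be turned into a quick ceiling estimate. This is a back-of-envelope sketch, not a benchmark: it assumes generating one token reads every weight roughly once, and that the two cards take turns on their share of the layers rather than working at the same time. The bandwidth figures are vendor-published peak numbers, not measurements from this thread.

```python
# Ceiling on LLM decode speed when memory bandwidth is the limit:
#   tokens/s <= memory bandwidth / model size

# ASSUMPTION: published peak memory bandwidth in GB/s.
BANDWIDTH_GBPS = {
    "Tesla P100 16GB (HBM2)": 732,
    "GTX 1080 (GDDR5X)": 320,
}

MODEL_GB = 20  # the 20GB GGUF mentioned in the post


def max_tokens_per_second(model_gb: float, bandwidth_gbps: float) -> float:
    """Upper bound on tokens/s if the whole model is read once per token."""
    return bandwidth_gbps / model_gb


for card, bandwidth in BANDWIDTH_GBPS.items():
    ceiling = max_tokens_per_second(MODEL_GB, bandwidth)
    print(f"{card}: at most ~{ceiling:.1f} tokens/s")
```

Under those assumptions the reported 30+ tokens/s sits just below the P100 ceiling of roughly 36.6 tokens/s. The 1080 line is only a bandwidth comparison, since an 8GB card could not hold a 20GB model in the first place.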

 

I don't know what the prices are like over in the US on the second hand market, but I bought a pair of P100s for £250, and a 3090 is over £600 here.

 

They're also nicely handy in server chassis, because they take EPS 8-pin power rather than PCIe 6+2.

 

I'm not saying they're the ideal solution - I mean, they'll generally pull about 210W each when running inference - but they're not the crazy useless solution a lot of folk make them out to be either.

https://linustechtips.com/topic/1586574-r720-tesla-card/#findComment-16554880

17 hours ago, digitalscream said:

Well, the P100 is effectively a 1080 with much faster HBM2 RAM, and 12GB or 16GB of it. I have two of them here, and they're actually significantly faster than a 1080 when running LLMs - mainly because memory bandwidth is one of the biggest indicators of LLM performance.

 

Obviously you're never going to get H100-like performance out of them, but I've had 30+ tokens/s when running 20GB GGUFs across both cards - that's totally usable, and something you couldn't do with anything short of a 3090.

 

I don't know what the prices are like over in the US on the second hand market, but I bought a pair of P100s for £250, and a 3090 is over £600 here.

 

They're also nicely handy in server chassis, because they take EPS 8-pin power rather than PCIe 6+2.

 

I'm not saying they're the ideal solution - I mean, they'll generally pull about 210W each when running inference - but they're not the crazy useless solution a lot of folk make them out to be either.

Gonna be honest, I don't know what half of the acronyms you said mean, but it's good to know about the performance.

There is a 'Cracked PCB' P100 for $75 on eBay right now... It looks like the crack is just the PCI locking tab... Other than that, they seem to sell for about $300 USD.

 

Thanks

 

https://linustechtips.com/topic/1586574-r720-tesla-card/#findComment-16555371

On 10/25/2024 at 3:55 PM, TubsAlwaysWins said:

Gonna be honest, I don't know what half of the acronyms you said mean, but it's good to know about the performance.

There is a 'Cracked PCB' P100 for $75 on eBay right now... It looks like the crack is just the PCI locking tab... Other than that, they seem to sell for about $300 USD.

 

Thanks

Basically, it's all about AI workloads - that's what I used them for. LLMs (Large Language Models) are effectively the AI model, and they need huge amounts of VRAM when running on GPUs. GGUF is the file format llama.cpp loads, and GGUF models are usually quantized (think "lossy compression") to make them smaller.
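The "lossy compression" here is quantization: storing each weight in fewer bits. A rough sketch of how that changes the VRAM needed for the weights alone follows. The 32-billion-parameter model and the bit widths are hypothetical examples, and the estimate ignores the KV cache and runtime overhead.

```python
def model_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes), weights only."""
    return params_billions * bits_per_weight / 8


# ASSUMPTION: a hypothetical 32-billion-parameter model; bit widths are illustrative.
PARAMS_BILLIONS = 32

for label, bits in (("16-bit original", 16), ("8-bit", 8), ("5-bit", 5), ("4-bit", 4)):
    size = model_size_gb(PARAMS_BILLIONS, bits)
    print(f"{label:>16}: ~{size:.0f} GB")
```

The point of the arithmetic is that halving the bits per weight halves the memory footprint, which is what lets a large model fit across a pair of 16GB cards at all.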

 

Worth noting, for anybody reading along and thinking of using multiple Teslas for this: if you're using multiple GPUs to fit larger models, they don't run in parallel and give you better performance. In fact, you get lower performance, because it's effectively two GPUs trying to act as one, and communicating across the PCIe bus is much slower than a GPU's internal fabric.
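A toy model makes the point concrete. It assumes the model's layers are split across the cards, so each GPU works through its own layers in turn and then hands the activations to the next one over PCIe. The bandwidth figure is the published P100 16GB peak, and `HOP_MS` is a made-up hand-off cost chosen only for illustration.

```python
# Toy per-token latency model for a model split by layers across GPUs.
# The per-GPU times add up because the cards work one after another.

BANDWIDTH_GBPS = 732   # ASSUMPTION: published P100 16GB peak memory bandwidth
HOP_MS = 1.0           # ASSUMPTION: invented cost of one PCIe hand-off per token


def token_latency_ms(shard_gb: list[float], bandwidth_gbps: float, hop_ms: float) -> float:
    """Sequential read time for every shard plus one hop between adjacent GPUs."""
    read_ms = sum(gb / bandwidth_gbps * 1000 for gb in shard_gb)
    hops = len(shard_gb) - 1
    return read_ms + hops * hop_ms


# Hypothetical single card big enough for the whole 20GB model, versus two halves.
one_gpu = token_latency_ms([20], BANDWIDTH_GBPS, HOP_MS)
two_gpus = token_latency_ms([10, 10], BANDWIDTH_GBPS, HOP_MS)

print(f"one GPU : {one_gpu:.1f} ms/token ({1000 / one_gpu:.1f} tokens/s)")
print(f"two GPUs: {two_gpus:.1f} ms/token ({1000 / two_gpus:.1f} tokens/s)")
```

In this model the second card adds capacity but never speed: the total bytes read per token stay the same, and every extra card adds another hop.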

https://linustechtips.com/topic/1586574-r720-tesla-card/#findComment-16557125

On 10/27/2024 at 6:22 PM, digitalscream said:

Basically, it's all about AI workloads - that's what I used them for. LLMs (Large Language Models) are effectively the AI model, and they need huge amounts of VRAM when running on GPUs. GGUF is the file format llama.cpp loads, and GGUF models are usually quantized (think "lossy compression") to make them smaller.

 

Worth noting, for anybody reading along and thinking of using multiple Teslas for this: if you're using multiple GPUs to fit larger models, they don't run in parallel and give you better performance. In fact, you get lower performance, because it's effectively two GPUs trying to act as one, and communicating across the PCIe bus is much slower than a GPU's internal fabric.

How does that work with the K80 cards since they are dual GPU? 

 

https://linustechtips.com/topic/1586574-r720-tesla-card/#findComment-16563661

5 hours ago, TubsAlwaysWins said:

How does that work with the K80 cards since they are dual GPU? 

I have no idea - nobody really bothers with the Kepler and Maxwell dual-GPU cards for the workloads that I'm interested in, because they're missing a lot of the feature support required by llama.cpp for the really memory-intensive AI applications. Pascal (i.e. the 10x0 generation) is really the oldest generation that even mostly gets there. Maxwell Teslas are about half the speed of Pascal, and Kepler cards are even slower.

 

That said, I think they use something similar to the old SLI bridges to do the job. Don't quote me on that, though.

https://linustechtips.com/topic/1586574-r720-tesla-card/#findComment-16563919
