
Hi all,

 

So, I am a graduate student in a machine learning lab. We currently have a dedicated GPU server with 8 GeForce GTX Titan X cards (Maxwell architecture, I believe) and have been looking at getting an additional server with some new GPUs. I am my PI's main go-to person for this stuff, as I know the most about hardware. Personally, I would love to get some of the new Titan V cards, but my PI got a quote for a setup with some Tesla P40 cards. The P40 cards have 24 GB of GDDR5 (I believe) at 346 GB/s memory bandwidth. The Titan Vs have only 12 GB of graphics memory, but at a bandwidth of 652.8 GB/s. My question is this: memory is a crucial aspect of machine learning, so would these be considered equal? How do the amount of memory and the memory bandwidth compare? Is half the memory at roughly twice the speed equivalent for our purposes? I am usually fairly knowledgeable about this sort of hardware stuff, but here I have no idea.

CPU: AMD 3800x; GPU: ASUS TUF RTX 3070; Motherboard: Asus Prime x570-Pro;

CPU Cooler: be quiet! Dark Rock Pro 3; RAM: G.SKILL Ripjaws V 32 GB DDR4 @ 3600 MHz;

Case: NZXT H440; PSU: EVGA SuperNOVA G2 750W; Storage: Intel 600P SSD 512GB, Seagate Barracuda 2TB HDD @ 7200RPM

Link to comment
https://linustechtips.com/topic/914957-gpu-memory-amount-vs-memory-speed/

2 minutes ago, Krolic said:

My question is this: memory is a crucial aspect of machine learning, so would these be considered equal? How do the amount of memory and the memory bandwidth compare? Is half the memory at roughly twice the speed equivalent for our purposes? I am usually fairly knowledgeable about this sort of hardware stuff, but here I have no idea.

While I don't know what neural networks do with memory, I can say this: GPU memory bandwidth is not a substitute for memory capacity. When a GPU runs out of VRAM, it has to start swapping less frequently used data somewhere else, usually system RAM. It can only do this over the PCIe interface, which at PCIe 3.0 x16 tops out at about 15.75 GB/s in each direction. And while memory is being swapped, whatever the GPU was doing has to stall, so overall performance goes down.

 

So figure out what your application needs. If it eats up a lot of memory, you need more capacity over bandwidth.
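To put rough numbers on the point above, here is a back-of-the-envelope sketch. It uses the VRAM bandwidths quoted in the original post and the theoretical PCIe 3.0 x16 peak; the 4 GB spill size is a made-up example, and real sustained throughput will be lower than any of these peak figures.

```python
# Peak figures only; sustained real-world throughput is lower.
PCIE3_X16_GBPS = 15.75      # theoretical PCIe 3.0 x16 peak, one direction
P40_VRAM_GBPS = 346.0       # as quoted in the original post
TITAN_V_VRAM_GBPS = 652.8   # as quoted in the original post


def transfer_seconds(size_gb: float, bandwidth_gbps: float) -> float:
    """Time to move size_gb gigabytes at bandwidth_gbps GB/s."""
    if bandwidth_gbps <= 0:
        raise ValueError("bandwidth must be positive")
    return size_gb / bandwidth_gbps


if __name__ == "__main__":
    spill_gb = 4.0  # hypothetical amount of data that no longer fits in VRAM
    links = [
        ("PCIe 3.0 x16", PCIE3_X16_GBPS),
        ("Tesla P40 VRAM", P40_VRAM_GBPS),
        ("Titan V VRAM", TITAN_V_VRAM_GBPS),
    ]
    for name, bandwidth in links:
        ms = transfer_seconds(spill_gb, bandwidth) * 1000
        print(f"{name:>15}: {ms:7.1f} ms to move {spill_gb} GB")
```

Even the slower of the two cards reads its own memory more than 20 times faster than anything can arrive over PCIe, which is why spilling out of VRAM tends to hurt far more than a lower on-card bandwidth.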


This is highly dependent on the algorithms and datasets you use. There is no single answer to this question. The bottom line is: where is the bottleneck? Do you run out of RAM when loading assets? If not, then capacity is not the issue. Are your cores stalling while they wait for data to load from RAM into the cache? If not, then bandwidth is not the issue. Is your GPU sitting idle while it waits for instructions from the CPU? Then your CPU is the limitation.

 

There is no good way to answer this without either running profiling tools during a run or just trying different configs and seeing what matters most.
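As a very crude first step in that direction, something like the sketch below can show whether a training loop spends most of its time waiting for data or doing the actual step. The names `profile_loop`, `load_batches` and `step` are hypothetical placeholders, not part of any framework. One caveat: GPU frameworks usually launch work asynchronously, so `step` has to block until the GPU has finished, otherwise the numbers are meaningless.

```python
import time
from typing import Any, Callable, Dict, Iterable


def profile_loop(load_batches: Iterable[Any],
                 step: Callable[[Any], None]) -> Dict[str, float]:
    """Split wall-clock time between fetching batches and processing them."""
    load_s = 0.0
    step_s = 0.0
    batches = iter(load_batches)
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(batches)   # time spent producing the next batch
        except StopIteration:
            break
        t1 = time.perf_counter()
        step(batch)                 # must block until the work is really done
        t2 = time.perf_counter()
        load_s += t1 - t0
        step_s += t2 - t1
    total = load_s + step_s
    fraction = load_s / total if total > 0 else 0.0
    return {"load_s": load_s, "step_s": step_s, "load_fraction": fraction}


if __name__ == "__main__":
    def fake_loader():
        # Stand-in for reading large files from disk.
        for i in range(5):
            time.sleep(0.05)
            yield i

    def fake_step(batch):
        # Stand-in for one training step.
        time.sleep(0.01)

    print(profile_loop(fake_loader(), fake_step))
```

If `load_fraction` is high, the GPU is being starved by data loading, and neither more VRAM nor faster VRAM will help much. A real profiler gives far more detail, but this is often enough to tell which direction to look in.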

 

You might also want to see if your algorithms can use mixed precision (i.e., tensor cores), which on paper can give close to a 10x improvement in peak FLOPS over traditional FP32 computation. If you can use that tech and implement it efficiently, you might want to consider Volta-based cards (Tesla V100 or Titan V) for much faster training/execution.


Depends on workload.  Bandwidth is more important in some cases and capacity is more important in others.

 

Capacity isn't everything, @M.Yurizaki. There were GT 1030-class cards back in the day that came with 4 GB or more of VRAM (GT 430 DDR3). Their memory bus could not even move data fast enough to make use of that memory, so for CUDA work they were no faster than a CPU.


Just now, KarathKasun said:

Capacity isn't everything, @M.Yurizaki. There were GT 1030-class cards back in the day that came with 4 GB or more of VRAM (GT 430 DDR3). Their memory bus could not even move data fast enough to make use of that memory, so for CUDA work they were no faster than a CPU.

If you're going that slow, then of course. But for the purposes of OP's question, if the application eats up memory, then it's better to have capacity over bandwidth to avoid running out and needing to swap, even if the task could benefit from more bandwidth.

 

That's assuming of course the application doesn't have an option to scale how much RAM it can consume.


Ok, thanks for the input. The stuff we are dealing with is massive image files (like 3D volumes from CT and MRI scans). More memory would mean we can dump more data into each training batch. But the points about PCIe bandwidth and the rest are valid. The point about bottlenecks is probably the most valid: where are they? I think the increase in memory would mean better performance if all other variables are held constant. I think the best thing to look at next is some benchmarks comparing the Titan V and the Tesla P40. Personally, I would love the Quadro GV100, as it is Volta architecture, meaning tensor cores and therefore mixed precision, but also 32 GB per card. The downside is they are hella expensive.
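For a feel of how capacity translates into batch size, here is a rough sketch. The 256x256x256 volume shape is purely an assumption for illustration, and so is `reserved_fraction`, a crude stand-in for everything else that lives in VRAM (weights, activations, optimizer state), which in practice often dominates. It also treats the cards' 12 and 24 GB as GiB for simplicity.

```python
import math
from typing import Sequence


def volume_bytes(shape: Sequence[int], bytes_per_voxel: int = 4) -> int:
    """Raw size of one volume; 4 bytes per voxel is FP32, 2 is FP16."""
    return math.prod(shape) * bytes_per_voxel


def max_volumes_that_fit(vram_gib: float,
                         shape: Sequence[int],
                         bytes_per_voxel: int = 4,
                         reserved_fraction: float = 0.5) -> int:
    """Upper bound on raw input volumes that fit in the unreserved VRAM.

    reserved_fraction is an assumed share of VRAM taken by everything
    other than the input data; measure it for a real model.
    """
    if not 0.0 <= reserved_fraction < 1.0:
        raise ValueError("reserved_fraction must be in [0, 1)")
    usable = vram_gib * 1024**3 * (1.0 - reserved_fraction)
    return int(usable // volume_bytes(shape, bytes_per_voxel))


if __name__ == "__main__":
    shape = (256, 256, 256)  # hypothetical CT/MRI volume size
    for card, vram_gib in [("Titan V", 12), ("Tesla P40", 24)]:
        fp32 = max_volumes_that_fit(vram_gib, shape, bytes_per_voxel=4)
        fp16 = max_volumes_that_fit(vram_gib, shape, bytes_per_voxel=2)
        print(f"{card}: up to {fp32} volumes in FP32, {fp16} in FP16")
```

Under these assumptions, storing the data in half precision on the 12 GB card gives the same raw headroom as FP32 on the 24 GB card, which is one more reason the mixed precision suggestion is worth testing before deciding.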

