
Machine Learning Build

Hello All,

 

This is my first post to the forum, so please take it easy on me. I recently bought a 2006 Mac Pro for $1. The unit has a bad motherboard, so most of the components are either too old or not worth reusing. It did, however, have 2 Xeon processors inside, which I plan on using for a later project. The parts I plan on reusing are the case itself, the 980-watt power supply, and the fans. The rest is junk.

 

Here's my idea:

I want to build a machine learning box that I can use for my Master's program. I want to use it to train models, run neural net applications (training/testing), things like that. I have been looking all over the internet for tips and guides on what makes the best bang-for-your-buck machine. I have used some gaming builds as a starting point, then tweaked them for my use case. The machine will run Ubuntu 14.04 or 15.04 primarily, with possible dual-boot support for Windows.

 

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

 

Build Requirements:

 

Intel Chip: i7 or Xeon (Based on the total budget for the project I believe i7 is the only option)

 

NVIDIA GPU(s): For the initial build phase I will buy 1 GPU, with the intention of buying a second down the road, so SLI is important. Plus the developer support with DIGITS, CUDA cores, etc. is important.

 

Supporting Motherboard: Ideally having a motherboard that would drop into the Mac Pro case would be AWESOME! But I am almost certain some mods to the case will have to be made. Also, having fast I/O, PCI expandability, and M.2 support are feature requests.

 

 

Some things I found that should be important for this build include:

 PCI Express Lanes

    -> I believe that 40 lanes is a requirement because I plan on having 2 GPUs (not sure which NVIDIA card I want yet), and I would like the lanes to be divided equally between the 2 GPUs (x16/x16). That way I avoid any possible bottlenecks.
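For a rough sense of the numbers behind the x16/x16 preference, here is a quick back-of-the-envelope calculation (the per-lane figure is approximate, and whether x8 actually bottlenecks training depends on how often data is shuttled to the cards):

```python
# PCIe 3.0 moves roughly 985 MB/s per lane in each direction
# (8 GT/s with 128b/130b encoding); figures here are approximate.
PCIE3_MB_PER_LANE = 985

def link_bandwidth_gb(lanes):
    """Approximate one-direction bandwidth of a PCIe 3.0 link, in GB/s."""
    return lanes * PCIE3_MB_PER_LANE / 1000

print(f"x16: {link_bandwidth_gb(16):.1f} GB/s")  # ~15.8 GB/s
print(f"x8:  {link_bandwidth_gb(8):.1f} GB/s")   # ~7.9 GB/s

# Two cards at x16/x16 need 32 lanes for the GPUs alone, which is why
# 40-lane CPUs are attractive for dual-GPU builds.
```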

 

 RAM

   -> From what I have read for this use case, it is best to have enough RAM (memory) to hold entire datasets. My plan is to start with 32 GB initially, then move up to 64 GB as step 2, then possibly to 128 GB as step 3. So expandable RAM is important. At the time of writing I believe that quad-channel memory is the way to go, the benefit being speed. What do you guys think?
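To sanity-check those RAM tiers, a quick sketch of how big a dense dataset gets in memory (the sample counts and image sizes below are made up for illustration):

```python
def dataset_gb(n_samples, values_per_sample, bytes_per_value=4):
    """Approximate in-RAM size of a dense dataset stored as float32, in GB."""
    return n_samples * values_per_sample * bytes_per_value / 1024**3

# 100k images at 256x256x3, float32, already overflows the 64 GB tier:
print(f"{dataset_gb(100_000, 256 * 256 * 3):.1f} GB")    # ~73.2 GB
# 1M images at the same size won't fit in RAM at any of the tiers:
print(f"{dataset_gb(1_000_000, 256 * 256 * 3):.1f} GB")  # ~732.4 GB
```

So for large datasets, "hold everything in RAM" only works up to a point; past that, the speed of the storage feeding the machine matters again.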

 

M.2

   -> I want the system to be extremely fast. I would like to use an M.2 solution for holding the OS (Ubuntu), OS-related files, and applications. All the datasets will be held in secondary storage (a Western Digital Red hard drive).

 

 

Questions I have:

 

1) When deciding on a CPU, should I go with more cores and a slower clock, or fewer cores and a higher clock? (Based on this use case.)

 

2) Do my assumptions about PCI Express lanes hold true? Would there be a bottleneck if 1 GPU is getting 16 lanes and the other is getting 8?

 

3) Are there any guides you guys know of that I could follow?

 

4) What NVIDIA GPUs have the best SLI support? Or which GPUs have been found to work best in this use case? (I know the Titan X is great, but please keep the budget in mind.)

 

5) Do you guys have any suggestions on the build?

 

Summary

 

I have made 2 different PC Part Picker lists that can be found at:

https://pcpartpicker.com/user/sprichard/saved/#view=8TXRBm

They are listed under the "Mac Pro $1000 Build" and "Mac Pro $1500 Build"

 

As you can see from the link, I have builds for a $1000 budget and a $1500 budget. I am working on a $2000 budget build; I would say $2000 is the limit for this build. Let me know what you guys think! I am open to all suggestions as I have not purchased anything, and I am still trying to get my head wrapped around all that I need to know to make a badass Mac Pro ML box.


9 minutes ago, SwiftySteve said:

Supporting Motherboard: Ideally having a motherboard that would drop into the Mac Pro case would be AWESOME! But I am almost certain some mods to the case will have to be made. Also, having fast I/O, PCI expandability, and M.2 support are feature requests.

Sorry I can't help out on the deep learning shopping list (aka PC porn).

But I know for a fact that you will spend a minimum of 20+ hours making a standard mobo fit into a Mac Pro case AND making it look somewhat half presentable. Look up guides for it. The entire back panel has to be cut out and shit.

Also, the fans won't really be reusable since they do not use standard plugs; they plug directly into the mobo without cables and slot into places on the case you would have to modify anyway.

The PSU may also not be an option, since it will not have PCIe connectors at all. 2006 Mac Pros (and later models too, I think) actually had a power connector on the mobo that adapts to a 6-pin PCIe power connector, rather than running it directly from the PSU.

When in doubt, re-format.


15 minutes ago, SwiftySteve said:

-snip-

 

I have an idea that might save you money. From my experience, neural nets are more about parallelism than high speed. It might be worthwhile (and cost effective) to go ahead and use those old Xeons in a dual-processor configuration. Finding a motherboard that supports dual GPUs and will run those processors will be difficult. What processor model are they?
 

If your datasets will fit in RAM, you might want to forgo the HDD for holding datasets and go ahead and store at least the working set on the M.2 device. This would really speed up accessing the data until you get enough RAM to fully store the datasets.

You won't need SLI for what you're doing. SLI is used when you want two GPUs to "alternate frames". With CUDA, you can give each GPU its own "thread". SLI isn't necessarily the most important thing here, although you could use it if you wanted to.

I think the salient point here is that you want to follow this design rule: favor highly parallel processing over high-speed processing, and use the fastest volatile and non-volatile memory solutions that you can afford. If you use the two Xeons and find a used motherboard that will work (which I can help with if I know the model of Xeons), then you can really spend A LOT of money on the GPUs and memory components of the build.
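A framework-free sketch of the "one thread per GPU" idea: split a batch into one shard per device, run each shard independently, then combine the partial results. The compute here is a dummy sum of squares standing in for a training step, and the device pinning is only described in a comment; real code would do it through CUDA or an ML framework.

```python
def run_on_device(device_id, shard):
    # On a real multi-GPU box, each worker process would pin its card first,
    # e.g. by setting CUDA_VISIBLE_DEVICES to str(device_id) before importing
    # the compute framework. Here we just do dummy CPU work.
    return sum(x * x for x in shard)

def data_parallel(batch, n_devices=2):
    # Round-robin the batch into one shard per GPU...
    shards = [batch[i::n_devices] for i in range(n_devices)]
    # ...process each shard independently (in parallel on real hardware)...
    partials = [run_on_device(d, s) for d, s in enumerate(shards)]
    # ...then combine the partial results (the "reduce" step).
    return sum(partials)

print(data_parallel(list(range(10))))  # same answer as a single device: 285
```

No SLI bridge is involved anywhere in that flow; each card just does its own slice of the work.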

ENCRYPTION IS NOT A CRIME


I don't know much about machine learning but I'll try to answer the questions I can.

 

1. It is usually better to go with fewer cores and a higher clock; just don't go below a quad-core. Some applications can only utilize one thread at a time, so the extra clock speed will do you a lot more good than 2 extra cores. However, I do not know much about machine learning, so more cores and a slower clock might be worth it; I would do some research.

 

2. Your GPU will probably be fine; many people SLI graphics cards with consumer boards (Z170), which works just fine. SLI will always have some efficiency loss, however, so don't count on 2x performance, more like 1.6-1.8x.

 

3. If you want to SLI, don't buy a 1060; SLI is not supported on it. Keep in mind that a single powerful GPU is always better than two weaker GPUs, so get the best single GPU you can afford now (you can still SLI later if you want to). Also, wait another couple of days or so to see how AMD's new graphics card performs. It might be worth a buy, and worst case scenario it will at least cause Nvidia prices to drop a little.

Main PC: i5 4590 @ 3.5 GHz ♦ RX 480 Armor OC ♦ 16 GB DDR3 ♦ GA-Z97-HD3 ♦ 120 GB 840 EVO ♦ 120 GB Intel 520 ♦ W10 Home

Scrapyard PC: Xeon X5460 @ 3.8 GHz ♦ HD 7870 ♦ 8 GB DDR2 ♦ GA-P35-DS3L ♦ 80 GB Intel 320 ♦ 160 GB WD Caviar SE ♦ W10 Home


Why must it be an Intel CPU? You should probably wait to see how Zen turns out, especially for a machine learning build.

 

Also, you don't need SLI. SLI is pretty much just a gaming feature.

 

And I'd suggest you don't try to build in that case. It's WAY more of a hassle than it's worth.

"Do as I say, not as I do."

-Because you actually care if it makes sense.


@Dash Lambda 

 

Why Intel?

I really don't have a great reason. I would say it's a mix of things: it's the brand name that I know and trust, I know it will run Windows no problem, and it seems like the easiest processor to start with (other people are using Intel). This decision is based on the fact that this is my first build, and since I am still learning what's "good" and what's "bad", I'll cut a corner on the processor for simplicity.

 

SLI

From some of the research I have done on machine learning boxes, they all have multiple GPUs. While I understand I may not see a 2x performance increase, it will create a bigger computing pool for the algorithms to pull from. The more CUDA cores I have available, the more parallel computing I can do.

 

Why the Case

I am an Apple fanboy; I'll be the first to admit that. I hope that doesn't discourage anyone from helping me out... I really like the design of this case, and want something that stands out, that most people haven't done yet. I think it would be cool to have a modded Mac Pro case. So bottom line, it's an aesthetic choice.


10 minutes ago, SwiftySteve said:

Oh god, a Woodcrest Xeon...

Someone correct me if I'm wrong, but I think Intel's modern ULV i7's outperform those.

"Do as I say, not as I do."

-Because you actually care if it makes sense.


Sooo yeah, those chips are OLD. Plus I wasn't able to confirm that they actually work. From the file I have on the unit, the original machine came into a repair shop for temp issues. The diagnostic lights were showing that CPU A and CPU B were running hotter than what was allowed, basically. So there might actually be a problem with these CPUs. That is why for this build I would just buy an i7.


12 minutes ago, SwiftySteve said:

-snip-

Sorry for the double post, your reply went up while I was writing the other one.

 

Intel's historically been the best choice, but AMD has some pretty big things coming up, and you might be able to get an equivalent to Intel's $1,000 chips for $600-700. I'd really suggest waiting for Zen, just in case it turns out really good.

 

SLI is specifically a technology for rendering graphics. If you notice, most of the machine learning boxes don't have SLI bridges, because multi-GPU is handled differently.
Actually, multi-GPU scales much better for certain compute workloads than it does for games.

 

I don't really see what you see in that case, but if you want to put your time into it, then... Well, it's not really for me to say what's worth it to you.

"Do as I say, not as I do."

-Because you actually care if it makes sense.


Oh, gotcha! I was under the impression that multiple GPUs would allow the NVIDIA ML software (DIGITS) to take advantage of the additional resources and train the model faster, etc.

 

If you look at the NVIDIA Dev Box, they have 4 Titan X's in it, I believe. So I made the assumption that NVIDIA has found a way to make multiple GPUs efficient and powerful when applied to machine learning use cases.

 

I'll do my research for sure. What are your thoughts on M.2 for storage in this machine?


2 hours ago, SwiftySteve said:

-snip-

Just to be clear, I was saying that SLI itself applies only to rendering, but multi-GPU is used for compute.

SLI is a protocol, so it's a specific way in which the GPUs talk to each other, and it's not the only way to utilize multiple GPUs. Machine learning benefits greatly from highly parallel processing, and it usually scales well from throwing more chips at it.

 

As for NVMe, I'd say it's a good idea to use it, but I can't comment for certain on your use case. If moving chunks of data to and from secondary storage is important to the workload, as it often is for machine learning, then there is no real substitute for an NVMe SSD in terms of speed.

 

I'd also like to point out the distinction between M.2 and NVMe. M.2 is the physical connector, and it can support SATA as well as NVMe. Make sure that, when you're looking at M.2 SSDs, you're looking at NVMe (often called more generally "PCIe") - that's the type that's insanely fast.

"Do as I say, not as I do."

-Because you actually care if it makes sense.


In my experience it's better to have a lot of RAM, and it's preferable to have fewer, faster cores than many slow cores. I think there are GPUs meant for machine learning, but they are very expensive.

 

Also, I think it's overkill to use an M.2 drive, as datasets are usually stored in RAM while processing them.

 

Bear in mind that I'm just an actuary whose hobby is learning about machine learning.


@Dash Lambda

Ok, I understand the purpose of SLI now. So let's say I do this build with 1 GPU for now, then 6 months down the road want to add a second. How do I get the system to recognize/use both GPUs at the same time? I thought SLI was the way, but I guess not.

 

Does the system know to use both GPUs just by plugging them in?


@CBojorges

 

I think I like M.2 because of the overall speed. Being able to read and write at high speeds would allow the overall system speed to increase. 

 

I'm wondering if I can have fine-grained control to move datasets from one storage medium to another. So maybe I can have the system download a dataset to secondary storage (the Western Digital hard drive). While that is happening, I can be using the M.2 storage to hold a dataset I am about to use to train a model, then move that dataset to RAM once I am training with it. Does that make sense?

 

So basically it goes:

Download Dataset -> HDD -> Load Dataset to M.2 -> Dataset moved to RAM for training -> Train Model

The reason I want the dataset in secondary storage (HDD) first is because it's high capacity and cheap. That would allow multiple datasets (each being large) to be "warehoused" there. Does that make sense?
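That Download -> HDD -> M.2 -> RAM flow is easy to script by hand; a minimal sketch, with stand-in directories playing the roles of the two drives (the mount points and file names are hypothetical):

```python
import shutil
from pathlib import Path

def stage_dataset(name, hdd_dir, ssd_dir):
    """Copy a warehoused dataset from bulk HDD storage to the fast M.2
    scratch area, then load it into RAM for training."""
    src = Path(hdd_dir) / name
    dst = Path(ssd_dir) / name
    shutil.copy2(src, dst)   # HDD -> M.2: one slow sequential copy, up front
    return dst.read_bytes()  # M.2 -> RAM: fast reads when training starts

# On the real box the directories might be mount points like
# /mnt/hdd/datasets and /mnt/nvme/scratch (names are made up):
# data = stage_dataset("mnist.bin", "/mnt/hdd/datasets", "/mnt/nvme/scratch")
```

The staging copy can overlap with downloading the next dataset to the HDD, which is exactly the pipelining described above.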

