5-VM Build Machine in a Rack

In a video game studio, one of the biggest time wasters is actually "building" the game.

And if you support many platforms (like us: PC, Mac, Linux, PS4, X1, Switch), you need to compile for each one.

Towards the end of our project, building for a single platform will take roughly an hour.

Asking someone to do that on a normal PC isn't an acceptable solution.

 

Bigger studios normally have rows of dedicated computers that detect changes in the versioning depot, pull them, and build them.

 

We are a small studio, so we want to build one dedicated machine that would run 5 VMs (PC, Linux, PS4, X1, Switch).

This means 2 or 3 cores per VM and 1 GPU per VM, with 1 or 2 cores for the bare-metal OS. Each VM will have enough SSD and RAM to build properly.

I'm not too sure yet how the Ethernet ports will be shared... but one problem at a time. Maybe a single Ethernet card with 3-4 ports on it, added on top of the motherboard's own.

Right now, we plan to run Windows Server 2016, with Windows 10 Enterprise in the VMs. (Viva Bizspark for the free software!)

 

Up until now, we were planning on a 2x Xeon setup with 5x slim-form GT 1030s.

But someone pointed out that we would most likely have enough cores using a Threadripper 1950X (16 cores).

 

So the question... Would a Threadripper do the job? Can it virtualize properly and handle all the resource sharing of running 6 OSes?

Right now, they are almost 30% off on Newegg, so it looks tempting.


21 minutes ago, CradleGames said:

This means 2 or 3 cores per VM and 1 GPU per VM, with 1 or 2 cores for the bare-metal OS

That's not really how cores work. You overprovision them so VMs share cores.

 

21 minutes ago, CradleGames said:

I'm not too sure yet how the Ethernet ports will be shared

A virtual switch is created, and the VMs all share the physical NIC through it.
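
As a rough illustration of that (Hyper-V assumed, and the adapter, switch, and VM names here are placeholders), one physical NIC can back a single external virtual switch that every VM connects to:

```python
import subprocess

def ps(command: str) -> None:
    """Run a PowerShell command on the Hyper-V host (requires admin)."""
    subprocess.run(["powershell", "-NoProfile", "-Command", command], check=True)

# One external virtual switch bound to the physical NIC; the host keeps access too.
ps('New-VMSwitch -Name "BuildSwitch" -NetAdapterName "Ethernet" -AllowManagementOS $true')

# Every build VM's network adapter then connects through the same switch.
for vm in ["Build-PC", "Build-Linux", "Build-PS4", "Build-X1", "Build-Switch"]:
    ps(f'Connect-VMNetworkAdapter -VMName "{vm}" -SwitchName "BuildSwitch"')
```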

 

Do you need separate GPUs? What are you using the GPUs for? Video output? CUDA? I'd probably go with a single Tesla and split it in software. They have full support for this on the pro-grade cards, and it shows up as a GPU on each system.

 

I'd really look into something like a Dell R740 with ESXi. It's the more common option and has better options for GPU splitting.

 

 

You have a total budget here?

 

 


2 hours ago, CradleGames said:

This means 2 or 3 cores per VM and 1 GPU per VM

Do you even need the GPU at all? I wouldn't think compiling the code would require one, though that might depend on what you are using and the process involved. Normally a GPU is not required to run a VM, as software rendering is easily up to the task.

 

Also, it's better to allow VMs to share cores. You don't have to give each VM all of the available threads, but in, say, a 16-core server running 5 VMs, giving them each 6 would be a good starting point. Then assess utilization and build times over a few weeks and adjust as needed, i.e. more cores per VM if simultaneous use across the VMs (concurrent builds) is not high.
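
A minimal sketch of that starting allocation, assuming a Hyper-V host and hypothetical VM names (tune the count after watching utilization for a while):

```python
import subprocess

def ps(command: str) -> None:
    """Run a PowerShell command on the Hyper-V host (requires admin)."""
    subprocess.run(["powershell", "-NoProfile", "-Command", command], check=True)

# 5 VMs x 6 vCPUs = 30 vCPUs on a 16-core host: deliberately overprovisioned,
# since the platform builds rarely peg every core at the same moment.
for vm in ["Build-PC", "Build-Linux", "Build-PS4", "Build-X1", "Build-Switch"]:
    ps(f'Set-VMProcessor -VMName "{vm}" -Count 6')  # VM must be powered off
```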

 

2 hours ago, CradleGames said:

I'm not too sure yet how the Ethernet ports will be shared... but one problem at a time. Maybe a single Ethernet card with 3-4 ports on it, added on top of the motherboard's own.

If the build process doesn't use the network, other than to transfer the files to the VM before the build, sharing the NIC between the VMs is fine.

 

2 hours ago, CradleGames said:

So the question... Would a Threadripper do the job?

Yes, it would work, and it's much cheaper than Xeons. In your case, with the way the system is being used, Xeons are not actually a requirement, so I wouldn't pay the premium for them.

 

It may even be better to build 2 systems and spec them to run 3 VMs each; at least that way, if one fails, you can still run builds without interrupting workflow by taking a workstation offline to do the work.

 

Windows Hyper-V Server 2016 is free so it wouldn't add to the OS licensing costs.


5 hours ago, leadeater said:

Do you even need the GPU at all? I wouldn't think compiling the code would require one, though that might depend on what you are using and the process involved. Normally a GPU is not required to run a VM, as software rendering is easily up to the task.

Yes, sadly compiling shaders is still a GPU-only operation. I've checked; I need a dedicated GPU, no matter how slow or useless it is.

 

5 hours ago, leadeater said:

Yes, it would work, and it's much cheaper than Xeons. In your case, with the way the system is being used, Xeons are not actually a requirement, so I wouldn't pay the premium for them.

What is even the point of Xeon then? I was sure virtualization and parallelism were their thing.


7 hours ago, Electronics Wizardy said:

That's not really how cores work. You overprovision them so VMs share cores.

 

A virtual switch is created, and the VMs all share the physical NIC through it.

 

Do you need separate GPUs? What are you using the GPUs for? Video output? CUDA? I'd probably go with a single Tesla and split it in software. They have full support for this on the pro-grade cards, and it shows up as a GPU on each system.

 

I'd really look into something like a Dell R740 with ESXi. It's the more common option and has better options for GPU splitting.

 

You have a total budget here?

The GPUs are for compiling shaders. Sadly, they are not something I can avoid.

 

As for total budget... we don't really have one, except that I should try to keep it as low as possible. Probably in the $2.5-3k range.


2 minutes ago, CradleGames said:

What is even the point of Xeon then? I was sure virtualization and parallelism was their thing.

Xeons allow ECC support if you need long-term reliability for an application; you're only running short tasks, comparatively speaking. They also allow other things like registered RAM support, which means going over 64GB of RAM per CPU, as well as supporting 2 or more CPUs in a single system.

 

What you're doing is more just a high-end workstation dedicated to software builds/compiling, so a lot of what Xeons bring to the table isn't needed. Unless the cost difference is minimal, I wouldn't pay extra for them.

 

A Xeon and a regular desktop CPU are fundamentally the same; an i7/i9 with 10 cores is just as good at virtualization as a 10-core Xeon, up until you need more than 64GB of RAM.


2 minutes ago, leadeater said:

Xeons allow ECC support if you need long-term reliability for an application; you're only running short tasks, comparatively speaking. They also allow other things like registered RAM support, which means going over 64GB of RAM per CPU, as well as supporting 2 or more CPUs in a single system.

 

What you're doing is more just a high-end workstation dedicated to software builds/compiling, so a lot of what Xeons bring to the table isn't needed. Unless the cost difference is minimal, I wouldn't pay extra for them.

 

A Xeon and a regular desktop CPU are fundamentally the same; an i7/i9 with 10 cores is just as good at virtualization as a 10-core Xeon, up until you need more than 64GB of RAM.

And I don't need a lot of RAM, and don't need ECC, just brute force. Got it.

 

I assume the 1950X is at $999 (almost the same price as the 1920X) because the new chips are coming out soon.


8 minutes ago, CradleGames said:

And I don't need a lot of RAM, and don't need ECC, just brute force. Got it.

 

I assume the 1950X is at $999 (almost the same price as the 1920X) because the new chips are coming out soon.

Yep, so Threadripper CPUs should be cheap for now. They may also fit your needs better since they have 64 PCIe lanes, meaning you may have better motherboard options to fit the required GPUs at a decent PCIe lane count. Intel only has 40/44 lanes per CPU depending on generation (high-end platforms only), which means you'd struggle to get 5 GPUs connected at x8.
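
Back-of-the-envelope lane math (the extra devices below are assumptions, not a recommendation for a specific board):

```python
# Rough PCIe lane budget: 5 GPUs at x8 plus a couple of assumed add-in devices.
gpus, lanes_per_gpu = 5, 8
extras = {"NVMe SSD": 4, "quad-port NIC": 4}    # hypothetical extras

needed = gpus * lanes_per_gpu + sum(extras.values())     # 40 + 8 = 48 lanes
for platform, lanes in {"Threadripper": 64, "Intel HEDT (44-lane)": 44}.items():
    verdict = "fits" if lanes >= needed else "comes up short"
    print(f"{platform}: {lanes} CPU lanes vs {needed} needed -> {verdict}")
```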


1 hour ago, CradleGames said:

The GPUs are for compiling shaders. Sadly, they are not something I can avoid.

 

As for total budget... we don't really have one, except that I should try to keep it as low as possible. Probably in the $2.5-3k range.

I'd go Dell R730 then. Get a used one. Still faster than Threadripper, more RAM.

 

Any reason why you want Hyper-V? I'd probably go ESXi here.

 

For the GPU I'd go with a single Tesla, as you can split it in software. That removes the need for so many slots and good IOMMU groups, and GPU passthrough isn't really officially supported on the GTX cards, so it's always a bit hacky.

 

1 hour ago, leadeater said:

Yep, so Threadripper CPUs should be cheap for now. They may also fit your needs better since they have 64 PCIe lanes, meaning you may have better motherboard options to fit the required GPUs at a decent PCIe lane count. Intel only has 40/44 lanes per CPU depending on generation (high-end platforms only), which means you'd struggle to get 5 GPUs connected at x8.

The problem is GPU passthrough can still be a pain on these, so go single GPU and split it with ESXi/Xen.

 

2 hours ago, CradleGames said:

I need a dedicated GPU

What do you need from this GPU?

 

OpenCL? CUDA? You can emulate those in software if you don't need much power.


5 hours ago, Electronics Wizardy said:

The problem is GPU passthrough can still be a pain on these, so go single GPU and split it with ESXi/Xen.

Hyper-V is actually really good about it, unlike ESXi/KVM, and you can even share a GPU between VMs.


2 minutes ago, leadeater said:

Hyper-V is actually really good about it, unlike ESXi/KVM, and you can even share a GPU between VMs.

Have they added vGPU support to Hyper-V? Last time I looked it was only in ESXi and Xen. That way you can have full CUDA support in VMs and it works like any other GPU.


6 minutes ago, Electronics Wizardy said:

Have they added vGPU support to Hyper-V? Last time I looked it was only in ESXi and Xen. That way you can have full CUDA support in VMs and it works like any other GPU.

They've had it for a while, but it was only GPU sharing to start with; they added full PCIe passthrough in Server 2016. The sharing actually works rather well and doesn't need Teslas or anything, just DX11 or DX12 depending on the OS version.


1 minute ago, leadeater said:

They've had it for a while, but it was only GPU sharing to start with; they added full PCIe passthrough in Server 2016. The sharing actually works rather well and doesn't need Teslas or anything, just DX11 or DX12 depending on the OS version.

Yeah, they've had RemoteFX for a while and added GPU passthrough, but do they have Nvidia vGPU with the Tesla and GRID GPUs?


2 hours ago, leadeater said:

Hyper-V is actually really good about it, unlike ESXi/KVM, and you can even share a GPU between VMs.

Ok... You've mentioned "sharing" a few times, but I think it's a bad idea.

When a new build is requested, all VMs do the job for their specific platform simultaneously. If I start sharing (NIC or GPU), then it's only 1 VM at a time, which kinda defeats the purpose. When a VM is using the GPU, what happens to the other VMs that would need it too?

 

Another reason for the existence of the VMs is that the game library is cached per platform. Switching from one platform to another isn't a valid solution, as it takes hours to do that (recompiling all the textures, sounds, and meshes into platform-specific formats).

 

So the steps are...

 

- VMs detect a change on the depot that is flagged to be built.

- Each VM downloads the changes from the depot.

- Each VM compiles the changed cached assets.

- Building is initiated for all platforms. This switches the code base to the selected build type (Debug, Production, Release) and platform-specific preprocessor flags.

- Code is recompiled.

- Shaders are compiled by GPU.

- Assets are bundled.

- Software is packaged.

- VMs take the completed package and push it back to a shared drive for retrieval by users.

- Rebuild if the request was for multiple build types.
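
A rough sketch of that loop as a script each VM could run. The `p4` commands assume a Perforce-style depot, and the `.cmd` wrappers are hypothetical stand-ins for the engine's asset, build, and packaging steps:

```python
import subprocess, time

DEPOT = "//depot/game/..."                 # hypothetical depot path
BUILD_TYPES = ["Debug", "Production", "Release"]
SHARED_DRIVE = r"\\fileserver\builds"      # hypothetical output share

def run(*args: str) -> None:
    """Run a command, failing loudly if it exits non-zero."""
    subprocess.run(list(args), check=True)

def latest_change() -> str:
    # Newest submitted changelist under the depot path ("Change <number> on ...").
    out = subprocess.run(["p4", "changes", "-m1", DEPOT],
                         capture_output=True, text=True, check=True)
    return out.stdout.split()[1]

last_seen = latest_change()
while True:
    change = latest_change()
    if change != last_seen:
        run("p4", "sync", DEPOT)                        # pull the changes
        run("cmd", "/c", "build_assets.cmd")            # recompile changed cached assets (hypothetical script)
        for build_type in BUILD_TYPES:
            # code + shaders + bundling + packaging, driven by the engine's own build scripts
            run("cmd", "/c", "build_game.cmd", build_type)
            run("cmd", "/c", "publish_package.cmd", SHARED_DRIVE, build_type)
        last_seen = change
    time.sleep(60)                                      # poll the depot once a minute
```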

 


1 hour ago, CradleGames said:

Ok... You've mentioned "sharing" a few times, but I think it's a bad idea.

When a new build is requested, all VMs do the job for their specific platform simultaneously. If I start sharing (NIC or GPU), then it's only 1 VM at a time, which kinda defeats the purpose. When a VM is using the GPU, what happens to the other VMs that would need it too?

That's the whole point of VMs: share resources so you can use them better.

 

What are the GPUs needed for anyway? CUDA? OpenCL? Have you tested without a GPU?

 

The way the GPU is shared, each VM sees the GPU as its own; they can all use it at once, and all VMs get a slice of the main GPU with vGPU.

 

 


26 minutes ago, Electronics Wizardy said:

That's the whole point of VMs: share resources so you can use them better.

 

What are the GPUs needed for anyway? CUDA? OpenCL? Have you tested without a GPU?

 

The way the GPU is shared, each VM sees the GPU as its own; they can all use it at once, and all VMs get a slice of the main GPU with vGPU.

 

 

Shader compilation. I said it already. It's a limitation I cannot avoid. 
So if one VM is compiling shaders, another can do it too at the same time?


6 minutes ago, CradleGames said:

Shader compilation. I said it already. It's a limitation I cannot avoid. 
So if one VM is compiling shaders, another can do it too at the same time?

What API are you using for this?

 

Have you tried this in a virtual GPU solution like ESXi or Hyper-V?

 

Yes, a vGPU setup lets you do this on multiple VMs at once with one GPU.


4 minutes ago, Electronics Wizardy said:

What API are you using for this?

 

Have you tried this in a virtual GPU solution like ESXi or Hyper-V?

 

Yes, a vGPU setup lets you do this on multiple VMs at once with one GPU.

No clue about the API. When you use an engine like Unreal or Unity... they kinda do their own thing.


1 minute ago, CradleGames said:

No clue about the API. When you use an engine like Unreal or Unity... they kinda do their own thing.

Well, that would be great to know.

 

Go test this.

 

Fire up a Hyper-V VM on your system (enable it in Windows Features). Install Windows 10 plus your program of choice and see if it will compile the shaders for you, and try it with and without RemoteFX enabled in Hyper-V.
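
A minimal sketch of that test, assuming an elevated prompt on a Windows 10 Pro host; the VM name and disk/ISO paths are placeholders, and the host GPU has to be allowed for RemoteFX in Hyper-V settings first:

```python
import subprocess

def ps(command: str) -> None:
    """Run a PowerShell command as administrator."""
    subprocess.run(["powershell", "-NoProfile", "-Command", command], check=True)

# 1. Enable the Hyper-V feature (a reboot is required afterwards).
ps("Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Hyper-V -All")

# 2. Create a test VM and attach a Windows 10 install ISO (placeholder paths).
ps('New-VM -Name "ShaderTest" -MemoryStartupBytes 8GB '
   '-NewVHDPath "D:\\VMs\\ShaderTest.vhdx" -NewVHDSizeBytes 200GB')
ps('Add-VMDvdDrive -VMName "ShaderTest" -Path "D:\\ISOs\\Win10.iso"')

# 3. Give the VM a RemoteFX 3D adapter so the engine sees a (shared) GPU,
#    then remove it to re-run the same build without one and compare.
ps('Add-VMRemoteFx3dVideoAdapter -VMName "ShaderTest"')
# ps('Remove-VMRemoteFx3dVideoAdapter -VMName "ShaderTest"')   # the no-GPU case
```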


5 hours ago, CradleGames said:

Ok... You've mentioned "sharing" a few times, but I think it's a bad idea.

When a new build is requested, all VMs do the job for their specific platform simultaneously. If I start sharing (NIC or GPU), then it's only 1 VM at a time, which kinda defeats the purpose. When a VM is using the GPU, what happens to the other VMs that would need it too?

 

Another reason for the existence of the VMs is that the game library is cached per platform. Switching from one platform to another isn't a valid solution, as it takes hours to do that (recompiling all the textures, sounds, and meshes into platform-specific formats).

 

So the steps are...

 

- VMs detect a change on the depot that is flagged to be built.

- Each VM downloads the changes from the depot.

- Each VM compiles the changed cached assets.

- Building is initiated for all platforms. This switches the code base to the selected build type (Debug, Production, Release) and platform-specific preprocessor flags.

- Code is recompiled.

- Shaders are compiled by GPU.

- Assets are bundled.

- Software is packaged.

- VMs take the completed package and push it back to a shared drive for retrieval by users.

- Rebuild if the request was for multiple build types.

 

You may find you won't need that many network resources, since the files are downloaded to the VM and network access isn't required for the build after that point. You can use two 1Gb NICs in a team with load balancing, so there is a total of 2Gb for the VMs to use, with a maximum of 1Gb to a single VM.

 

How big are the files? That will be the biggest consideration in how you should design the networking for this, i.e. 1Gb, 2x 1Gb or 10Gb. Where are the files kept? Does that location have more than 1Gb? If not, you'll be limited there and it won't matter what you do on the build server side of things.
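
Back-of-the-envelope transfer times for that decision (the sync sizes are made-up examples; note that a single VM on a 2x 1Gb team still tops out at 1Gb):

```python
# Rough time to pull a depot sync of a given size over each link speed.
sizes_gb = [10, 50, 200]                          # assumed sync sizes
links_gbit = {"1Gb": 1, "2x 1Gb team (aggregate)": 2, "10Gb": 10}

for size in sizes_gb:
    for name, gbit in links_gbit.items():
        minutes = size * 8 / gbit / 60            # GB -> Gbit, then seconds -> minutes
        print(f"{size}GB over {name}: ~{minutes:.0f} min")
```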

 

5 hours ago, CradleGames said:

Ok... You've mentioned "sharing" a few times, but I think it's a bad idea.

I was more just mentioning that it's possible; it may or may not work for you. I've used it before to actually play games on a Hyper-V shared GPU, but that doesn't mean it'll work for you or be the best way to do it.

 

5 cheap dedicated GPUs directly passed through to the VMs may indeed be the best solution, or 1 GTX 1060 6GB might be enough shared between the VMs.

 

This can all be tested if you have a Windows 10 computer you can test on. As @Electronics Wizardy mentioned, just enable the Hyper-V role and allow Hyper-V to use the physical GPU, then add a vGPU to a VM and run a test build.

https://www.altaro.com/hyper-v/remotefx-windows-10-client-hyper-v/

 

My main concern around sharing the GPU and using that Hyper-V feature is the potential for the shaders to not be compiled correctly; this is something you must check even with 5 dedicated GPUs passed through to VMs. GPU passthrough is very nearly like a completely dedicated computer/OS, but not quite, and it's not always perfect.

 

I wouldn't buy anything until you test this with a single VM using the proposed setup, either on a computer you already have or on a much cheaper build with parts that are reusable if it does work.

 

As for sharing the CPU cores, that's extremely common, and applications, even code compiling, rarely use all cores to their full extent all the time. At work we share physical cores to virtual cores at 4:1, so for an ESXi host with 56 cores and an average of 4 vCPUs per VM, that host would have 14 VMs running on it. It is easier and more effective to do resource sharing on a larger system than on a smaller one.


3 hours ago, CradleGames said:

No clue about the API. When you use an engine like Unreal or Unity... they kinda do their own thing.

99% of the time you don't need a GPU to do any compiling. If you compile it on an Nvidia GPU, then you'll need to redo it on every Intel and AMD GPU, and then again for every generation.

 

Just because you're compiling on the CPU doesn't mean the machine code won't tell the system to perform the task on the GPU; your code tells it to run on the GPU, so that's what it'll do. Now, if you're planning on testing the build on the same system, that's a different matter.

 

Also, consider that most GPUs are different. Are you going to purchase one of every GPU that any potential end user is going to run, so you can compile every single shader possible? Why not just let OpenGL/DirectX handle that?


1 minute ago, Blake said:

99% of the time you don't need a GPU to do any compiling. If you compile it on an Nvidia GPU, then you'll need to redo it on every Intel and AMD GPU, and then again for every generation.

 

Just because you're compiling on the CPU doesn't mean the machine code won't tell the system to perform the task on the GPU; your code tells it to run on the GPU, so that's what it'll do. Now, if you're planning on testing the build on the same system, that's a different matter.

 

Also, consider that most GPUs are different. Are you going to purchase one of every GPU that any potential end user is going to run, so you can compile every single shader possible? Why not just let OpenGL/DirectX handle that?

Huh... Shaders are not hardware dependent in that sense. If they are compiled for OpenGL 2.0, they will run on all hardware supporting it, unlike compiling for native Intel or native AMD.

 

Besides, it's not a choice I have. To compile on Unreal or Unity, I need a GPU. It doesn't build otherwise.


7 hours ago, Electronics Wizardy said:

Yeah, they've had RemoteFX for a while and added GPU passthrough, but do they have Nvidia vGPU with the Tesla and GRID GPUs?

They do, yes. I'm not sure how it works, but it's not really needed with Hyper-V, as GPU sharing is an inbuilt feature and Hyper-V can handle it itself. ESXi needs the drivers to expose multiple virtual GPUs, as it's not a native feature. The Nvidia/Microsoft reference design still states RemoteFX is used.

 

http://images.nvidia.com/content/grid/pdf/microsoft-server-solution.pdf


Just now, leadeater said:

They do, yes. I'm not sure how it works, but it's not really needed with Hyper-V, as GPU sharing is an inbuilt feature and Hyper-V can handle it itself. ESXi needs the drivers to expose multiple virtual GPUs, as it's not a native feature. The Nvidia/Microsoft reference design still states RemoteFX is used.

 

http://images.nvidia.com/content/grid/pdf/microsoft-server-solution.pdf

That design uses RemoteFX. The problem with RemoteFX is you can't use things like CUDA and OpenCL, which vGPU lets you use.


3 minutes ago, Electronics Wizardy said:

That design uses RemoteFX. The problem with RemoteFX is you can't use things like CUDA and OpenCL, which vGPU lets you use.

That's why these are all targeted at VDI use cases for general higher end workstation replacements.

 

Quote

Leverage NVIDIA GRID to accelerate virtual desktops and apps running on Microsoft Hyper-V, supporting shared GPU acceleration using a Hyper-V-based guest graphics driver. For higher performance, Windows Server 2016 supports NVIDIA GPU pass-through with Discrete Device Assignment (DDA).

For that setup again you'd have to spawn multiple instances of the GPU and do full passthrough rather than Hyper-V shared GPU.
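
For reference, DDA passthrough of one GPU to one VM on Server 2016 looks roughly like this (the location path, VM name, and MMIO sizes are placeholder values):

```python
import subprocess

def ps(command: str) -> None:
    """Run a PowerShell command on the Server 2016 host (requires admin)."""
    subprocess.run(["powershell", "-NoProfile", "-Command", command], check=True)

VM = "Build-PS4"                               # hypothetical VM name
GPU = "PCIROOT(0)#PCI(0100)#PCI(0000)"         # placeholder location path; find the real one via Get-PnpDevice

# Reserve MMIO space for the GPU in the guest (example sizes), then detach
# the device from the host and assign it to the VM.
ps(f'Set-VM -VMName "{VM}" -HighMemoryMappedIoSpace 4GB -LowMemoryMappedIoSpace 512MB')
ps(f'Dismount-VMHostAssignableDevice -LocationPath "{GPU}" -Force')
ps(f'Add-VMAssignableDevice -VMName "{VM}" -LocationPath "{GPU}"')
```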

