
Nvidia GPU Virtualization Hacked For GeForce Cards

Uttamattamakin

Abstract

Nvidia limits using one GPU as one or more virtual GPUs to its enterprise-level products.  The ability exists in the silicon for GeForce cards, but it is not enabled in the drivers, much as GeForce drivers used to throw error 43 when a card was passed through to a VM.  However, we are not talking about passthrough here but something closer to SR-IOV: it allows one GPU to run two separate operating systems at once by using hardware in the GPU to virtualize a GPU.  That said, logically there will be some limits to this from the nature of virtualization.  This is a hack, so one must be comfortable with downloading code from GitHub, patching kernel modules, and so on.  That said, anyone who really needs this ability should be able to do it.
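For the technically curious: on a Linux/KVM host the virtual GPUs created this way are exposed through the kernel's mediated device (mdev) framework. As a rough illustration only (this is not the project's own tooling; it assumes the NVIDIA vGPU manager driver is installed and the unlock hack applied, and the PCI address is an example), one could list the vGPU profiles a host GPU offers like so:

```python
# Rough sketch: enumerate the vGPU (mediated device) types a host GPU exposes.
# Assumes the NVIDIA vGPU manager driver is loaded and the unlock hack applied;
# the PCI address below is an example and will differ per system.
from pathlib import Path

gpu = Path("/sys/class/mdev_bus/0000:01:00.0/mdev_supported_types")
for mdev_type in sorted(gpu.iterdir()):
    name = (mdev_type / "name").read_text().strip()
    avail = (mdev_type / "available_instances").read_text().strip()
    print(f"{mdev_type.name}: {name} (available instances: {avail})")
```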

 

Quotes

Quote

GPU virtualization, which allows more than one user to use a GPU simultaneously, is one of the differentiators between GPUs for data centers and those designed for consumer PCs. Nowadays, many workstations and even high-end desktops are located remotely so the users can share the GPUs. Modern hardware is so powerful that its performance is sometimes excessive for one user, so sharing one graphics card between multiple users makes sense.  --Anton Shilov,  Toms Hardware 

 

My thoughts

My first response is hallelujah! Now if I can get my hands on a compatible GPU, I can accomplish all of my work and game on my desktop computer.  I could run CUDA code in Mathematica, MATLAB, or Python to model various theoretical physics situations, and kill time by gaming.  I could run Windows to use Microsoft Office for presenting PowerPoint content to my class in the best possible way, then capture that window and send it out over Zoom or Blackboard Collaborate Ultra.  Not to mention using Windows- and Linux-based tools at the same time for all purposes.  And finally, keep everything stored redundantly on my RAID array, backed up both to an online NAS and via external/removable drives.
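As a quick sanity check (a minimal sketch, assuming the numba package is installed in whichever OS instance gets the virtual GPU), one can confirm from Python that CUDA is actually visible:

```python
# Minimal check that this OS instance actually sees a CUDA-capable device.
# Assumes the numba package is installed; any CUDA-aware library would do.
from numba import cuda

if cuda.is_available():
    dev = cuda.get_current_device()
    print(f"CUDA device visible: {dev.name.decode()} "
          f"(compute capability {dev.compute_capability})")
else:
    print("No CUDA device visible to this OS instance.")
```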

 

There are a few problems, though.  Number one is the fact that, to use this, one needs the following:

 

Quote

This script will only work if there exists a vGPU compatible Tesla GPU that uses the same physical chip as the actual GPU being used. --DualCoder, Github

and

Quote

For now, GP102, GP104, TU102, TU104, and GA102 GPUs are supported, and the capability works on Linux and with KVM virtual machine software.  -- Anton Shilov, Toms Hardware

In terms of GPUs one would buy, that means, according to VideoCardz:

Any* 10-series card, 1060 or better.

Any* 20-series card, 2070 Super or better.

Any 30-series card, 3080 or better.

 

*Any such card with enough VRAM for this to make sense and enough horsepower to be more than a tech demo.  For example, right now one can run CUDA code in WSL 2 using Ubuntu for WSL 2 as a beta.  However, the performance is not great compared to bare metal, not even half.
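To make the "same physical chip" requirement above concrete, here is a small illustrative lookup (not exhaustive, and the pairings should be double-checked against the vgpu_unlock README) of consumer cards, their chips, and a datacenter card built on the same silicon:

```python
# Illustrative mapping only: which datacenter card shares silicon with which
# consumer card. Not exhaustive; check the vgpu_unlock README for the real list.
CHIP_TO_DATACENTER = {
    "GP102": "Tesla P40",
    "GP104": "Tesla P4",
    "TU102": "Quadro RTX 6000/8000",
    "TU104": "Tesla T4",
    "GA102": "NVIDIA A40",
}

CONSUMER_CARD_TO_CHIP = {
    "GTX 1080 Ti":    "GP102",
    "GTX 1080":       "GP104",
    "RTX 2080 Ti":    "TU102",
    "RTX 2070 Super": "TU104",
    "RTX 3090":       "GA102",
}

def vgpu_counterpart(card: str) -> str:
    chip = CONSUMER_CARD_TO_CHIP.get(card)
    if chip in CHIP_TO_DATACENTER:
        return f"{card} ({chip}) can masquerade as a {CHIP_TO_DATACENTER[chip]}"
    return f"{card}: no vGPU-capable sibling known to this sketch"

print(vgpu_counterpart("RTX 2070 Super"))  # same TU104 silicon as a Tesla T4
```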

 

If one does not have two GPUs (or an iGPU or APU) anyway, then using a single 1080 Ti both as the display out for Linux and to virtualize a GPU for Windows would be very limited.  Basically, using vGPU does not unlock more power.  Instead it would reduce the power available to each side by at least half: half of the VRAM, half of the CUDA cores, half of everything.  So a GTX 1080 Ti, or even a 1080, that can hold its own in games in terms of rasterization performance would be as weak as a 980.
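A back-of-the-envelope version of that "half of everything" point, using public card specs (and this post's simplified assumption of an even two-way split, which is not how NVIDIA's scheduler literally behaves):

```python
# Back-of-the-envelope version of the "half of everything" framing above.
# Spec numbers are public card specs; the even two-way split mirrors this
# post's simplification, not NVIDIA's actual scheduling behaviour.
CARDS = {
    "GTX 1080 Ti": {"cuda_cores": 3584, "vram_gb": 11},
    "GTX 980":     {"cuda_cores": 2048, "vram_gb": 4},
}

half = {spec: value / 2 for spec, value in CARDS["GTX 1080 Ti"].items()}
print(f"Half a 1080 Ti : ~{half['cuda_cores']:.0f} cores, {half['vram_gb']:.1f} GB VRAM")
print(f"A full GTX 980 : {CARDS['GTX 980']['cuda_cores']} cores, "
      f"{CARDS['GTX 980']['vram_gb']} GB VRAM")
```

Half a 1080 Ti lands at roughly 1792 cores and 5.5 GB, which is roughly where the "as weak as a 980" comparison comes from.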

 

For a card that is less than a 1080 this might not even be very functional for gaming.  Perhaps using it to virtualize Windows just to run office apps would work.  It would be like having a GT-level card for your VM.

 

All of that said, while I can't risk it on my working computer during the school term, I may try this out during the summer. If so, this would get me to computing nirvana, since I have (and will have) a Ryzen APU to run my Linux desktop.  Thus the overall effect of virtualizing a Windows instance on my 1080 might be no different than running a game on it while also running a computation.  The only other way I could do this would be to build a new computer with enough expansion slots for two dGPUs in addition to my APU.  So this will save me from a lot of headache.  Hallelujah, hallelujah, hallleeeeluuujah!

However, this would not be for everyone.

 

 

Sources

 Nvidia's Virtualization Unlocked On Gaming GPUs via Hack | Tom's Hardware

GitHub - DualCoder/vgpu_unlock: Unlock vGPU functionality for consumer grade GPUs. 

Getting started with CUDA on Ubuntu on WSL 2 | Ubuntu


This is absolutely awesome!

I definitely can't use this with the current performance limitations, but maybe we will see some improvements, especially now that it's at least possible to virtualize it.


42 minutes ago, WereCat said:

This is absolutely awesome!

I definitely can't use this with the current performance limitations, but maybe we will see some improvements, especially now that it's at least possible to virtualize it.

This more or less disproves the idea that it is "not possible" with a GeForce GPU.  The question is whether the inevitable lower performance is worth it.  I would consider it only with a 1080 or better, since a 980 level of performance is respectable for killing time while something else is rendering (or whatever) in Linux.  Once one has a 3080, getting a 2070 or 1080 Ti level of performance for Windows, but with Tensor cores... so yeah.

What Nvidia should do is make an official version of this and allow people to virtualize ONE card, all legal, licensed, and supported.  At least on their top-tier cards, which would have the kick to make it really worthwhile: 1070+, 2070+, and 3070+ or some such.


I get to slam a 980 into my VM? Hell yah let's goo ooooooooo! 😄

 

This sounds unreal, I am incredibly happy, this solves so many problems ❤️

 

I spent $2500 on building my PC and all I do with it is play no games atm & watch anime at 1080p (finally), watch YT, and write essays...  nothing, it just sits there collecting dust...

Builds:

The Toaster Project! Northern Bee!

 

The original LAN PC build log! (Old, dead and replaced by The Toaster Project & 5.0)

Spoiler

"Here is some advice that might have gotten lost somewhere along the way in your life. 

 

#1. Treat others as you would like to be treated.

#2. It's best to keep your mouth shut; and appear to be stupid, rather than open it and remove all doubt.

#3. There is nothing "wrong" with being wrong. Learning from a mistake can be more valuable than not making one in the first place.

 

Follow these simple rules in life, and I promise you, things magically get easier. " - MageTank 31-10-2016

 

 


2 hours ago, Uttamattamakin said:

For now, GP102, GP104, TU102, TU104, and GA102 GPUs are supported, and the capability works on Linux and with KVM virtual machine software.  -- Anton Shilov, Toms Hardware

Crying in TU116

A PC Enthusiast since 2011
AMD Ryzen 7 5700X@4.65GHz | GIGABYTE GTX 1660 GAMING OC @ Core 2085MHz Memory 5000MHz
Cinebench R23: 15669cb | Unigine Superposition 1080p Extreme: 3566

1 hour ago, Vishera said:

Crying in TU116

Yeah... it makes no sense, since it has got to be a pretty powerful GPU. I mean, theoretically even a 1060 might be able to do it.


I understand binning, and some chips having features fused off... but that was an interesting thing to cut.


please virtualize everything, put them in containers and block the canal.

Here we go boiisss.

 

But could we have a hacked GeForce program, so we don't need to sign up anymore? That would be great, thanks.


I wonder if this would work with a 650 Ti in the system: it for the display, the GTX 1070 entirely for virtualization.

"We also blind small animals with cosmetics.
We do not sell cosmetics. We just blind animals."

 

"Please don't mistake us for Equifax. Those fuckers are evil"

 

This PSA brought to you by Equifacks.
PMSL


Holy cow, that's amazing!

 

Sadly it doesn't seem to support my 2060S (TU106), but I'll try adding my GPU ID to the script anyway and see how that fares. This would finally allow me to properly use Fusion 360 without having to mess with Wine or having subpar performance in a VM.

FX6300 @ 4.2GHz | Gigabyte GA-78LMT-USB3 R2 | Hyper 212x | 3x 8GB + 1x 4GB @ 1600MHz | Gigabyte 2060 Super | Corsair CX650M | LG 43UK6520PSA
ASUS X550LN | i5 4210u | 12GB
Lenovo N23 Yoga


wait what, I've been able to do gpu passthrough with my 5700xt since the day I got it? and geforce gpus are just now getting it?  

 

oh, nvm, should've read the article better.

AMD blackout rig

 

cpu: ryzen 5 3600 @4.4ghz @1.35v

gpu: rx5700xt 2200mhz

ram: vengeance lpx c15 3200mhz

mobo: gigabyte b550 auros pro 

psu: cooler master mwe 650w

case: masterbox mbx520

fans:Noctua industrial 3000rpm x6

 

 


32 minutes ago, James Evens said:

2. 2060 is supported while 2060 super not

Keep in mind that it's the TU104-based 2060 that works (the one usually called 2060 KO), not the regular TU106 2060.

 

Basically, it's faking the PCI ID of the GPU as that of a Tesla/Grid GPU that uses the same underlying chip.
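For anyone who wants to see what is being spoofed, the relevant IDs are just the standard PCI vendor/device IDs in sysfs (a sketch; the PCI address below is an example and will differ per system):

```python
# Read the PCI vendor/device IDs that the vGPU software keys off.
# The PCI address is an example; find yours with `lspci | grep -i nvidia`.
from pathlib import Path

dev = Path("/sys/bus/pci/devices/0000:01:00.0")
vendor = (dev / "vendor").read_text().strip()   # 0x10de for NVIDIA
device = (dev / "device").read_text().strip()   # chip-specific ID the hack spoofs
print(f"vendor={vendor} device={device}")
```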



I want to see if you could effectively run a local-network version of GRID, or the equivalent, using a single high-end GeForce RTX GPU (or a couple of them) to give enough gaming performance to a bunch of computers at once, at respectable framerates. Basically a thin-client LAN setup of sorts, or even seeing if running breakout displays from the single card to multiple users is possible. I don't know if that level of segmentation for the display outputs is possible, but 2-4 gamers on 1 CPU AND 1 GPU would be pretty damn awesome!


2 hours ago, Letgomyleghoe said:

wait what, I've been able to do gpu passthrough with my 5700xt since the day I got it? and geforce gpus are just now getting it?  

 

oh, nvm, should've read the article better.

Yeah.  Lots of people confuse these two things.  Passthrough is great, and in a world where GPUs were obtainable it was enough for consumers.  With GPUs becoming not just unobtainium but darn near applied phlebotinum, harder to obtain than a warp drive, this should be part of the deal.  A high-end GPU, 70/80 level or better, should just get this.

35 minutes ago, Jamtea said:

I want to see if you could effectively run a local-network version of GRID, or the equivalent, using a single high-end GeForce RTX GPU (or a couple of them) to give enough gaming performance to a bunch of computers at once, at respectable framerates. Basically a thin-client LAN setup of sorts, or even seeing if running breakout displays from the single card to multiple users is possible. I don't know if that level of segmentation for the display outputs is possible, but 2-4 gamers on 1 CPU AND 1 GPU would be pretty damn awesome!

Basically, to set up a sort of... gaming parlor with the lowest possible latency for computer gaming.  I could see that being useful to, say, team up in certain online multiplayer games; GTA, for example.

 

OR, for an updated version of this one video.

Where, like, we pretend that Linus can't just get an effing pallet of Quadro GPUs to do this properly and have just one boss-level computer running everything in his house.


I think that they really need to revisit this concept now, with this hack. I guess the only thing truly holding it back from practicality is the lack of 120-144 Hz thin clients, but surely even something as simple as an SFF PC with a GTX 1650 would be able to get that connection working as a fat client to a TV or remote gaming station. Obviously G-Sync would probably be out of the question, but there's no good reason I can think of as to why this shouldn't work in any instance where remote 60 FPS gaming has worked well.


9 minutes ago, Jamtea said:

I think that they really need to revisit this concept now, with this hack. I guess the only thing truly holding it back from practicality is the lack of 120-144 Hz thin clients, but surely even something as simple as an SFF PC with a GTX 1650 would be able to get that connection working as a fat client to a TV or remote gaming station. Obviously G-Sync would probably be out of the question, but there's no good reason I can think of as to why this shouldn't work in any instance where remote 60 FPS gaming has worked well.

I couldn't properly understand what you meant, but the vGPUs have frame limiting enabled by default:

Quote

When enabled, the frame-rate limiter (FRL) limits the maximum frame rate in frames per second (FPS) for a vGPU as follows:

  • For B-series vGPUs, the maximum frame rate is 45 FPS.
  • For Q-series, C-series, and A-series vGPUs, the maximum frame rate is 60 FPS.

Source



8 hours ago, James Evens said:

NVidia Grid which requires a pricey licences?

Yes.

 

https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/solutions/resources/documents1/Virtual-GPU-Packaging-and-Licensing-Guide.pdf

https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/partners/nvidia/vmware-nvidia-grid-vgpu-faq.pdf

 

So as far as I understand this only lowers the hardware cost?

 

But also:

Quote

Since the hack does not work with Windows and Vmware, it is useless for most users. 

Sooo.....

 

[Graphic: support matrix showing only the Linux KVM use case is supported]


7 hours ago, igormp said:

I couldn't properly understand what you meant, but the vGPUs have frame limiting enabled by default:

Source

Yep, turning that off should result in best-effort performance. IMO that should result in much higher frame rates being available to streaming devices, so long as they can display them, of course.
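If I remember NVIDIA's vGPU documentation correctly, the FRL is a per-vGPU plugin parameter on the mediated device, so disabling it on a KVM host would look something like the sketch below; treat the exact sysfs path as an assumption, and the UUID is only a placeholder:

```python
# Unverified sketch: disable the frame-rate limiter for one vGPU instance.
# The sysfs path follows my reading of NVIDIA's vGPU docs for KVM, and the
# UUID is a placeholder; substitute your actual mdev device UUID.
from pathlib import Path

vgpu_uuid = "aa618089-8b16-4d01-a136-25a0f3c73123"  # placeholder UUID
params = Path(f"/sys/bus/mdev/devices/{vgpu_uuid}/nvidia/vgpu_params")
params.write_text("frame_rate_limiter=0\n")
```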


10 hours ago, leadeater said:

 

Reading the repo, it is not clear to me if they have completely hacked NVIDIA GRID with this.  As you say, it may be that they have simply fooled NVIDIA's software into seeing a GPU that is not on the supported list as one that is.

 

10 hours ago, leadeater said:

 

But also:

Sooo.....

 

 Graphic showing only the KVM use case is supported

As this is a Linux driver hack, that would appear to be the case.  Most of the community that would benefit from this are people who need or want to use Windows and Linux simultaneously on the same computer but who can't get two GPUs for whatever reason.

 

Nvidia is worryingly opaque on just how much a license costs.  Looking at this document, it seems a gamer would need their Virtual Workstation product to have access to CUDA and OpenCL and a full-fat experience that could game: https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/solutions/resources/documents1/Virtual-GPU-Packaging-and-Licensing-Guide.pdf

 

The cost of that license is also like buying a 3060 Ti at MSRP.  Plus they seem to want an ongoing support subscription just to use the darn thing.

[Screenshot: NVIDIA's vGPU license pricing]

 

I could see paying a $100 annual subscription or a $250 perpetual license for a gamer version of this.  Paying their actual prices makes total sense if the GPU and vGPU are going to be used, say, to provide desktops for your engineering staff hosted in your data center, where the staff are going to work on that new smartphone or something that will make back a million times that $450.


57 minutes ago, Uttamattamakin said:

Reading the repo, it is not clear to me if they have completely hacked NVIDIA GRID with this.  As you say, it may be that they have simply fooled NVIDIA's software into seeing a GPU that is not on the supported list as one that is.

Yes, it's just a PCI ID spoof, mostly. There's a discord server discussing the development which can be found in a random issue on the repo.



5 hours ago, igormp said:

Yes, it's just a PCI ID spoof, mostly. There's a discord server discussing the development which can be found in a random issue on the repo.

That clears up a lot. 

 

If NVIDIA got some money out of this, they shouldn't have a problem with it.  I just wish they would give us a reasonably priced and supported option.


45 minutes ago, Uttamattamakin said:

I just wish they would give us a reasonably priced and supported option.

There is, sorta: just be an academic institution and get academic pricing 😀

 

[Screenshot: academic pricing for NVIDIA vGPU licenses]


It does kinda suck that it halves your performance. So it's not like I can have 90% of the GPU for a VM and only 10% for the host?

"If a Lobster is a fish because it moves by jumping, then a kangaroo is a bird" - Admiral Paulo de Castro Moreira da Silva

"There is nothing more difficult than fixing something that isn't all the way broken yet." - Author Unknown

Spoiler

Intel Core i7-3960X @ 4.6 GHz - Asus P9X79WS/IPMI - 12GB DDR3-1600 quad-channel - EVGA GTX 1080ti SC - Fractal Design Define R5 - 500GB Crucial MX200 - NH-D15 - Logitech G710+ - Mionix Naos 7000 - Sennheiser PC350 w/Topping VX-1


19 minutes ago, bcredeur97 said:

It does kinda suck that it halves your performance. So it's not like I can have 90% of the GPU for a VM and only 10% for the host?

You have different profiles, so you can pick any one. For my GPU (2060 Super spoofed as a T4), I have the following ones:

[Screenshot: list of available T4 vGPU profiles]

 

I left my VM with a T4-2Q profile, meaning 2 GB of VRAM, a 60 FPS cap, and 8K max resolution.
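For context, a real T4 has 16 GB of framebuffer and the Q-series profiles simply carve that up, which is why a 2Q profile means 2 GB per VM. A rough sketch of the arithmetic (profile names and sizes follow NVIDIA's usual scheme; check their docs for the authoritative table):

```python
# Rough illustration of how Q-series profiles partition a (spoofed) T4's 16 GB.
# Profile names and sizes follow NVIDIA's usual scheme; verify against their docs.
T4_FRAMEBUFFER_GB = 16
Q_PROFILE_VRAM_GB = {"T4-1Q": 1, "T4-2Q": 2, "T4-4Q": 4, "T4-8Q": 8, "T4-16Q": 16}

for profile, gb in Q_PROFILE_VRAM_GB.items():
    print(f"{profile}: {gb} GB per VM -> up to {T4_FRAMEBUFFER_GB // gb} VMs per card")
```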


