Just found this article and video about installing Crysis 3 not on a RAM disk but on a VRAM disk, which is many times faster than RAM, just as RAM is many times faster than even the fastest SSD.

 

What do you all think about this?

 

https://gamingph.com/2020/10/you-can-actually-install-crysis-3-directly-on-rtx-3090-vram-as-storage/

 

 

https://linustechtips.com/topic/1585426-installing-games-on-vram/

VRAM is fast, but only if the data is consumed locally on the GPU without ever needing to leave it. Anything else has to make a trip to the CPU and is limited by the PCIe bus, which is generally slower than system ram.

Gaming system: R7 7800X3D, Asus ROG Strix B650E-F Gaming Wifi, Thermalright Phantom Spirit 120 SE ARGB, Corsair Vengeance 2x 32GB 6000C30, MSI Ventus 3x OC RTX 5070 Ti, MSI MPG A850G, Fractal Design North, Samsung 990 Pro 2TB, Alienware AW3225QF (32" 240 Hz OLED)
Productivity system: i9-7980XE, Asus X299 TUF mark 2, Noctua D15, 64GB ram (mixed), RTX 4070 FE, NZXT E850, GameMax Abyss, Samsung 980 Pro 2TB, iiyama ProLite XU2793QSU-B6 (27" 1440p 100 Hz)
Gaming laptop: Lenovo Legion 5, 5800H, RTX 3070, Kingston DDR4 3200C22 2x16GB 2Rx8, Kingston Fury Renegade 1TB + Crucial P1 1TB SSD, 165 Hz IPS 1080p G-Sync Compatible


3 hours ago, porina said:

VRAM is fast, but only if the data is consumed locally on the GPU without ever needing to leave it. Anything else has to make a trip to the CPU and is limited by the PCIe bus, which is generally slower than system ram.

That reminds me of a long-standing question I don't think I've ever found an answer to. Does this mean that integrated graphics have much faster access to system RAM than discrete GPUs? Is that exactly why a discrete GPU's performance is absolutely crippled when it runs out of video memory, whereas integrated GPUs run fine out of system memory anyway, and might sometimes even perform better than a discrete GPU that is overflowing its VRAM?

PLEASE MARK COMMENTS AS SOLUTION IF SATISFIED!!

bigger number better, makes me look cooler.


1 hour ago, Haswellx86 said:

Does this mean that integrated graphics have much faster access to system RAM than discrete GPUs? Is that exactly why a discrete GPU's performance is absolutely crippled when it runs out of video memory, whereas integrated GPUs run fine out of system memory anyway, and might sometimes even perform better than a discrete GPU that is overflowing its VRAM?

Specifically talking about desktop socketed CPUs with iGPUs, the iGPU bandwidth is basically the system ram bandwidth. If the dGPU wants to get to system ram, it goes via PCIe. Depending on the system either PCIe or ram could be the ultimate limit. PCIe 4.0 x16 unidirectional bandwidth is about 32GB/s, or about equal to single channel 4000 MT/s ram. So most likely PCIe will limit.
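As a rough sanity check of those two figures, here is a back-of-the-envelope sketch in Python. The 16 GT/s per-lane rate and 128b/130b encoding for PCIe 4.0, and the 64-bit (8-byte) width of a DDR channel, are standard figures that are not stated in the post; the function names are made up for illustration.

```python
def pcie_gb_per_s(gt_per_s: float, lanes: int, encoding: float = 128 / 130) -> float:
    """Theoretical one-direction PCIe bandwidth in GB/s (assumes 128b/130b encoding)."""
    # Each transfer carries one bit per lane; divide by 8 to get bytes.
    return gt_per_s * encoding * lanes / 8


def ddr_gb_per_s(mt_per_s: float, channels: int = 1, bus_bytes: int = 8) -> float:
    """Theoretical DDR bandwidth in GB/s (assumes a 64-bit bus per channel)."""
    return mt_per_s * bus_bytes * channels / 1000


pcie4_x16 = pcie_gb_per_s(gt_per_s=16, lanes=16)
ddr_4000_single = ddr_gb_per_s(mt_per_s=4000, channels=1)

print(f"PCIe 4.0 x16, one direction: {pcie4_x16:.1f} GB/s")
print(f"Single channel 4000 MT/s:    {ddr_4000_single:.1f} GB/s")
```

Both come out at roughly 32 GB/s, so any system ram faster than single channel 4000 MT/s leaves PCIe as the narrower pipe, at least on paper.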

 

Even if the PCIe limit wasn't there at all, system ram is still very much slower than dGPU VRAM. Dual channel 6400 MT/s ram would give you ~100GB/s. Taking an older mainstream GPU as an example, the 3060 has a nominal bandwidth of 360GB/s. That's still a pretty big gap.
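To put numbers on that gap, here is a small Python sketch of the same nominal arithmetic. The 192-bit bus and 15 Gbps per pin are the commonly listed specs for the 12GB RTX 3060 and are an assumption here, since the post only gives the 360GB/s total; the function names are hypothetical.

```python
def ddr_gb_per_s(mt_per_s: float, channels: int, bus_bytes: int = 8) -> float:
    """Nominal DDR bandwidth in GB/s (assumes a 64-bit bus per channel)."""
    return mt_per_s * bus_bytes * channels / 1000


def gddr_gb_per_s(bus_bits: int, gbps_per_pin: float) -> float:
    """Nominal GDDR bandwidth in GB/s: bus width times per-pin data rate."""
    return bus_bits * gbps_per_pin / 8


system_ram = ddr_gb_per_s(mt_per_s=6400, channels=2)
rtx_3060 = gddr_gb_per_s(bus_bits=192, gbps_per_pin=15)  # assumed 12GB model specs
ratio = rtx_3060 / system_ram

print(f"Dual channel 6400 MT/s: {system_ram:.1f} GB/s")
print(f"RTX 3060 VRAM:          {rtx_3060:.1f} GB/s")
print(f"VRAM / system ram:      {ratio:.1f}x")
```

These are nominal peak figures on both sides, so the ratio of about 3.5x is a like-for-like comparison rather than a measurement.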

 

This is in part why iGPUs don't really get fast. They're constrained by the ability of system ram to feed them. In the mobile space, LPDDR goes a bit faster than regular DDR, so that helps a bit. However, there's another class of iGPU: those integrated closely with CPUs that have much faster memory systems. This includes the Apple M series, the chips in current gen consoles, and the upcoming AMD Strix Halo. The latter is rumoured/leaked to have the equivalent of quad channel 8533 MT/s ram, giving potential raw bandwidth that enters the mainstream dGPU space (~273GB/s). Don't expect it to be socketed, cheap or low power.



5 minutes ago, porina said:

Dual channel 6400 MT/s ram would give you ~100GB/s

That is only pure raw bandwidth. The 14900K itself has a max real bandwidth of ~90 GB/s.

 

7 minutes ago, porina said:

So most likely PCIe will limit.

Doesn't it also have to go through the CPU's memory controller? That, plus other overheads.

 

You didn't state it clearly, but is that why a discrete GPU's performance is absolutely crippled when its memory overflows, whereas an iGPU might actually perform better because it can access system memory much faster? The memory bus is much faster than PCIe, I know that. Latency also matters.



26 minutes ago, Haswellx86 said:

That is only pure raw bandwidth. The 14900K itself has a max real bandwidth of ~90 GB/s.

Do you want to go and measure the practical bandwidth of VRAM too? It might not be exact, but I'm comparing like for like.

 

26 minutes ago, Haswellx86 said:

You didn't state it clearly, but is that why a discrete GPU's performance is absolutely crippled when its memory overflows, whereas an iGPU might actually perform better because it can access system memory much faster? The memory bus is much faster than PCIe, I know that. Latency also matters.

I don't know if a socketed iGPU could be faster in that specific scenario. I wonder if anyone has actually tested that. The practical answer is that it doesn't matter, because you'd have to use unrealistic settings to demonstrate the effect. Basically, the iGPU would be so slow anyway that it still wouldn't be usable even if it were faster than the dGPU.

 

Latency doesn't matter much. When you're moving GBs of data around, most likely sequentially, a few ns of access time difference here or there is relatively insignificant compared to bandwidth effects. GPU ram technologies like GDDR and HBM are supposed to have higher latency than regular DDR, although I haven't looked at the actual numbers for that.
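A toy model shows why, assuming a single large streaming transfer that pays the access latency once up front. The 100 ns and 500 ns latencies and the 1 GB size are illustrative values, not measurements, and `transfer_time_s` is a made-up helper.

```python
def transfer_time_s(size_bytes: float, bandwidth_gb_s: float, latency_ns: float) -> float:
    """Time for one sequential transfer: fixed latency plus size / bandwidth."""
    return latency_ns * 1e-9 + size_bytes / (bandwidth_gb_s * 1e9)


size = 1e9  # 1 GB moved in one go (illustrative)

baseline = transfer_time_s(size, bandwidth_gb_s=32, latency_ns=100)
higher_latency = transfer_time_s(size, bandwidth_gb_s=32, latency_ns=500)
half_bandwidth = transfer_time_s(size, bandwidth_gb_s=16, latency_ns=100)

print(f"baseline:          {baseline * 1e3:.4f} ms")
print(f"5x the latency:    {higher_latency * 1e3:.4f} ms")
print(f"half the bandwidth: {half_bandwidth * 1e3:.4f} ms")
```

Under these assumptions, multiplying the latency by five changes the total by well under a microsecond, while halving the bandwidth roughly doubles it. Workloads made of many small, dependent accesses would behave differently.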

