Run Games on VRAM

Found this guy on Twitter running Crysis 3 on his GPU's VRAM.
It actually turned out playable from what I can see and read.
Source of that

I wanted to ask, since I don't have enough VRAM myself, whether someone could test other games with it.
The setup instructions and the program itself can be found on the maker's GitHub.

----------------------------------------------------------------------------------
Some rando girl that like programming and graphics design

i also read a lot of books and like anime
----------------------------------------------------------------------------------


6 minutes ago, Azariel-chan said:

Found this guy on Twitter running Crysis 3 on his GPU's VRAM.

From the GitHub page: "Using GPU RAM isn't as fast as host main memory, however it is still faster than a regular HDD."

 

This about matches what I expected, since the system probably has to load data from VRAM into RAM, to load it back into VRAM when loading the game. The OS has no way of knowing the required data is already coming from there, so it simply treats it like a regular disk. Meaning you're better off using a normal RAM disk.

Remember to either quote or @mention others, so they are notified of your reply


23 minutes ago, Eigenvektor said:

From the GitHub page: "Using GPU RAM isn't as fast as host main memory, however it is still faster than a regular HDD."

 

This about matches what I expected, since the system probably has to load data from VRAM into RAM, to load it back into VRAM when loading the game. The OS has no way of knowing the required data is already coming from there, so it simply treats it like a regular disk. Meaning you're better off using a normal RAM disk.

It may be faster than an HDD, since it's RAM, but from what I've heard and seen it isn't as fast as a standard SSD.

Regardless, it could still be very interesting to compare GDDR5/X and GDDR6/X with each other and with anything older.


This is a little better than a SATA SSD but NVMe is faster. And you usually already can't see a difference in game load times between a SATA SSD and NVMe anyway...

F@H
Desktop: i9-13900K, ASUS Z790-E, 64GB DDR5-6000 CL36, RTX3080, 2TB MP600 Pro XT, 2TB SX8200Pro, 2x16TB Ironwolf RAID0, Corsair HX1200, Antec Vortex 360 AIO, Thermaltake Versa H25 TG, Samsung 4K curved 49" TV, 23" secondary, Mountain Everest Max

Mobile SFF rig: i9-9900K, Noctua NH-L9i, Asrock Z390 Phantom ITX-AC, 32GB, GTX1070, 2x1TB SX8200Pro RAID0, 2x5TB 2.5" HDD RAID0, Athena 500W Flex (Noctua fan), Custom 4.7l 3D printed case

 

Asus Zenbook UM325UA, Ryzen 7 5700u, 16GB, 1TB, OLED

 

GPD Win 2


Although a comparison between GDDR5, GDDR6, etc. would be interesting.


1 minute ago, Azariel-chan said:

Although a comparison between GDDR5, GDDR6, etc. would be interesting.

This test would most likely be bottlenecked by PCIe, so you wouldn't actually be comparing the performance of VRAM. PCIe 4.0 x16 has a maximum theoretical throughput of 31.5 GB/s. The 3080's VRAM has a throughput of 760 GB/s.
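For reference, the 31.5 GB/s figure follows from the per-lane transfer rate and the 128b/130b line encoding. A minimal Python sketch of that arithmetic (the function and constant names are made up for illustration, and this is the one-direction theoretical maximum, not what any real transfer achieves):

```python
# Theoretical one-direction PCIe throughput from the per-lane transfer rate.
# PCIe 3.0 and 4.0 both use 128b/130b encoding (128 payload bits per 130 sent).
ENCODING_EFFICIENCY = 128 / 130

# Raw per-lane rate in gigatransfers per second (one transfer = one bit per lane).
GT_PER_S = {"3.0": 8.0, "4.0": 16.0}


def pcie_throughput_gb_s(generation: str, lanes: int) -> float:
    """Return the theoretical throughput in GB/s for one direction."""
    raw_gbit_s = GT_PER_S[generation] * lanes
    return raw_gbit_s * ENCODING_EFFICIENCY / 8  # bits -> bytes


if __name__ == "__main__":
    for gen in ("3.0", "4.0"):
        print(f"PCIe {gen} x16: {pcie_throughput_gb_s(gen, 16):.2f} GB/s")
```

That works out to roughly 15.75 GB/s for PCIe 3.0 x16 and roughly 31.5 GB/s for PCIe 4.0 x16, both far below the VRAM figure.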


22 minutes ago, Eigenvektor said:

This test would most likely be bottlenecked by PCIe, so you wouldn't actually be comparing the performance of VRAM. PCIe 4.0 x16 has a maximum theoretical throughput of 31.5 GB/s. The 3080's VRAM has a throughput of 760 GB/s.

Well yes, but it would still be interesting to see it compared. The numbers are there, but real-world results usually aren't the same as the numbers on paper.


1 hour ago, Azariel-chan said:

Well yes, but it would still be interesting to see it compared. The numbers are there, but real-world results usually aren't the same as the numbers on paper.

What I'm trying to say is that you most likely wouldn't see any difference between GDDR5 and GDDR6 because the PCIe bus is the limiting factor.

 

It doesn't really matter whether one card has 760 GB/s and the other card only has 480 GB/s when you're trying to compare their speed by transferring data over a PCIe bus that is limited to 31.5 GB/s. That's a bit like trying to compare the speed of a race car with a family car by having them "race" through a pedestrian zone (while obeying traffic laws). If anything, you might see a difference between PCIe 3.0 and 4.0.
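Put as a quick sketch, assuming the simplest possible model where a transfer chain runs at the rate of its slowest stage (the helper name is hypothetical, and real transfers add protocol and software overhead on top):

```python
def effective_throughput_gb_s(*stage_rates_gb_s: float) -> float:
    """A chain of stages can move data no faster than its slowest stage."""
    return min(stage_rates_gb_s)


PCIE4_X16_GB_S = 31.5  # theoretical link limit quoted above

# VRAM bandwidth figures from the post above.
cards = {"760 GB/s card": 760.0, "480 GB/s card": 480.0}

if __name__ == "__main__":
    for name, vram_gb_s in cards.items():
        rate = effective_throughput_gb_s(vram_gb_s, PCIE4_X16_GB_S)
        print(f"{name}: at most {rate} GB/s through the PCIe link")
```

Both cards come out at the same 31.5 GB/s ceiling, which is why the memory type would not show up in such a test.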


2 minutes ago, Eigenvektor said:

What I'm trying to say is that you most likely wouldn't see any difference between GDDR5 and GDDR6 because the PCIe bus is the limiting factor.

 

It doesn't really matter whether one card has 760 GB/s and the other card only has 480 GB/s when you're trying to compare their speed by transferring data over a PCIe bus that is limited to 31.5 GB/s. That's a bit like trying to compare the speed of a race car with a family car by having them "race" through a pedestrian zone (while obeying traffic laws). If anything, you might see a difference between PCIe 3.0 and 4.0.

Then let's do that, shall we? Sadly, I don't have a PCIe 4.0 motherboard here.


Neither the PCIe link nor the memory bandwidth is the limit, by far; it's the code that has to run to make this work.

The 2080S is getting around 3 GB/s here, but PCIe 3.0 x16 is capable of around 16 GB/s and the memory itself of way more than that.

 

Not to mention that it uses 30% GPU. 

 

image.png.9cf483498c576762026a46d76b7470fd.png


Just now, Kilrah said:

Neither the PCIe link nor the memory bandwidth is the limit, by far; it's the code that has to run to make this work.

The 2080S is getting around 3 GB/s here, but PCIe 3.0 x16 is capable of around 16 GB/s and the memory itself of way more than that.

 

Not to mention that it uses 30% GPU. 

 

image.png.9cf483498c576762026a46d76b7470fd.png

Can you try a benchmark or a game?


6 minutes ago, Azariel-chan said:

Can you try a benchmark or a game?

Try what? There's no metric for storage performance in games, plus it's pointless since no benchmark or game is storage bound. And no I can't since a benchmark or game would need the VRAM for itself...

 

As reference here's my SSD, even though it's 90+% full...

 

image.png.5a3a75af3a5326a438a63087e35a5e57.png

 

The whole point of improvements like DirectStorage is to shorten the processing path to speed things up, whereas this lengthens it.


4 minutes ago, Kilrah said:

Try what? There's no metric for storage performance in games, plus it's pointless since no benchmark or game is storage bound. And no I can't since a benchmark or game would need the VRAM for itself...

Agreed. Here's the result for my RX 480 (8 GB) on PCIe 3.0 x16. Pretty similar to the 2080S. I've only used 4 GB for the virtual drive, theoretically leaving 4 GB of VRAM available for games.

 

image.png.e5bc65f52d42bd2db3991bd70a8c186d.png

 

Like @Kilrah I don't quite see the point. It reduces the amount of VRAM available to the game itself and would most likely increase load times compared to my SSD. If anything performance would be worse and the game would also be gone after a reboot. So any game I test would have to be at most 4 GB in size (installed) and would also be limited to 4 GB VRAM.
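For anyone who wants a rough number of their own without CrystalDiskMark, a crude sequential-write timing can be sketched in Python. This is not how CrystalDiskMark measures, the function name and defaults are made up for illustration, and the directory is whatever the virtual disk is mounted as:

```python
import os
import tempfile
import time


def measure_write_throughput(directory: str,
                             total_bytes: int = 64 * 1024 * 1024,
                             block_size: int = 1024 * 1024) -> float:
    """Write total_bytes to a temp file in directory and return MB/s."""
    block = os.urandom(block_size)  # random data, so it is not trivially compressible
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            written = 0
            while written < total_bytes:
                chunk = block[: min(block_size, total_bytes - written)]
                f.write(chunk)
                written += len(chunk)
            f.flush()
            os.fsync(f.fileno())  # push the data to the device before stopping the clock
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # clean up the test file
    return written / elapsed / 1e6


if __name__ == "__main__":
    # Replace the temp dir with the mount point of the drive under test.
    rate = measure_write_throughput(tempfile.gettempdir())
    print(f"sequential write: {rate:.0f} MB/s")
```

Pointing `directory` at the VRAM-backed drive and then at an SSD gives a like-for-like comparison. Reads are deliberately left out because the OS file cache would skew them.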


Not to mention that IF a RAMdisk was actually useful you'd make it in system RAM, which is MUCH faster to access by the CPU, and... well, you can get 16GB of it for <$70, while getting a GPU in which you can put a 16GB RAMdisk costs you an extra $300-$800 or so :)

 

ram.png.bf895874cac4f4ce771edc914085f8d3.png


Idk how to quote a post from a different locked topic, but @Eigenvektor sent me here, I was wanting to reply to 

...

 

Anyway though... I thought the GPU VRAM was supposed to be capable of several hundred GBytes/sec?  (Wonder if the CDM files themselves could be installed on the VRAM, and the program run on the GPU so you wouldn't even have to go through the PCIe slot with its bandwidth bottleneck...)

 

Also there's another metric of storage performance I've been wanting to discuss for a while, that I feel has been neglected - that being time to write an entire disk.  That's better posted in its own topic, but as a little spoiler, I've heard of some 40MB IDE HDDs (according to a 14 or so year old Tom's Hardware article which was, among others, referencing a drive that was then 15 years old) writing to the entire disk in under a minute, and I imagine the earliest MFM drives may have filled up even faster possibly.


4 hours ago, PianoPlayer88Key said:

Anyway though... I thought the GPU VRAM was supposed to be capable of several hundred GBytes/sec?  (Wonder if the CDM files themselves could be installed on the VRAM, and the program run on the GPU so you wouldn't even have to go through the PCIe slot with its bandwidth bottleneck...)

The OS doesn't know data is already coming from VRAM, so it can't do a simple VRAM-to-VRAM copy. It will load the data from VRAM over the PCIe bus through the CPU into RAM, before moving it back over the PCIe bus into VRAM.

 

Meaning your first bottleneck is the PCIe bus. E.g. PCIe 4.0 x16 limits the theoretical speed down to 31.5 GB/s. The second, far larger, bottleneck is the software emulating the disk drive. It decreases the speed to what you see in CrystalDiskMark above and generates additional load on the CPU and GPU to do so.

 

You can't really "run" a program on VRAM. The program is merely installed on VRAM by treating it as a disk drive. It's still the CPU that has to do (pre)processing like decompressing the data, before the GPU can use it. So overall you lose performance, since you put additional load on the PCIe bus, CPU and GPU.

 

In the future you'll get more of a speedup from technologies like DirectStorage (e.g. RTX IO) where the GPU can load data directly from a disk drive, without having to go through the CPU and RAM first.

 

5 hours ago, PianoPlayer88Key said:

Also there's another metric of storage performance I've been wanting to discuss for a while, that I feel has been neglected - that being time to write an entire disk.

Filling a 100 MB HDD at 10 MB/s will take 10 seconds.

Filling a 1 TB SSD at 500 MB/s will take ~33 minutes.

 

It doesn't take longer because drives got slower, but simply because they grew so much larger.
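The arithmetic behind those two figures, as a small Python sketch (decimal units assumed, i.e. 1 TB = 1,000,000 MB; the function name is just for illustration):

```python
def fill_time_seconds(capacity_mb: float, rate_mb_s: float) -> float:
    """Time to write the whole drive once at a constant sequential rate."""
    return capacity_mb / rate_mb_s


if __name__ == "__main__":
    hdd = fill_time_seconds(100, 10)          # 100 MB HDD at 10 MB/s
    ssd = fill_time_seconds(1_000_000, 500)   # 1 TB SSD at 500 MB/s
    print(f"100 MB HDD: {hdd:.0f} s")
    print(f"1 TB SSD: {ssd:.0f} s (~{ssd / 60:.0f} min)")
```

Capacity grew by a factor of 10,000 while the rate grew by a factor of 50, so the fill time grows by a factor of 200.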


3 hours ago, Eigenvektor said:

Filling a 100 MB HDD at 10 MB/s will take 10 seconds.

Filling a 1 TB SSD at 500 MB/s will take ~33 minutes.

 

It doesn't take longer because drives got slower, but simply because they grew so much larger.

Yeah true ...

Spoiler

 

that's basically a paraphrase of part of a Tom's Hardware article I was referring to - capacities outran performance, among other things they said.  I'll link the page for time to write a full platter...
https://www.tomshardware.com/reviews/15-years-of-hard-drive-history,1368-8.html

I'd like to see a new metric for SSDs, the PCIe bus, etc ... SPDW (Seconds per Drive Write), less is better, instead of MB/s.  Like, if future versions of PCIe / NVMe / whatever would ALWAYS write a full disk, no matter the capacity (whether it be 8TB, 16TB, 32TB, 64TB, etc) in the same amount of time it would theoretically take to write a 5MB MFM hard drive (ST-506 I think was capable of 5 Mbit/s or 625 KB/s, so about 8 seconds to fill a 5MB drive theoretically, assuming it saturated it), I would be a lot happier than with the day or two it takes to do some stuff on my 10TB and 14TB 7200rpm non-SMR hard drives.  (I was hearing that some people with SMR drives were reporting some things were gonna take like a couple WEEKS!!)
I suppose you could use the DWPD (Drive Writes Per Day) metric that some SSDs use to quote endurance, but use it to quote drive performance.  For example once they get up to speed, something like 10800 DWPD might be considered okay.  (that's not the word I want to use though, I'm trying to think of one that starts with "a" and has 3 or 4 syllables, but I don't think "adequate" or "appropriate" are the words I'm looking for.)
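To make the proposed numbers concrete, here is a rough Python sketch of SPDW and its relation to DWPD. The function names are invented for illustration, decimal units are assumed, and the ST-506 rate is the 5 Mbit/s figure quoted above, not a measured value:

```python
SECONDS_PER_DAY = 86_400


def seconds_per_drive_write(capacity_bytes: float, rate_bytes_per_s: float) -> float:
    """SPDW: time to write the full capacity once at a constant rate."""
    return capacity_bytes / rate_bytes_per_s


def dwpd_to_spdw(dwpd: float) -> float:
    """Convert drive writes per day into seconds per drive write."""
    return SECONDS_PER_DAY / dwpd


def required_rate_bytes_per_s(capacity_bytes: float, spdw: float) -> float:
    """Sustained write rate needed to hit a target SPDW."""
    return capacity_bytes / spdw


if __name__ == "__main__":
    st506_rate = 5_000_000 / 8  # 5 Mbit/s -> 625,000 bytes/s
    print(seconds_per_drive_write(5_000_000, st506_rate))  # 5 MB drive
    print(dwpd_to_spdw(10_800))
    for tb in (8, 16, 32, 64):
        rate = required_rate_bytes_per_s(tb * 1e12, 8.0)
        print(f"{tb} TB in 8 s needs {rate / 1e12:.0f} TB/s")
```

Under these assumptions 10800 DWPD and the 8-second ST-506 fill are the same target, and a 16 TB drive would need about 2 TB/s of sustained writes to meet it.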


I've actually been wanting to get my paws on like a 40MB or 80MB IDE drive and benchmark it (I have an IDE controller, but haven't found a cheap / <$10 MFM controller that works in a PCIe slot)... but all I'm seeing on ebay are either dead ones around $20-30, or working ones from known brands (like Seagate, Western Digital, Maxtor) upwards of $100+, or brands I'm not as familiar with (but still recognize) for around $40-50, and I'm not about to pay that steep of a premium for one of those drives, in price per capacity, compared to a modern 12+TB hard drive. (My budget for one would be such that, assuming shipping is reasonable like $5 or $10, I'd be paying like 2x more for shipping than for the drive.  I did see a Conner CP-344 which I think was one of the earliest 40MB IDE HDDs for like $40 on ebay, but that's pretty steep for me for that.)

 

 


23 minutes ago, PianoPlayer88Key said:

Yeah true ...

You should really make a new topic for this, instead of hijacking this one.


So I found this on Twitter, and the OP said @Linus should have 2 or 3 RTX 3080s or one or two 3090s by now.

So why not NVLink them and use the unified memory to install something huge, perhaps something like GTA 5, a heavily modded Skyrim SE or Witcher 3?

Capture.PNG.7fa5defb3af8bbc0b817faee3d237ae5.PNG

EjcbGz5XkAEtmID.png

EjcetWRX0AARfKd.jpg

EjcgVGaXgAMHk9Z.jpg

EjcgWQJXcAED8jy.png

specs in spoiler.

Spoiler

 

Motherboard: Asrock A320M

CPU: Ryzen 5 2600X

GPU/s: 2x MSI RX580 Gaming X 8GB

RAM: channel 1 Corsair Vengeance LPX DDR4 2133MHz 2x4GB; channel 2 Kingston HyperX Predator DDR4 3000MHz 2x4GB

Cooler: AMD Wraith Max RGB Cooler

Case: CiT Seven

Storage (internal only): SSD Crucial BX500 120GB; HDD Seagate ST2000L 2TB

Monitor/s: LG UltraWide 25UM58; Acer XZ350CU 35" 21:9 Ultrawide; LG UltraWide 25UM58

Mouse: Logitech 203 Prodigy Wired

Keyboard: Logitech 213 Prodigy Wired

 

 

 

 


You don't run NVLink in games; it's SLI, which cannot combine VRAM. Even the connector is the wrong way around.

CPU: i7-2600K 4751MHz 1.44V (software) --> 1.47V at the back of the socket Motherboard: Asrock Z77 Extreme4 (BCLK: 103.3MHz) CPU Cooler: Noctua NH-D15 RAM: Adata XPG 2x8GB DDR3 (XMP: 2133MHz 10-11-11-30 CR2, custom: 2203MHz 10-11-10-26 CR1 tRFC:230 tREFI:14000) GPU: Asus GTX 1070 Dual (Super Jetstream vbios, +70(2025-2088MHz)/+400(8.8Gbps)) SSD: Samsung 840 Pro 256GB (main boot drive), Transcend SSD370 128GB PSU: Seasonic X-660 80+ Gold Case: Antec P110 Silent, 5 intakes 1 exhaust Monitor: AOC G2460PF 1080p 144Hz (150Hz max w/ DP, 121Hz max w/ HDMI) TN panel Keyboard: Logitech G610 Orion (Cherry MX Blue) with SteelSeries Apex M260 keycaps Mouse: BenQ Zowie FK1

 

Model: HP Omen 17 17-an110ca CPU: i7-8750H (0.125V core & cache, 50mV SA undervolt) GPU: GTX 1060 6GB Mobile (+80/+450, 1650MHz~1750MHz 0.78V~0.85V) RAM: 8+8GB DDR4-2400 18-17-17-39 2T Storage: HP EX920 1TB PCIe x4 M.2 SSD + Crucial MX500 1TB 2.5" SATA SSD, 128GB Toshiba PCIe x2 M.2 SSD (KBG30ZMV128G) gone cooking externally, 1TB Seagate 7200RPM 2.5" HDD (ST1000LM049-2GH172) left outside Monitor: 1080p 126Hz IPS G-sync

 

Desktop benching:

Cinebench R15 Single thread:168 Multi-thread: 833 

SuperPi (v1.5 from Techpowerup, PI value output) 16K: 0.100s 1M: 8.255s 32M: 7m 45.93s


RAM drives have been around for a very, very long time. I don't know why this story got so much attention. It is completely pointless. My NVMe RAID is 3 times faster than that.


20 minutes ago, Ryzenuser19 said:

@Linus

That's not Linus. His @ is linustech

Either @piratemonkey or quote me when responding to me. I won't see otherwise

Put a reaction on my post if I helped

My privacy guide | Why my name is piratemonkey PSU Tier List Motherboard VRM Tier List

What I say is from experience and the internet, and may not be 100% correct


I tested this myself (not with Crysis 3 though). It's funny that this thing is even possible.

if it was useful give it a like :) btw if your into linux pay a visit here

 


I don't see the point.

 

Any file would have to be transferred from VRAM to system RAM through the PCIe bus (which should be fast due to DMA transfers), then the game executable does its work on what it read (decompresses textures, compiles shaders, buffers audio tracks), and then the prepared data is uploaded back into VRAM.

So it's just pointless locking of VRAM; it's not like everything is done within the card.

DDR4 is cheap, 64 GB is $200, and it is capable of 20+ GB/s reads. You get the same result using regular RAM.

See how some cheap DDR4-2133 can still do up to 5x better than an NVMe SSD:

 

image.png.c49e39fddd12c0b153b442ad4bd09a7a.png

 

 

 


6 minutes ago, dilpickle said:

RAM drives have been around for a very, very long time. I don't know why this story got so much attention

Just because it wasn't done on a GPU before... 
