
Today, PC Gaming catches up to Consoles

DirectStorage, which lets your GPU access your SSD directly, has promised to make games better for years now. While it’s been on Xbox Series X since launch (and PS5 has its own version), now we’re finally getting it for PC. Why was it worth the wait?

 

 

Buy an AMD Ryzen 9 5950X: https://geni.us/nMA6el

Buy an ASUS Crosshair VIII Hero Wi-Fi: https://geni.us/B13k

Buy a GeForce RTX 3060: https://geni.us/WQMAcA

Emily @ LINUS MEDIA GROUP                                  

congratulations on breaking absolutely zero stereotypes - @cs_deathmatch


How good will the SSDs need to be to take advantage of this? Wondering how my lowly WD Blue would fare.


there is hope.

Useful threads: PSU Tier List | Motherboard Tier List | Graphics Card Cooling Tier List ❤️

Baby: MPG X570 GAMING PLUS | AMD Ryzen 9 5900x /w PBO | Corsair H150i Pro RGB | ASRock RX 7900 XTX Phantom Gaming OC (3020Mhz & 2650Memory) | Corsair Vengeance RGB PRO 32GB DDR4 (4x8GB) 3600 MHz | Corsair RM1000x |  WD_BLACK SN850 | WD_BLACK SN750 | Samsung EVO 850 | Kingston A400 |  PNY CS900 | Lian Li O11 Dynamic White | Display(s): Samsung Oddesy G7, ASUS TUF GAMING VG27AQZ 27" & MSI G274F

 

I also drive a volvo as one does being norwegian haha, a volvo v70 d3 from 2016.

Reliability was a key thing and its my second car, working pretty well for its 6 years age xD


Still an issue with this concept anyhow.

MSI x399 sli plus  | AMD theardripper 2990wx all core 3ghz lock |Thermaltake flo ring 360 | EVGA 2080, Zotac 2080 |Gskill Ripjaws 128GB 3000 MHz | Corsair RM1200i |150tb | Asus tuff gaming mid tower| 10gb NIC


2 hours ago, GabenJr said:

DirectStorage, which lets your GPU access your SSD directly, has promised to make games better for years now. While it’s been on Xbox Series X since launch (and PS5 has its own version), now we’re finally getting it for PC. Why was it worth the wait?

 

 


Is this the end of gaming with a hard drive?


I was watching a video about the Radeon Pro SSG and I was wondering how it would work with the DirectStorage that Microsoft is pushing. I know it’s completely ridiculous, but I’m just curious. I’m pretty sure they wouldn’t entertain it and make a newer version of it, since it was ahead of its time. That raises another question about third parties being able to upgrade or trick the BIOS. Lol, I’m pretty sure I’m reading too much into it!!


Several times through the video it is stated that DirectStorage bypasses RAM to go straight to VRAM. This is not true: all assets streamed by DirectStorage follow the same path as traditionally streamed GPU assets. They need to be put in a D3D12 upload heap allocated in system RAM before being sent to VRAM, as this diagram shows:

https://github.com/microsoft/DirectStorage/blob/main/Docs/DeveloperGuidance.md#staging-buffers-and-copying


I finally made an LTT forum account because I have a problem with this video. Microsoft has never stated that using DirectStorage results in assets streaming DIRECTLY from storage to the GPU as far as I can tell. This was presented as the definition of DirectStorage in the video, which is misleading.


Here is Microsoft’s original presentation on the Xbox’s fancy storage situation. DirectStorage is introduced at 5:38:

 Notice how DirectStorage is presented as a more efficient replacement for Win32 file APIs, to avoid bottlenecking the high speeds afforded by hardware accelerated decompression (which is a separate concept from DirectStorage). It never says that DirectStorage enables assets to flow directly from storage to GPU memory, because on consoles there is no difference between system memory and GPU memory in the first place. As stated in that video, DirectStorage was originally envisioned as a way to reduce (not eliminate) CPU overhead associated with graphics IO by replacing the general-purpose Win32 APIs with a more specialized and streamlined one, NOT as a way to avoid the CPU having to copy the raw data back and forth. In fact, the CPU doesn’t have to do that even without DirectStorage, as DMA already exists for that purpose. That segment from the LTT video showing all the game data “flowing” through the CPU is just wrong (unless I am mistaken, please correct me if so, I’m not an expert but that is my understanding).

 

Microsoft has never presented DirectStorage for Windows as being very different from the Xbox version. Surely if the PC version allowed skipping the step of loading data into system RAM before copying it to VRAM, they would have said so in their announcement blog, or in one of the technical presentations they gave on how the Windows version works? There are 2 on YouTube that I watched on the Microsoft Game Dev channel and neither says this; in fact, they contain diagrams like this one: [attached image: diagram from the presentation]

 

…clearly showing data entering system RAM (“staging memory”) from the SSD before ultimately being copied to the GPU.
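To make that path concrete, here is a toy model of it in Python. This is only a sketch of the idea, not DirectStorage or D3D12 code: the function name, the buffer sizes and the use of plain byte arrays to stand in for the SSD, the staging memory and VRAM are all assumptions made for illustration.

```python
def stream_via_staging(source: bytes, staging_size: int) -> tuple[bytes, int]:
    """Copy `source` to a pretend VRAM through a fixed-size staging buffer.

    Returns the bytes that arrived in "VRAM" and how many staging passes
    were needed. Everything here is a stand-in, not a real graphics API.
    """
    if staging_size <= 0:
        raise ValueError("staging_size must be positive")

    staging = bytearray(staging_size)  # stands in for staging memory in system RAM
    vram = bytearray()                 # stands in for GPU-local memory
    passes = 0

    for offset in range(0, len(source), staging_size):
        chunk = source[offset:offset + staging_size]
        staging[:len(chunk)] = chunk       # step 1: storage -> staging (system RAM)
        vram += staging[:len(chunk)]       # step 2: staging -> GPU memory
        passes += 1

    return bytes(vram), passes


asset = bytes(range(256)) * 4  # 1024 bytes of made-up asset data
copied, passes = stream_via_staging(asset, staging_size=300)
print(copied == asset, passes)  # True 4
```

The point of the sketch is just that every byte touches the staging buffer on its way to the GPU; nothing skips system RAM.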

 

Nvidia’s RTX IO slide showing a data path from storage directly to the GPU seems to conflict with all that, but remember that RTX IO is an additional API on top of DirectStorage, not DirectStorage itself. Nvidia’s wording throughout also seems to suggest a “big reduction” in CPU usage, not its complete elimination. I’m not sure if RTX IO would actually enable data to flow from storage to the graphics card without passing through RAM first (Nvidia’s confusingly named “GPUDirect Storage” can do this, but it’s only supported on Linux and not with GeForce cards). Regardless, RTX IO isn’t released and won’t be for some time, and it is also not DirectStorage.

 

Ultimately, I think the conclusion of the video is correct but essentially all of the explanation of what DirectStorage actually does in its current form is not.


To be fair, having the GPU directly access storage is an interesting situation.

 

Personally I would have to question the security of the implementation. Letting a graphics engine prod around straight into system storage is ripe for abuse, and I wouldn't be surprised if ransomware starts employing this route to reach user files.

 

Secondly, there is the question of whether there are other paths towards better asset streaming.

Not going through the CPU seems a bit unnecessary, especially as CPUs and game engines move towards more threads. Latency isn't majorly important, and games should still load things preemptively instead of reactively, since the access latency of modern flash isn't all that favorable and we still have bandwidth to contend with. It is more a question of overall bandwidth than of latency, and a lot of current games more or less need to accept that some people still run their games from an HDD where 120 MB/s sequential reads is "fast", and this fact limits level design quite a lot.

 

To be fair, if game studios instead just said that an "SSD with 500+ MB/s read speed" is the minimum requirement, then that would likely solve more issues than direct asset streaming from storage. Though, stating that one needs hardware able to support NVMe asset streaming is more or less saying that one needs a high-performance SSD.
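As a rough back-of-the-envelope sketch of why that bandwidth floor matters: the 120 MB/s and 500 MB/s figures are the ones mentioned in this post, while the 6000 MB level size and the function name are made up purely for illustration, and real loads are never perfectly sequential.

```python
def load_seconds(asset_mb: float, read_mb_per_s: float) -> float:
    """Idealised sequential load time: size divided by sustained read speed."""
    if read_mb_per_s <= 0:
        raise ValueError("read speed must be positive")
    return asset_mb / read_mb_per_s


LEVEL_MB = 6000  # hypothetical level size, chosen only for illustration

for name, speed in [("HDD", 120), ("SATA SSD", 500)]:
    print(f"{name}: {load_seconds(LEVEL_MB, speed):.0f} s")
# HDD: 50 s
# SATA SSD: 12 s
```

Even with these idealised numbers, the same level takes roughly four times longer from the HDD, which is the gap level designers currently have to plan around.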

 

Lastly, if consumer CPUs supported scratchpad memory (i.e. using cache as memory, where said memory in cache doesn't correspond to a location out in main memory), then some of the downsides of passing the data through the CPU would disappear, since one doesn't have to push data out to main memory and could instead trans-compress it locally before sending it to the GPU. This would allow us to keep "kernel" oversight over read/write permissions of storage, while not wasting memory bandwidth on the task. (However, here it would be nice if CPU vendors added perhaps a few hundred MB (or a couple of GB) of HBM to their CPUs for the scratchpad, since that would be more cost effective than building it directly on die. Scratchpad memory is also useful in a lot of other applications, but this feature is typically considered "high end server" stuff, and not for general consumers.)


if game studios instead just said that an "SSD with 500+ MB/s read speed" is the minimum requirement, then that would likely solve more issues than direct asset streaming from storage. Though, stating that one needs hardware able to support NVMe asset streaming is more or less saying that one needs a high-performance SSD.

Yes and no. Some motherboards share PCIe lane bandwidth between devices, and many manufacturers only rate their SSDs/M.2 drives at "up to" speeds. Also, Linus did a video on how manufacturers change parts within the same SKU, which can affect its speed too.



4 hours ago, dogwitch said:

Yes and no. Some motherboards share PCIe lane bandwidth between devices, and many manufacturers only rate their SSDs/M.2 drives at "up to" speeds. Also, Linus did a video on how manufacturers change parts within the same SKU, which can affect its speed too.

Very few consumer systems would have both the GPU and storage connected to the chipset these days (especially since practically every PC building guide recommends placing the GPU in the PCIe slot connected to the CPU).

 

And yes, some SSD vendors do make alterations to their products over time, for better or worse depending on the application. But we have not yet seen a major change in spec. Read speed can be a bit lower for some transfer types, but the difference typically isn't all that major. And this would affect things regardless of whether we have DirectStorage or not.

 

One thing I do however see as impactful is IPC, and how it affects overall system responsiveness in regards to thread switching. (IPC as in Inter Process Communication, not instructions per cycle.)

The thread making the call out to storage will have to wait anywhere from a bit below a µs up to hundreds of µs before the storage responds. That is long enough for the kernel to hibernate our thread well before storage responds, and it can then take a fair bit more time before the kernel switches it back in. During this time the fresh data from storage that currently lingers in cache can end up flushed out to main memory, so when our storage-reading thread gets back it can stall every now and then as it takes a good few cache misses.

 

So far I haven't seen any architecture that supports more intelligent hibernation of threads. (Though, we could flag the thread as "important" and just run it regardless of whether it stalls or not. This can greatly improve system performance in these kinds of situations, with the drawback that the thread takes up core space and doesn't do anything most of the time, i.e. one effectively trades a hardware thread for responsiveness. In some server/HPC applications this is actually done.)

More intelligent thread hibernation would allow our kernel to put the thread to the side but still keep it in the CPU, where it has a trigger set to respond on some condition, in this case that the storage responds. When this happens our CPU will "instantly" switch without the kernel's intervention, putting the thread back into the core, and the thread that we ran in the meantime gets placed to the side for our kernel to handle. This ensures that we don't add unnecessary delay to our operation since we don't have to wait for the kernel, and we also have the benefit of still having our data in cache.

(However, "just implementing" this feature in a CPU wouldn't just work, since the kernel needs to support it. It is a bit similar to an interrupt, something the kernel and a few other tasks already use. But interrupts have their own can of worms. x86 is fairly limited in its interrupt capabilities, and most OSes exclusively use interrupts for the kernel (and some specific peripherals like the keyboard and mouse), since the interrupt table is in kernel-space memory and on x86 it can more or less only have 256 entries. So we can't really hand these out to every process, though the OS's storage service should honestly have an interrupt to better handle modern low-latency storage.)
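A very loose sketch of the "hibernate and wait to be woken" versus "keep the thread running" trade, using Python threads. This is only an analogy: Python's scheduler and GIL behave nothing like a kernel switching hardware threads, the storage device is faked with a sleep, and every name and the 1 ms delay are assumptions for illustration.

```python
import threading
import time


def fake_storage_read(result: dict, done: threading.Event, delay_s: float) -> None:
    """Pretend to be the SSD: answer after delay_s seconds."""
    time.sleep(delay_s)
    result["data"] = b"asset"
    done.set()


def read_blocking(delay_s: float = 0.001) -> bytes:
    """Sleep until storage answers; the scheduler has to wake us up again."""
    result, done = {}, threading.Event()
    worker = threading.Thread(target=fake_storage_read, args=(result, done, delay_s))
    worker.start()
    done.wait()  # the waiting thread is parked here
    worker.join()
    return result["data"]


def read_spinning(delay_s: float = 0.001) -> tuple[bytes, int]:
    """Stay running and poll; no wake-up needed, but the core does no useful work."""
    result, done = {}, threading.Event()
    worker = threading.Thread(target=fake_storage_read, args=(result, done, delay_s))
    worker.start()
    spins = 0
    while not done.is_set():  # busy-wait: this is the "trade a hardware thread" part
        spins += 1
    worker.join()
    return result["data"], spins


print(read_blocking())
data, spins = read_spinning()
print(data, "after", spins, "wasted polls")
```

Both variants end up with the same data; the spinning one simply shows how much polling gets burned to avoid waiting for a wake-up.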


At around 2:47 he says Direct Storage produces nearly a 3× speedup, but it looks like it's actually over 4×? 0.33/0.08=4.125. Am I misinterpreting?
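For what it's worth, the ratio is easy to sanity-check. The 0.33 and 0.08 figures are the ones read off the video above; the function name is just made up for this sketch, and it assumes both numbers are times in the same unit.

```python
def speedup(baseline: float, improved: float) -> float:
    """Return how many times faster the improved run is than the baseline."""
    if improved <= 0:
        raise ValueError("improved time must be positive")
    return baseline / improved


ratio = speedup(0.33, 0.08)
print(f"{ratio:.3f}x")  # 4.125x
```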


On 3/26/2022 at 7:01 PM, Error 52 said:

How good will the SSDs need to be to take advantage of this? Wondering how my lowly WD Blue would fare.

Well, in theory it should work on any SSD, but maybe game developers would optimize their games for at least, let's say, 5000+ MB/s read speeds, and then PCIe 4.0 NVMe drives would be necessary. Only time will tell. It will probably also vary from game to game.

