Tim Sweeney explains his comments with respect to IO perf on PS5

hishnash
15 minutes ago, Sauron said:

That's kind of my point, a game running smoothly isn't necessarily an indication of the hardware performing beyond expectations.

No, your point was this BS you said: 

"The upcoming console is going to be somehow faster than computers with better specs because magic optimization" has been done before and it was nonsense. I won't believe it until I see a benchmark.

 

The same card with the same teraflops behaves EXTREMELY differently on console and PC, and games are EXTREMELY more optimized on consoles, which is why you "need" more flops on a computer to get the same graphic fidelity to the eye. Optimization makes hardware behave better than expected; that's the whole point of it.


10 hours ago, hishnash said:
 

 

They (at least the PlayStation) have radically different operating systems. If you're thinking about optimisation, that is in the OS for these things.

It's not even that simple: on a PC you have all sorts of other shit running (maybe Windows is downloading an update in the background). The software optimisations in the operating system are just as important (if not more) for having predictable throughput.

 

It is that simple. Nothing runs so close to the absolute edge of performance that a background task would totally screw it up. Again, sure, it's more predictable on a console that has no background tasks, but PCs aren't an issue either. Task manager is good enough at keeping priorities in check these days.


6 minutes ago, 3rrant said:

No, your point was this BS you said: 

"The upcoming console is going to be somehow faster than computers with better specs because magic optimization" has been done before and it was nonsense. I won't believe it until I see a benchmark.

 

The same card with the same teraflops behaves EXTREMELY differently on console and PC, and games are EXTREMELY more optimized on consoles, which is why you "need" more flops on a computer to get the same graphic fidelity to the eye. Optimization makes hardware behave better than expected; that's the whole point of it.

Seems like you don't understand the difference between hardware and software performance. The claim here is that the SSD in the PS5 will have, in practice, better IO speeds than drives that are faster on paper. I take that with a grain of salt because, again, this is something that has been lied about before. This says nothing about whether a properly optimized game will still run well on it.

Don't ask to ask, just ask... please 🤨

sudo chmod -R 000 /*


3 minutes ago, Sauron said:

Seems like you don't understand the difference between hardware and software performance. The claim here is that the SSD in the PS5 will have, in practice, better IO speeds than drives that are faster on paper. I take that with a grain of salt because, again, this is something that has been lied about before. This says nothing about whether a properly optimized game will still run well on it.

On PC you just use a different memory management scheme for the game. Nothing stops you from pulling assets and components into system memory before they're needed, when you know they will be. This requires more memory than the console version, but you can preemptively pull data from storage to negate the real-time latency impact of a less direct DMA path.
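
To make the prefetch idea above concrete, here is a minimal sketch in Python. The `AssetPrefetcher` class, the `load_from_disk` callback and the asset name are all hypothetical and not taken from any engine; the only point is that a read issued early in the background no longer stalls the later request.

```python
from concurrent.futures import ThreadPoolExecutor


class AssetPrefetcher:
    """Hypothetical helper: loads assets into RAM before they are requested."""

    def __init__(self, load_from_disk, workers=2):
        self._load = load_from_disk      # slow storage read (assumed callback)
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._pending = {}               # asset name -> Future

    def prefetch(self, name):
        # Start the read in the background unless it is already in flight.
        if name not in self._pending:
            self._pending[name] = self._pool.submit(self._load, name)

    def get(self, name):
        # Returns at once if the prefetch has finished; otherwise waits for it.
        self.prefetch(name)
        return self._pending[name].result()

    def close(self):
        self._pool.shutdown(wait=True)


def fake_disk_read(name):
    # Stand-in for a real storage read.
    return f"data:{name}".encode()


prefetcher = AssetPrefetcher(fake_disk_read)
prefetcher.prefetch("level2/rock.mesh")    # issued before the asset is needed
rock = prefetcher.get("level2/rock.mesh")  # already in memory, or waits briefly
prefetcher.close()
```

The cost is exactly the one mentioned above: everything prefetched sits in system memory until it is used or dropped.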


4 minutes ago, Sauron said:

Seems like you don't understand the difference between hardware and software performance. The claim here is that the SSD in the PS5 will have, in practice, better IO speeds than drives that are faster on paper. I take that with a grain of salt because, again, this is something that has been lied about before. This says nothing about whether a properly optimized game will still run well on it.

Well, we know Optane drives, due to the nature of their storage chips, do have better I/O performance because of lower latencies and higher IOPS; durability is also higher.


6 minutes ago, Sauron said:

Seems like you don't understand the difference between hardware and software performance. The claim here is that the SSD in the PS5 will have, in practice, better IO speeds than drives that are faster on paper. I take that with a grain of salt because, again, this is something that has been lied about before. This says nothing about whether a properly optimized game will still run well on it.

Hardware performance does exist, but there is no hardware performance in games: it's all software and engine related, not silicon level. Also, the only thing that can come close to how the PS5 SSD complex works is Intel's non-volatile memory, in theory.


10 minutes ago, 3rrant said:

Hardware performance does exist, but there is no hardware performance in games: it's all software and engine related, not silicon level. Also, the only thing that can come close to how the PS5 SSD complex works is Intel's non-volatile memory, in theory.

Well no, not really. Software can be made to run well but if the hardware isn't fast enough then you need to find other ways of getting that performance. You can cache things in memory more aggressively if the SSD isn't that fast for example but then you're limited by memory capacity and the claim that the ssd is faster is still not true. Again, maybe it's true that we have an optane situation where in practice the drive offers better performance than you'd expect - I just wouldn't take Sweeney's word for it.
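
A rough sketch of what "cache more aggressively, but limited by memory capacity" looks like in practice, in Python. `AssetCache` is a made-up name and the byte budget is arbitrary; this is just a least-recently-used cache that starts evicting once the budget is exceeded.

```python
from collections import OrderedDict


class AssetCache:
    """Hypothetical byte-budgeted LRU cache for loaded assets."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self._items = OrderedDict()   # name -> bytes, least recently used first

    def get(self, name):
        data = self._items.get(name)
        if data is not None:
            self._items.move_to_end(name)   # mark as most recently used
        return data

    def put(self, name, data):
        if name in self._items:
            self.used -= len(self._items.pop(name))
        self._items[name] = data
        self.used += len(data)
        # Evict least recently used entries until we fit the budget again.
        while self.used > self.capacity and self._items:
            _, evicted = self._items.popitem(last=False)
            self.used -= len(evicted)


cache = AssetCache(capacity_bytes=10)
cache.put("a", b"12345")
cache.put("b", b"12345")
cache.get("a")            # touch "a" so "b" becomes the eviction candidate
cache.put("c", b"123")    # over budget: "b" is evicted
```

Once the working set is bigger than the budget, every miss goes back to the drive, which is the point being made: caching hides a slow SSD only as long as RAM holds out.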


2 hours ago, RejZoR said:

It is that simple. Nothing runs so close to the absolute edge of performance that a background task would totally screw it up. Again, sure, it's more predictable on a console that has no background tasks, but PCs aren't an issue either. Task manager is good enough at keeping priorities in check these days.

No, for tasks that are not latency sensitive, yes. But if you need ultra-low latency, then the normal scheduler of the Windows kernel is very bad. There are kernels (notably low-latency, real-time Linux kernels that are used in a lot of digital audio equipment, and macOS to a lesser degree) that have chosen a different approach to enable much lower latency. MS could do this but they don't, most likely due to the backward-compatibility gun that is always at their head.

 

 

1 hour ago, Sauron said:

Well no, not really. Software can be made to run well but if the hardware isn't fast enough then you need to find other ways of getting that performance. You can cache things in memory more aggressively if the SSD isn't that fast for example but then you're limited by memory capacity and the claim that the ssd is faster is still not true. Again, maybe it's true that we have an optane situation where in practice the drive offers better performance than you'd expect - I just wouldn't take Sweeney's word for it.

Again, it's less about the raw GB/s and more about the latency. Yes, you could make even 2-year-old hardware run that fast on a PC, but it would not be running an OS that you can game on.

To get a Windows system anywhere close in latency you would require a massive increase in CPU speed compared to the best LN2 overclockers today, on the order of 10x instructions per second (and yes, we are talking single-core speed; extra cores do not help in this task). What they are doing on the PlayStation is all about avoiding the need to use/waste CPU time for this task.


1 hour ago, hishnash said:

No, for tasks that are not latency sensitive, yes. But if you need ultra-low latency, then the normal scheduler of the Windows kernel is very bad. There are kernels (notably low-latency, real-time Linux kernels that are used in a lot of digital audio equipment, and macOS to a lesser degree) that have chosen a different approach to enable much lower latency. MS could do this but they don't, most likely due to the backward-compatibility gun that is always at their head.

 

 

Again, it's less about the raw GB/s and more about the latency. Yes, you could make even 2-year-old hardware run that fast on a PC, but it would not be running an OS that you can game on.

To get a Windows system anywhere close in latency you would require a massive increase in CPU speed compared to the best LN2 overclockers today, on the order of 10x instructions per second (and yes, we are talking single-core speed; extra cores do not help in this task). What they are doing on the PlayStation is all about avoiding the need to use/waste CPU time for this task.

You're talking audio latency. Things like this are irrelevant for gaming afaict. It can't hurt but isn't really required to be that timely. It just has to have huge bandwidth and low latency so it can load assets in real-time as player is moving around the level.


3 minutes ago, RejZoR said:

You're talking audio latency. Things like this are irrelevant for gaming afaict. It can't hurt but isn't really required to be that timely. It just has to have huge bandwidth and low latency so it can load assets in real-time as player is moving around the level.

What Tim is talking about is disk access latency, and that is very important if your game assets are way larger than your VRAM. You need to be able to load data on demand, otherwise you get massive frame stuttering! It is critical for the Nanite tech to work, where a single model on disk is larger than your VRAM, so as the camera moves you get new data.


6 minutes ago, RejZoR said:

You're talking audio latency. Things like this are irrelevant for gaming afaict. It can't hurt but isn't really required to be that timely. It just has to have huge bandwidth and low latency so it can load assets in real-time as player is moving around the level.

Very high-quality realtime audio is a total game changer for the feel of games, and not only that. In VR, it's an absolute must to push the industry toward really immersive experiences. The idea of having a completely separate audio engine complex on PS5 (Tempest) is great, and from what they've said it's really powerful hardware that they've integrated into the system, far more powerful than any add-in card (this doesn't mean it's equally or more responsive at all frequencies, though).

However, the latency talk is not about audio but assets. There's overhead in each step of the process (which OP thankfully summarized in his edit). On PC, that latency stacks up in very small bits at a time, but over many, many operations before the needed pixels are rendered. By cutting out the processing units in the way and using direct access interfaces, latency is (theoretically) greatly reduced.
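
A back-of-the-envelope sketch of how small per-step overheads add up, in Python. Every number and step name below is a made-up placeholder chosen only to show the arithmetic, not a measurement of any PC or console.

```python
def total_overhead_ms(per_step_overhead_us, requests):
    """Sum the per-request overhead of every step, over a number of requests."""
    per_request_us = sum(per_step_overhead_us.values())
    return per_request_us * requests / 1000.0


# Illustrative values only (assumptions, in microseconds per request).
layered_path = {"syscall": 5, "filesystem": 10, "driver": 5, "copy_to_gpu": 20}
direct_path = {"dma_request": 5}

requests = 1000
layered_ms = total_overhead_ms(layered_path, requests)
direct_ms = total_overhead_ms(direct_path, requests)
print(f"layered: {layered_ms} ms, direct: {direct_ms} ms for {requests} requests")
```

Each step looks negligible on its own; it's the multiplication by the number of small requests that makes the difference visible.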


I feel like something important is getting missed in this latency debate: even on the relatively high-latency PC front, we're still talking about an operation that takes a tiny fraction of a second. That means the difference only matters when the timeframe in which the GPU needs the asset is less than that tiny fraction of a second. That implies that the PS5 must be preloading absolutely nothing that is not currently on screen into its RAM, which is a great way to keep the necessary RAM size down. But it also means that this entire fancy system merely lets the SSD act as what amounts to regular (in PC terms) RAM. When the same data the SSD is streaming is instead sat neatly in your system RAM waiting for the GPU to ask for it, the PS5 storage solution's purported advantages take a sharp nosedive.

I'm sure someone's going to go, "but carl, bandwidth..." Tell me how many current games are even remotely hampered by even regular HDD speeds? And you can't just go scaling up the asset size, because at the end of the day the GPU still has to be able to process it, and even with heavy optimizations there's a really hard limit on how far you can make the assets more detailed before the GPU can't render them anymore.

Where this is really going to help the PS5 is in a couple of years or so. By that point all but the potato of potato systems are going to be running something around the Pascal era at the worst. That means 6GB, more often 8GB, of VRAM is going to be the low-end norm, and likely 16GB or more of system RAM. The high end (best guess, we won't know till it gets here) will be rocking 24-32GB of VRAM and 64GB+ of system RAM (probably with 32-core or larger CPUs, if I had to guess). The PS5, with its paltry 16GB of RAM shared between regular and VRAM usage, would be totally screwed (with a rusty spacecraft up its backside... ...sideways) without this. You can optimize all you want, but if your assets and CPU-side stuff won't physically fit in RAM, all the optimization in the world won't help you. But this cuts down the amount of stuff that they need in there a lot, which lets them push further. In effect it will probably give the PS5 more breathing room before it gets completely hosed.


Tighter storage integration and removing bottlenecks elsewhere will definitely bring quite the improvement compared to just raw SSD speeds. It takes time and processing power to load the needed data and send it where it's needed. I'm excited to see more of this, especially in the PC space eventually. Having an SSD will feel even better.

| Ryzen 7 7800X3D | AM5 B650 Aorus Elite AX | G.Skill Trident Z5 Neo RGB DDR5 32GB 6000MHz C30 | Sapphire PULSE Radeon RX 7900 XTX | Samsung 990 PRO 1TB with heatsink | Arctic Liquid Freezer II 360 | Seasonic Focus GX-850 | Lian Li Lanccool III | Mousepad: Skypad 3.0 XL / Zowie GTF-X | Mouse: Zowie S1-C | Keyboard: Ducky One 3 TKL (Cherry MX-Speed-Silver)Beyerdynamic MMX 300 (2nd Gen) | Acer XV272U | OS: Windows 11 |


7 hours ago, 3rrant said:

However, the latency talk is not about audio but assets. There's overhead in each step of the process (which OP thankfully summarized in his edit). On PC, that latency stacks up in very small bits at a time, but over many, many operations before the needed pixels are rendered. By cutting out the processing units in the way and using direct access interfaces, latency is (theoretically) greatly reduced.

The other thing is just the raw amount of CPU time you waste, and that means you have less time to do the other stuff you need to do with the CPU.


16 hours ago, hishnash said:

 If it is true that the PS5 has a dedicated hardware fixed function decompression solution...

What fixed function? Sure, the CPU is a custom Zen 2 chip, but I'm genuinely curious as to what compression algorithm they picked to bake onto the CPU die. Zen 2 only supports AVX2 (256-bit), which helps any software-based algo, but Intel so far is the only one that supports AVX-512, unless that too is now part of the custom PS5 die. Hmmmm 🤔


13 minutes ago, StDragon said:

What fixed function? Sure, the CPU is a custom Zen 2 chip, but I'm genuinely curious as to what compression algorithm they picked to bake onto the CPU die. Zen 2 only supports AVX2 (256-bit), which helps any software-based algo, but Intel so far is the only one that supports AVX-512, unless that too is now part of the custom PS5 die. Hmmmm 🤔

It might not be part of the CPU; it could be part of the GPU (yes, they are on the same die). What is clear, at least from Tim's comments, is that this is not doing decompression in user-space CPU time, so it is not doing all of the extra hops through the kernel. Fixed-function compression units are not uncommon and I would not be surprised if all Zen CPUs have these for some things.

The real trick is this fixed-function unit being able to read data directly from the SSD in a secure way. To do this I expect they are also using a partition disk format that is readable by things other than the OS. Or at least they have reduced this to a simple problem where it is easy for the kernel to produce a large table of data ranges on the SSD that relate to game data.

On a PC, even if you had direct PCIe access to the SSD from the GPU, you would not be able to read any useful data, since only the Windows kernel fully understands the NTFS filesystem format; furthermore, the Windows kernel assumes it is the gatekeeper to this.

Remember that a PC is by nature an untrusted system, where the OS kernel needs to assume everything that is going on is hostile until proven otherwise.

So for security reasons there would need to be some way to filter what part of the SSD the GPU can read/write to. For this you would need to create some mapping (I don't think this is possible at the PCIe bridge level; it would need to be somehow within the SSD controller: a lookup table saying PCIe device X can read sectors A, C and D only).

Then add to that that the kernel thinks it is the only gatekeeper, so when it comes to file system changes it does not expect someone else to be reading/writing to the disk. Fun concurrency pain here.

So assume you have an SSD controller that can be `programmed` by the CPU (kernel) to grant another PCIe device access to a given set of files. You then have the next issue: on a PC the GPU can be running multiple things at once, so now you need some way for the SSD controller to know which subtask on the GPU is requesting access (and some way to ensure that one GPU task can't spoof another one)....
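
As a purely speculative sketch of the "PCIe device X can read sectors A, C and D only" lookup table, here it is modelled in Python. Nothing here reflects how any real SSD controller works; `SsdAccessTable`, the device ids and the LBA ranges are invented for illustration.

```python
class SsdAccessTable:
    """Hypothetical controller-side table: which device may read which LBA ranges."""

    def __init__(self):
        self._grants = {}   # device id -> list of (first_lba, last_lba), inclusive

    def grant(self, device_id, first_lba, last_lba):
        # In the scenario above only the kernel would be allowed to call this.
        self._grants.setdefault(device_id, []).append((first_lba, last_lba))

    def revoke_all(self, device_id):
        self._grants.pop(device_id, None)

    def may_read(self, device_id, lba, count=1):
        # The whole request must fall inside a single granted range.
        last = lba + count - 1
        return any(lo <= lba and last <= hi
                   for lo, hi in self._grants.get(device_id, []))


table = SsdAccessTable()
table.grant("gpu0", 1000, 1999)               # blocks holding game data
inside = table.may_read("gpu0", 1500, 16)     # fully inside the grant
overrun = table.may_read("gpu0", 1990, 16)    # runs past the end of the grant
stranger = table.may_read("nic0", 1500)       # device with no grants at all
```

The per-subtask problem would mean keying the table on something finer than the device id, which is exactly where the spoofing worry comes in.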

 

 

 

 

 


11 minutes ago, hishnash said:

they have reduced this to a simple problem where it is easy for the kernel to produce a large table of data ranges on the SSD that relate to game data.

I'm not aware of any hardware that can read direct from a file system without first translating through the file system via kernel.

 

I suppose it could be raw via LBA. In fact, that's the entire point of the kernel when it comes to storage; to provide meaningful data structure and indexing from mapped LBA blocks. Meaning, there's going to be a table lookup at some point in software. No way around that.

 

The only other way possible (that I can think of anyways) is to treat a portion of the NAND flash like another form of addressable RAM.
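
For what it's worth, the table lookup mentioned above is roughly this, sketched in Python. `ExtentMap`, the 4096-byte block size and the extent values are assumptions for the example and do not describe any particular filesystem's on-disk layout.

```python
from bisect import bisect_right

BLOCK_SIZE = 4096  # bytes per logical block (assumed for the example)


class ExtentMap:
    """Hypothetical per-file table mapping file offsets to LBAs on the drive."""

    def __init__(self, extents):
        # extents: (first_file_block, start_lba, length_in_blocks) tuples
        self._extents = sorted(extents)
        self._starts = [e[0] for e in self._extents]

    def lba_for_offset(self, byte_offset):
        file_block = byte_offset // BLOCK_SIZE
        # Find the last extent that starts at or before this file block.
        i = bisect_right(self._starts, file_block) - 1
        if i < 0:
            raise ValueError("offset before first extent")
        first_block, start_lba, length = self._extents[i]
        if file_block >= first_block + length:
            raise ValueError("offset not mapped")
        return start_lba + (file_block - first_block)


# A file stored as two fragments on disk.
emap = ExtentMap([(0, 5000, 4), (4, 9000, 8)])
lba_a = emap.lba_for_offset(0)
lba_b = emap.lba_for_offset(5 * BLOCK_SIZE)
```

Whoever does the read needs either this table or someone to consult it on their behalf, which is the "no way around that" part.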


23 hours ago, RejZoR said:

It's not even about raw speed. It's about out of the box expectations. Developers know for a fact what storage speed they'll have on every single PS5 and they can code games to seamlessly stream assets during gameplay to a point they can load them directly through streaming system. Streaming basically means loading only content in visual field of the player and a bit more outside of field of view in a certain area and nothing else outside of that. Meaning they can create huge worlds without a single load screen and no limitations for the size of world itself and never experience a problem with it on any PS5.

 

On PC however, they can't know what kind of storage will be in use and what will be its speed, meaning they can't code the game that way and need to take slowest method (HDD) as the design point. Unless they can add a mechanism which would detect game residing on a stupid fast SSD and turn the streaming logic same way as it is on PS5. Question at that point is, is it even worth their time designing it this way or just doing it old way of loading all of it to memory with partial streaming?

You never know, it could become a toggle setting that you can turn on and off at your own risk.
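
A tiny sketch of what such a detect-or-toggle setting could look like, in Python. The function name, the mode names and the 1000 MB/s threshold are all placeholders I made up; how the read speed gets measured is left out entirely.

```python
def choose_loading_mode(read_mb_per_s, min_streaming_mb_per_s=1000.0, user_override=None):
    """Pick an asset loading strategy from a measured storage read speed.

    The threshold is an arbitrary placeholder, not a figure from any engine.
    """
    if user_override in ("stream", "preload"):
        return user_override    # the "toggle at your own risk" setting
    if read_mb_per_s >= min_streaming_mb_per_s:
        return "stream"         # fast drive: load assets on demand
    return "preload"            # slow drive: load the level up front


hdd_mode = choose_loading_mode(120.0)
nvme_mode = choose_loading_mode(3500.0)
forced_mode = choose_loading_mode(120.0, user_override="stream")
```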


3 minutes ago, StDragon said:

I'm not aware of any hardware that can read direct from a file system without first translating through the file system via kernel.

 

I suppose it could be raw via LBA. In fact, that's the entire point of the kernel when it comes to storage; to provide meaningful data structure and indexing from mapped LBA blocks. Meaning, there's going to be a table lookup at some point in software. No way around that.

 

The only other way possible (that I can think of anyways) is to treat a portion of the NAND flash like another form of addressable RAM.

I would expect (for security, Sony do not like people breaking into their hardware) that the kernel at game start would create a set of lookup tables mapping to blocks that are permitted to be read by the game.

Another trick here would also be to ensure that you don't write to those blocks after having given a third party access. 

 


There are protocols for direct access to RAM and SSDs by devices on the PCIe bus. I saw an NVIDIA paper about reading blocks on other PCIe devices without going through the CPU. It could be something that only works in whatever OS HPC stuff is running on, and not in Windows or common Linux distros. Most of the paper was over my head.

Games that use the SSD to its full extent on next gen won't come out right at launch; PC SSDs with similar controllers may be sold as AICs by that time. AMD has also put M.2 slots on a GPU in the past, on an obscure professional card.

The streaming being mentioned wasn't online game streaming like Stadia; it referred to streaming small bits of map/texture to the GPU/CPU as needed instead of loading the entire level into RAM. Most open-world games stream their levels now, as the map is too big to put it all in RAM. In the example of the demo, loading all the details would be too much for RAM; the newer engine can stream only the part needed for the current FOV and remove things behind the player's back to free up RAM. UE4 did this to some extent; it is a matter of degree, and this shows a large increase in the ability to swap unneeded assets out of RAM to make room. With UE3 on PS3, with its limited RAM, you could sometimes see assets swap in and out with placeholders.
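
A simplified sketch of that kind of level streaming, in Python. It uses a square region around the player rather than a real field-of-view test, and the function names, chunk size and radius are invented for the example; it is not how Unreal does it.

```python
def chunks_in_range(player_xy, radius, chunk_size):
    """Grid coordinates of every chunk overlapping a square around the player."""
    px, py = player_xy
    lo_x, hi_x = int((px - radius) // chunk_size), int((px + radius) // chunk_size)
    lo_y, hi_y = int((py - radius) // chunk_size), int((py + radius) // chunk_size)
    return {(x, y) for x in range(lo_x, hi_x + 1) for y in range(lo_y, hi_y + 1)}


def update_resident(resident, player_xy, radius, chunk_size):
    """Work out which chunks to load and which to drop as the player moves."""
    wanted = chunks_in_range(player_xy, radius, chunk_size)
    to_load = wanted - resident     # newly in range: stream in from storage
    to_evict = resident - wanted    # left behind: free their RAM
    return wanted, to_load, to_evict


resident = set()
resident, loaded, evicted = update_resident(resident, (50, 50), radius=100, chunk_size=100)
resident, loaded2, evicted2 = update_resident(resident, (250, 50), radius=100, chunk_size=100)
```

The faster the storage, the smaller the radius can be, because a chunk can be requested later and still arrive before it is visible.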


1 minute ago, Sophia_Borjia said:

There are protocols for direct access to RAM and SSDs by devices on the PCIe bus.

Yep, the main issues for normal operating systems are:
1) being able to know where on that device the data you need is (your GPU can't understand NTFS)
2) blocking PCIe devices from being able to read/write to any part of the disk (this is important for security; a game should not be able to change the Windows system files on disk).

Add to that that game data is currently massively compressed, and in most cases GPUs are not good at decompression (and it uses up your GPU time), so apparently they have a fixed-function unit to do this.
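
To put a rough shape on the decompression cost, here is a small Python sketch that times software decompression on the CPU with `zlib`. The input is a repetitive stand-in, not real asset data, and zlib is only used because it ships with Python; it says nothing about which codec the console actually uses.

```python
import time
import zlib

# Repetitive stand-in for asset data; real game assets compress differently.
raw = b"vertex,normal,uv;" * 50_000
packed = zlib.compress(raw, level=6)

start = time.perf_counter()
unpacked = zlib.decompress(packed)
elapsed_ms = (time.perf_counter() - start) * 1000

ratio = len(raw) / len(packed)
print(f"{len(raw)} -> {len(packed)} bytes (ratio {ratio:.1f}x), "
      f"decompressed in {elapsed_ms:.2f} ms of CPU time")
```

Whatever time this prints is time the CPU is not spending on the game, which is the argument for moving the work into a dedicated unit.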


13 minutes ago, Sophia_Borjia said:

 Most open-world games stream their levels now, as the map is too big to put it all in RAM. In the example of the demo, loading all the details would be too much for RAM...

Right. So the fundamental question posited is this - How does Sony plan on streaming the asset data from SSD to RAM via hardware based decompression all while avoiding traversal through the kernel (DMA)?

 

I agree with hishnash, Sony will be encrypting this data too. Is that too handled in hardware??


18 minutes ago, StDragon said:

 Sony will be encrypting this data too. Is that too handled in hardware??

Yep, I would expect so. On-the-fly decryption is easy; most modern phones do this with very little overhead.

Maybe the decryption unit will be the only thing that can reference the SSD directly, and you need to set it up from the kernel with the set of legal regions on the SSD. That dedicated pathway could also handle decryption securely.

This is a little like how Apple's T2 chip (which is the SSD controller) handles encryption/decryption for the SSD and also handles which partitions on the SSD can be read by the OS (in this case it fully blocks regions of the SSD from being readable even by your kernel; Apple does not trust x86 CPUs these days).


I still don’t see how this is going to manifest into any kind of performance gain vs a pc or Xbox. 

From my understanding of what he’s saying, the PS5 has dedicated decompression hardware (even though Zen is really good at decompression?) and less overhead when the GPU is trying to call assets from memory. This all sounds good, but it doesn’t take into account the clear disadvantage such a system has vs a PC: its system memory is GDDR. My understanding is that the whole reason a PC has DDR(whatever) for system memory and the GPU has its own GDDR(whatever) is that DDR is optimised for lower latency (which the CPU likes) and GDDR is optimised for bandwidth (which the GPU likes). We also know that Zen loves fast, snappy memory, and the PS5 has an 8-core Zen 2 CPU. Wouldn’t this work against the PS5 vs PC argument he’s making? I’m not saying he’s wrong; he obviously has a lot more knowledge about this kind of thing than me. It’s just confusing how he cites this optimisation and how it’s gonna make the PS5 better than a PC, yet the PS5 lacks the memory optimisation a traditional PC has. Like, sure, this one thing is better than a PC, but this other thing is clearly worse. Don’t you have to take into account the entire system to determine whether or not actual performance will be better?

There’s also the fact that devs would have to use this optimisation. Even if it theoretically improves performance, devs have to code their game for at least 3 different platforms: Xbox, PS5 and PC. And if 2/3 lack this optimisation, then how do you convince devs to utilise it? Unless of course it can somehow do it without the devs needing to code specifically for it, which would be really cool. It just seems to me like a marketing ploy more than anything else. I don’t see why Microsoft and Sony bother with this sort of talk. When consoles get released they already tend to be much better than the average PC due to up-to-date hardware, optimisation and economies of scale. Why bother focusing on this feature when like 99% of people don’t understand what he’s talking about? It’s a game console, not an enterprise system; most of the people buying these things don’t understand any of the hardware anyway. Just show us some actual use cases instead of all this theoretical talk.


56 minutes ago, C2HWarrior said:

I still don’t see how this is going to manifest into any kind of performance gain vs a pc or Xbox. 

From my understanding of what he’s saying, the PS5 has dedicated decompression hardware (even though Zen is really good at decompression?) and less overhead when the GPU is trying to call assets from memory. This all sounds good, but it doesn’t take into account the clear disadvantage such a system has vs a PC: its system memory is GDDR. My understanding is that the whole reason a PC has DDR(whatever) for system memory and the GPU has its own GDDR(whatever) is that DDR is optimised for lower latency (which the CPU likes) and GDDR is optimised for bandwidth (which the GPU likes). We also know that Zen loves fast, snappy memory, and the PS5 has an 8-core Zen 2 CPU. Wouldn’t this work against the PS5 vs PC argument he’s making? I’m not saying he’s wrong; he obviously has a lot more knowledge about this kind of thing than me. It’s just confusing how he cites this optimisation and how it’s gonna make the PS5 better than a PC, yet the PS5 lacks the memory optimisation a traditional PC has. Like, sure, this one thing is better than a PC, but this other thing is clearly worse. Don’t you have to take into account the entire system to determine whether or not actual performance will be better?

There’s also the fact that devs would have to use this optimisation. Even if it theoretically improves performance, devs have to code their game for at least 3 different platforms: Xbox, PS5 and PC. And if 2/3 lack this optimisation, then how do you convince devs to utilise it? Unless of course it can somehow do it without the devs needing to code specifically for it, which would be really cool. It just seems to me like a marketing ploy more than anything else. I don’t see why Microsoft and Sony bother with this sort of talk. When consoles get released they already tend to be much better than the average PC due to up-to-date hardware, optimisation and economies of scale. Why bother focusing on this feature when like 99% of people don’t understand what he’s talking about? It’s a game console, not an enterprise system; most of the people buying these things don’t understand any of the hardware anyway. Just show us some actual use cases instead of all this theoretical talk.

Tim is probably a bit of a hardware geek. I don’t believe there's anything wrong with giving his opinion or showing excitement, even if most end users won’t understand any of it. 

My eyes see the past…

My camera lens sees the present…

