Forspoken PC Requirements Announced | RTX 4080 Recommended for 4K

25 minutes ago, Kisai said:

That global illumination flag is something fixed by the 2021 patch. I'm not saying it was "perfectly optimized" before, but there was clearly no issue running it if you had something better than the recommended hardware. But people on the discussion forums were somehow convinced this game should be running at 4K60 on a GTX 1060, when it didn't even do that on a GTX 1080. Or somehow get 1080p60 on a GTX 1050 Ti. It was clearly designed as a 1080p game from the get-go, and the cutscene was the major clue that perhaps it didn't actually run at 1080p on the PS4.

 

Modders will have you believe that they are magically making high-requirement games run better, when really they're just turning off or dialing down features, which breaks the game somewhere else. Take the global illumination: turning that off makes various areas super-dark.

 

I played and finished NieR: Automata without any mods, at 4K on the GTX 1080, and when it dipped below 60fps I didn't care, but it was not a 1080p30 experience for most of the game. I would certainly not stream the game on that GPU. I have not replayed it with the RTX 3070 or 3090, but just to explain again: "it's not optimized" is generally a complaint made by people trying to run a game at or below the "minimum", because, again, the "minimum" is always a lie.

 

The minimum is what you would need for a 720p30 experience at settings below where the PS5 would be, and if you are willing to play the game like that, I question your sanity for buying a $100 game to play on a 7-year-old GPU.

 

You weren't playing it at 4K if you weren't using the FAR mod (Fix Automata Resolution), unless you played the patched version years later, as the game did not render at your native resolution without the FAR mod. When I tried to run it in 2017 at 1080p without the FAR mod it looked like 720p or worse, super blurry. I also had the game on PS4, and the 900p it runs at on the base PS4 looked way better than unmodded PC 1080p. The minimum for that game was a GTX 770; I was running a GTX 970, which was about 65% faster than the 770 in games by then, and still had to use a fan-made mod to make the game playable.

 

Never had any problems with it being too dark anywhere running global illumination at the medium setting through the FAR mod.


9 minutes ago, Jon-Slow said:

After so many years of playing games, I have developed a sense for how much a game is going to suck from the marketing material; I've also played the demo, though. This is going to be one of the worst-reviewed games of the year.

User reviews maybe, but absolutely no chance with mainstream critics. Looks fun enough for a $30-40 price, though.


2 minutes ago, Eaglerino said:

User reviews maybe, but absolutely no chance with mainstream critics. Looks fun enough for a $30-40 price, though.

I agree that a lot of the time reviews don't go below a certain score, either to avoid outrage from fanbois or just to play it safe, like IGN giving any turd a 7 or an 8. However, once in a while, if the game is bad enough, like Babylon's Fall, everyone will rip into it. That's what I predict for this.


40 minutes ago, SteveGrabowski0 said:

You weren't playing it at 4K if you weren't using the FAR mod (Fix Automata Resolution), unless you played the patched version years later, as the game did not render at your native resolution without the FAR mod. When I tried to run it in 2017 at 1080p without the FAR mod it looked like 720p or worse, super blurry. I also had the game on PS4, and the 900p it runs at on the base PS4 looked way better than unmodded PC 1080p. The minimum for that game was a GTX 770; I was running a GTX 970, which was about 65% faster than the 770 in games by then, and still had to use a fan-made mod to make the game playable.

 

Never had any problems with it being too dark anywhere running global illumination at the medium setting through the FAR mod.

Nah, the only time there was distinctly lower resolution in the game was during the pre-rendered cutscenes.


7 hours ago, porina said:

Because I can't really see unified RAM happening in a way where a dGPU holds all the system RAM. The GPU and CPU have to be closer together on the mobo.

I actually meant the other way around: GPUs would simply not come with "VRAM"; the RAM is on the motherboard and can be expanded by the user (as it is now).

 

And I don't think it would be easy technically, but it's doable. I get the latency issue, but there should be solutions for that? Maybe put the GPU right onto the RAM or something... I don't have a technical solution for that; I'm just saying that, theoretically, unified RAM would improve performance and be good for customers, because they wouldn't be bound to whatever low RAM amounts Nvidia decides to put on their GPUs...

 

3 hours ago, vetali said:

I felt like ranting about the backlash because I grew up when games that absolutely pushed top-tier hardware were praised, but then I looked at some footage of the game... oof. It has no right demanding that hardware for how poor it looks. Got that unmodded Skyrim water detail going on.

It's an SE game, of course it's gonna look ugly...

 

Yeah, there are exceptions, but very few of them... even FF online, while looking pretty good for what it is, looks *old*, like straight out of the PS360 gen (because it probably is).

 

NieR: Automata is another one, though not made by SE internally. The first two "new" Tomb Raider games are the same deal, and they could look better; compare them to the fan remake, which completely blows them both out of the water... gotta say the first Tomb Raider was exceptionally well optimized though, especially for looking that good for a PS360-gen game.


4 hours ago, Kisai said:

Nah, the only time there was distinctly lower resolution in the game was during the pre-rendered cutscenes.

The resolution was bugged and Square Enix didn't do a single update for years after launch. Maybe you didn't notice it playing at 4K, but at 1080p it was crystal clear it was rendering at sub-native resolution, and there was no way to change that in the options.


2 hours ago, Mark Kaine said:

I actually meant the other way around: GPUs would simply not come with "VRAM"; the RAM is on the motherboard and can be expanded by the user (as it is now).

 

And I don't think it would be easy technically, but it's doable. I get the latency issue, but there should be solutions for that? Maybe put the GPU right onto the RAM or something... I don't have a technical solution for that; I'm just saying that, theoretically, unified RAM would improve performance and be good for customers, because they wouldn't be bound to whatever low RAM amounts Nvidia decides to put on their GPUs...

Unified RAM works in APUs and laptops mostly because the pieces are close together, and in the APU case the GPU can benefit from the cache.

 

But to combat the need for high VRAM, let me introduce a "highly developed" and "modern" solution:

 


 

RAM-slot on GPU!

However, VRAM quite quickly got to where it is today: much faster than RAM, as in GDDR is faster than DDR. Which is basically why a GPU with DDR is an extremely low-end thing that shouldn't even exist [looks at Nvidia in disappointment].

 

However, we also have better knowledge and much more computing power now. Instead of doing what was done in the 90s and replacing the whole VRAM with a changeable one, we could create tiered VRAM, where the GDDR side handles the fast loads while an added DDR stick works as slow storage. Games could get by with a much smaller working set, storing the bulk of their assets in the slower memory, from which they can be fetched (comparatively) immediately when the time comes. Sprinkled with Resizable BAR, DirectStorage and whatever else, we could quite easily transfer the next needed clusters of assets to the DDR memory in the background, even all of them if there is enough space, and then just decode them directly into the GDDR memory for use.
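To make the idea a bit more concrete, here is a toy sketch of that kind of two-tier residency logic. Everything in it (pool sizes, asset names, the decode step) is invented for illustration; it shows the shape of the scheme, not how any real driver or DirectStorage path actually works.

# Toy model of tiered VRAM: a slow, expandable DDR pool as staging, fast GDDR as the
# working set. All sizes and assets are made up.

DDR_CAPACITY_MB  = 32 * 1024   # hypothetical 32 GB DDR stick added to the card
GDDR_CAPACITY_MB = 8 * 1024    # 8 GB of fast soldered GDDR

ddr_pool  = {}   # compressed assets staged close to the GPU
gddr_pool = {}   # decoded assets the renderer is actually using

def prefetch_to_ddr(asset, compressed_mb):
    """Background job: stream the next cluster of assets from disk into the DDR tier."""
    if sum(ddr_pool.values()) + compressed_mb <= DDR_CAPACITY_MB:
        ddr_pool[asset] = compressed_mb

def promote_to_gddr(asset, decoded_mb):
    """On demand: decode an asset straight from the DDR tier into the GDDR tier."""
    if asset not in ddr_pool:
        raise RuntimeError("missed both tiers; fall back to system RAM / disk")
    while sum(gddr_pool.values()) + decoded_mb > GDDR_CAPACITY_MB:
        gddr_pool.pop(next(iter(gddr_pool)))   # evict (a real system would use LRU)
    gddr_pool[asset] = decoded_mb

# Stage the whole next area in the background, promote pieces as they are needed.
prefetch_to_ddr("next_area_textures", compressed_mb=1500)
promote_to_gddr("next_area_textures", decoded_mb=3000)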

 

We would get extremely-high-VRAM-capable GPUs with memory that doesn't cost 3 testicles and 8 kidneys, and it would be expandable for future uses, or just different uses where you might want less or more of the slow VRAM. Those are also the reasons why we will probably never get them: after all, currently it seems that the only reason to buy the newest GPU generation is the VRAM amount, at least with Nvidia.

 

Which, BTW, raises a more interesting question about the Forspoken specs: if the only reason for the poor GPU performance were the amount of VRAM, 8GB on the RTX 3070 versus 16GB on the RTX 4080, wouldn't the RTX 3060 12GB then actually have a chance of performing better than the RTX 3070, just because it has more VRAM?

Note that I am framing this question on the premise that the amount of VRAM is the only factor, as in the only reason the RTX 3070 cannot run Forspoken better than 1440p@30 is that its VRAM is too small. In that case we would get the interesting situation where the otherwise inferior RTX 3060 might push past its bigger brother because of its slower but more plentiful VRAM.


10 minutes ago, Thaldor said:

RAM-slot on GPU!

Right, that would be a solution (theoretically) for one of the problems... better than nothing for sure.

 

But as I explained above, older consoles didn't have an APU but still had unified RAM... I think that goes way back to the 8-bit/16-bit consoles... they also often had 2 CPUs, etc...

 

The PS3 benefitted hugely from unified RAM, and that had 2 very distinct chips on the board with some space between them.

 

I think I see the problem now... you would probably need the CPU and GPU slotted right onto the mobo, but again that seems more like a connection problem that could be overcome. Basically, yeah, treating a GPU as a "card" instead of just another piece of silicon that can be directly mounted onto the motherboard maybe doesn't make too much sense... and it wouldn't need to be a card anymore either, because there's no RAM on it, and power delivery could be right on the board too...

 

Again, not easy, but doable...

 

As said, Intel had a good chance to pull this off (they could make the appropriate motherboards for sure) but opted to play it safe instead. Disappointing!


4 hours ago, Mark Kaine said:

I actually meant the other way around: GPUs would simply not come with "VRAM"; the RAM is on the motherboard and can be expanded by the user (as it is now).

If I understand the scenario you are proposing, the main GPU is still an add-in card, with no significant local VRAM, maybe some small cache at most? The unified RAM is on the mobo.

 

The biggest problem remains bandwidth, here in two locations.

1. Getting enough RAM bandwidth connected. The mobo gets the RAM if the GPU doesn't? Let's use the 4070 Ti as an example here: it's just over 500 GB/s. That's 20 channels of DDR4-3200. Move to DDR5 and you maybe halve the number of modules, but that would still be an insane quantity outside of a server. The other option is to move to GDDR, but that probably won't work well in a socket, so you're back to soldered and no upgradability.

2. Getting the GPU connected to that bandwidth. PCIe 5.0 x16 is 64 GB/s, so you'd need EIGHT x16 slots' worth of bandwidth. Good luck with that. Or some new insane interconnect that'll burn power like it's going out of fashion. In case the size of the problem isn't clear: you need a new connection that is an order of magnitude faster than a PCIe 5.0 x16 slot! (Rough numbers in the sketch below.)
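To put rough numbers on both points, here's a back-of-the-envelope comparison. These are nominal peak rates with no overheads, and the DDR5-6400 grade is just an assumption for the comparison:

# Back-of-the-envelope peak-bandwidth comparison (nominal figures, no overheads).

gpu_bw = 504.0                              # GB/s, RTX 4070 Ti (192-bit GDDR6X @ 21 Gbps)

ddr4_3200_channel = 3200e6 * 8 / 1e9        # 64-bit channel -> 25.6 GB/s
ddr5_6400_channel = 6400e6 * 8 / 1e9        # 64-bit channel -> 51.2 GB/s
pcie5_x16 = 64.0                            # GB/s each direction

print(gpu_bw / ddr4_3200_channel)           # ~20 DDR4-3200 channels to match the card
print(gpu_bw / ddr5_6400_channel)           # ~10 DDR5-6400 channels
print(gpu_bw / pcie5_x16)                   # ~8 PCIe 5.0 x16 links' worth of bandwidth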

 

Above is why I assumed the bulk performance RAM would remain connected to the GPU, as it is currently. Then you don't have the problem of moving insane amounts of data around. The CPU would still need much less bandwidth than the GPU, so a smaller link could be retained to feed the CPU. Even then, PCIe 5.0 x16 is still less bandwidth than we already get from DDR5.

 

1 hour ago, Thaldor said:

However, we also have better knowledge and much more computing power now. Instead of doing what was done in the 90s and replacing the whole VRAM with a changeable one, we could create tiered VRAM, where the GDDR side handles the fast loads while an added DDR stick works as slow storage. Games could get by with a much smaller working set, storing the bulk of their assets in the slower memory, from which they can be fetched (comparatively) immediately when the time comes. Sprinkled with Resizable BAR, DirectStorage and whatever else, we could quite easily transfer the next needed clusters of assets to the DDR memory in the background, even all of them if there is enough space, and then just decode them directly into the GDDR memory for use.

This could be interesting. Say a hypothetical GPU had two DDR5 slots on it, which, placed side-on, could fit in existing envelopes at the cost of a little reduced cooling potential. A pair of 5400 modules would give ~84 GB/s of bandwidth. Why that number? It would roughly match one of the GDDR chips on the 4070 Ti, so it could be integrated as a "channel" like an existing chip, either instead of or in addition to one. So it could even boost performance a little when not using the extended-capacity portion.
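That ~84 GB/s equivalence works out roughly like this. Nominal rates only; pairing DDR5-5400 in dual channel against a single 32-bit GDDR6X device at 21 Gbps is an assumption for the sake of the comparison:

# Why two DDR5-5400 modules land in the same ballpark as one GDDR6X chip.

ddr5_5400_pair = 2 * 5400e6 * 8 / 1e9       # two 64-bit channels -> ~86.4 GB/s
gddr6x_chip    = 21e9 * 32 / 8 / 1e9        # one 32-bit device @ 21 Gbps -> 84.0 GB/s

print(ddr5_5400_pair, gddr6x_chip)          # ~86.4 GB/s vs 84.0 GB/s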

 

DDR5 on the GPU would already work better than DDR5 system RAM, since it wouldn't be limited by the PCIe bus. Even if we assume it supported PCIe 5.0 x16, since this would be a future GPU, that's still below what DDR5 provides.

 

1 hour ago, Thaldor said:

Which, BTW, raises a more interesting question about the Forspoken specs: if the only reason for the poor GPU performance were the amount of VRAM, 8GB on the RTX 3070 versus 16GB on the RTX 4080, wouldn't the RTX 3060 12GB then actually have a chance of performing better than the RTX 3070, just because it has more VRAM?

Note that I am framing this question on the premise that the amount of VRAM is the only factor, as in the only reason the RTX 3070 cannot run Forspoken better than 1440p@30 is that its VRAM is too small. In that case we would get the interesting situation where the otherwise inferior RTX 3060 might push past its bigger brother because of its slower but more plentiful VRAM.

Running out of VRAM is usually like falling off a performance cliff. Given that, I'd suspect the 30fps part is separate from the VRAM part. The 3060 may be able to use a better texture quality setting than the 3070, for example. The 3060 could have more performance than the 3070 if you target a setting that does need something in the 8-12GB range, but if you drop below 8GB the 3070 would run away. It'll be interesting to see what the quality presets actually do when it is released.


1 hour ago, Mark Kaine said:

But as I explained above, older consoles didn't have an APU but still had unified RAM... I think that goes way back to the 8-bit/16-bit consoles... they also often had 2 CPUs, etc...

 

The PS3 benefitted hugely from unified RAM, and that had 2 very distinct chips on the board with some space between them.

Consoles of that era also had CPUs running at single-digit MHz speeds, and especially with cartridge consoles we aren't really talking about RAM being used the way we use it today. The NES did have 2KB of VRAM, but that wasn't for assets, it was for output and state; what we would today call VRAM was, for the NES, more of a ROM chip on the cartridge. Instead of the GPU fetching assets from storage and storing them in VRAM for rendering and use, the GPU has a slot for a memory chip that is the storage and becomes the ready-for-use part of the VRAM, while the actual VRAM is just used for generating and storing the end result.

 

The PS3 is still an odd piece of HW, with the PowerPC architecture, the more ASIC-like cores and whatever.

 

But when we get to PCs, unified RAM has huge problems, the main ones being speed and the bus. If we take decade-old GDDR4 and compare it to DDR4 we get somewhat comparable numbers, transfer rates around 2-3 GT/s and data rates around 17-20 GB/s, roughly the same; but there really is no comparison between DDR4 and early GDDR6, because early GDDR6 had transfer rates of 14-16 GT/s and data rates of 112-128 GB/s.

As I said, iGPUs use the APU's cache to combat this problem. They have a dedicated partition in RAM for storage, and then they leverage the cache to, in a way, already do the tiered-VRAM thing, where immediately needed data is transferred from the DDR RAM into the APU cache.

You might think that using solely GDDR would be the answer, and the PS5 and Xbox Series S/X already use only GDDR6 memory, but it isn't that simple. Both consoles only have GDDR6 memory in them, but it isn't "just" GDDR6; it's 2 banks of GDDR6, one running at higher bandwidth and the other at lower bandwidth. For example, the XSX has 10GB of GDDR6@560GB/s and 6GB of GDDR6@336GB/s (Sony doesn't give the split, only that they have GDDR6 memory, but considering they have almost the same APU from AMD, they most likely have a similar setup inside). GDDR memory is faster and has a vastly wider bus than DDR, but in trade it has more latency, and one thing that really taxes a CPU is latency. This is very noticeable if we compare the earlier generation of consoles. For example, the PS4 was well known for running RDR2 "not that well" compared to the Xbox One and PC, and that wasn't because of the hardware on paper (the PS4 had the better GPU and about the same CPU); it was probably almost solely down to the PS4's memory, which was only GDDR5, compared to the Xbox One with DDR3 (plus eSRAM on the APU). The latency of the GDDR5 meant the PS4 had awful CPU performance, which held it back a lot.

 

Note: the PS4 Pro moved to the AMD Jaguar "Neo" with GDDR5 and DDR3, while the Xbox One X moved to the AMD Jaguar "Scorpio" with only GDDR5, as opposed to the initial releases, the PS4 with the Jaguar "Liverpool" and GDDR5 only, and the Xbox One with the Jaguar "Durango" with DDR3 and eSRAM. The situation was a bit different with AMD's new revisions of the Jaguar architecture, especially after moving to a Polaris-based iGPU, and I would guess the difference is in banking. The old AMD GCN2-based Jaguars didn't bank the GDDR5 memory, which caused the PS4 CPU to suffer from the latency issues; AMD may have changed this for the new revision, and the Xbox One X has banked GDDR5, with which the CPU doesn't suffer as much from the GDDR latency, so the Xbox One X didn't get the PS4's issues. At the same time, Sony addressed their issue by adding DDR3 memory for the CPU in the PS4 Pro.


3 minutes ago, porina said:

1. Getting enough RAM bandwidth connected. The mobo gets the RAM if the GPU doesn't? Let's use the 4070 Ti as an example here: it's just over 500 GB/s. That's 20 channels of DDR4-3200. Move to DDR5 and you maybe halve the number of modules, but that would still be an insane quantity outside of a server. The other option is to move to GDDR, but that probably won't work well in a socket, so you're back to soldered and no upgradability.

That sort of bandwidth is possible with LPDDR5 at low capacities. Apple is already offering 400GB/s and 800GB/s options in the M1 generation, the first one being only 4 packages on a 512-bit bus.
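Those Apple numbers fall straight out of bus width times data rate; the sketch below assumes LPDDR5-6400, which is what the high-end M1 parts are commonly reported to use:

# Unified-memory bandwidth from bus width x data rate (LPDDR5-6400 assumed).

def lpddr5_bandwidth_gbs(bus_bits, data_rate=6400e6):
    return bus_bits / 8 * data_rate / 1e9   # bytes per transfer * transfers per second

print(lpddr5_bandwidth_gbs(512))    # ~409.6 GB/s (the "400GB/s" option)
print(lpddr5_bandwidth_gbs(1024))   # ~819.2 GB/s (the "800GB/s" option)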

 

Yeah, that's far wider than anything consumer CPUs do currently, but it can be done. Your point about PCIe bandwidth is the big killer here, along with the latency of having memory so far from the GPU, on an add-in card in a PCIe slot.

 

Also, I don't think sharing the memory is realistically possible without the CPU and GPU dies sharing the same memory controller interface. Sure, technologies like CXL exist, but that's not a replacement for main memory.

 

I haven't really followed the conversation, only read this one comment from you, but if it's what I think it is, I can't see GPU add-in cards without memory being feasible basically ever.


25 minutes ago, porina said:

If I understand the scenario you are proposing, the main GPU is still an add-in card, with no significant local VRAM, maybe some small cache at most? The unified RAM is on the mobo.

Yes, basically. Like it traditionally is on consoles.

I mean, I acknowledged it's not easy; you need really fast connections and fast RAM, DDR5 isn't going to cut it.

And sure, faster RAM sounds expensive, and it would be initially, but prices would drop to normal levels over time, hopefully. I mean, that's another thing I noticed: PCs should really have the fastest RAM available, so basically what's on GPUs now... the PS5 does that, for example, afaik?

That is also one of the advantages of unified memory: you just use the fastest memory available, not just some good-enough office stuff : D

 

Oh, and meanwhile I also think the GPU chip should probably slot right into the motherboard for better connectivity - and at that point you're right, I don't know why we would still need AIBs? They could still make coolers (those GPUs will still need to be cooled, obviously) but that's probably not as lucrative.

 

 

I do think this will happen anyway, at the latest when APUs are just as fast as "GPUs on cards", or when making traditional GPU add-in cards just isn't feasible anymore (I expect this to happen within 10 years, could be wrong of course). It's just not a very efficient way to do it long term, but some companies with foresight could possibly jump on that wagon earlier.

 

25 minutes ago, porina said:

DDR5 on the GPU would already work better than DDR5 system RAM, since it wouldn't be limited by the PCIe bus. Even if we assume it supported PCIe 5.0 x16, since this would be a future GPU, that's still below what DDR5 provides.

No, no, 64GB of unified GDDR6X on the board (or wherever it fits best, honestly; the point is that it needs to be upgradable and unified). Think big = )

 

 

 


29 minutes ago, Thaldor said:

As I said, iGPUs use the APU's cache to combat this problem. They have a dedicated partition in RAM for storage, and then they leverage the cache to, in a way, already do the tiered-VRAM thing, where immediately needed data is transferred from the DDR RAM into the APU cache.

But couldn't the GPU just have some cache... if that's what's needed, then that seems like a no-brainer?

 

"3D cache" there, sells by itself!  : p

 

 

24 minutes ago, leadeater said:

haven't really followed the conversation, only read this one comment from you, but if it's what I think it is, I can't see GPU add-in cards without memory being feasible basically ever.

Yeah, I changed my mind on that; I don't see why it needs to be an add-in card. It would just be like a second CPU you slot into the motherboard, with aftermarket coolers...

 

The difference from consoles would be that it wouldn't be soldered on; I don't think that makes a big difference?

 

PS: Yeah, I realize you didn't quote me, but it was my idea to have unified RAM, and then the question was about add-in cards. And yeah, after thinking about it, add-in cards wouldn't make sense in that scenario, mostly for connectivity reasons (latency).


10 minutes ago, Mark Kaine said:

But couldn't the GPU just have some cache... if that's what's needed, then that seems like a no-brainer?

 

"3D cache" there, sells by itself!  : p

It can, but it would likely need so much that you may as well put main memory on the card. The slower and higher-latency the remote memory, the more cache you'd need. Cache isn't really getting much denser, unlike logic, so you'd have really expensive dies without many actual GPU execution units.

 

I don't think a suitable balance could be achieved, not through a PCIe bus.
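To make the cache-versus-remote-memory point concrete, the usual average-memory-access-time estimate shows how much harder the cache has to work once the backing memory moves to the far side of a PCIe hop. Every number below is invented purely for illustration:

# AMAT = hit_time + miss_rate * miss_penalty (all values illustrative, in ns).

def amat(hit_time, miss_rate, miss_penalty):
    return hit_time + miss_rate * miss_penalty

on_card_gddr  = 250    # miss penalty with memory on the card
over_pcie_hop = 1000   # miss penalty with memory across the PCIe bus

print(amat(20, 0.30, on_card_gddr))    # ~95 ns with a modest cache hit rate
print(amat(20, 0.30, over_pcie_hop))   # ~320 ns: same cache, remote memory
print(amat(20, 0.075, over_pcie_hop))  # back to ~95 ns only with a ~4x lower miss rate,
                                       # i.e. a much larger (and pricier) cache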


11 minutes ago, leadeater said:

The slower and higher-latency the remote memory, the more cache you'd need

Yeah, that is the main argument against unified RAM, it seems, and I don't know enough about it, but I think with the right design that could be overcome. That means no add-in cards; the GPU is slotted into the motherboard... the RAM could literally be between the CPU and GPU... not sure honestly, but it's what I said right away: they would need to find ways so that latency simply isn't an issue, hence putting everything on the motherboard seems possible?

 

 

PS: You could still put the RAM around the GPU, just like on a card... no latency issues (it would still need a small cache, probably)! And the CPU doesn't seem to care as much? I don't know, it just seems possible to me ¯\_(ツ)_/¯


26 minutes ago, Mark Kaine said:

Yeah, that is the main argument against unified RAM, it seems, and I don't know enough about it,

Unified is fine; it's how it's done that matters. Apple's M1/M2 is one example of it being feasible, the consoles another, the Steam Deck and others like it, etc. Where unified works currently is when the CPU and GPU share the same memory controller connecting to the same memory, and are either within the same die or on the same package with a very high speed die-to-die interconnect.

 

What you are talking about is very similar to the Xbox 360. That had unified memory, and the GPU package had the memory controller and was connected directly to the memory. There was an interconnect to the FSB for the CPU, and the CPU had to access memory through the GPU memory controller. The bandwidth between the GPU and CPU was 10GB/s. While we can do much faster interconnects today, the difference between those and memory bandwidth is now many, many times greater.

 

So if you want to do as you suggest, the GPU will likely have to be the central part of the system and the CPU the remote one, rather than the other way round, which is more what you are talking about. The CPU would still end up greatly bandwidth- and latency-starved though.
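Putting rough numbers on that "many, many times greater" gap, using commonly cited ballpark figures for the 360 (about 22.4 GB/s of GDDR3 main-memory bandwidth against the roughly 10 GB/s CPU link mentioned above) versus a current mid-high card and PCIe 5.0 x16:

# Ratio of GPU memory bandwidth to the CPU-side link, then versus now (ballpark figures).

xbox360_mem_bw, xbox360_cpu_link = 22.4, 10.0    # GB/s
modern_gddr_bw, pcie5_x16        = 504.0, 64.0   # GB/s, e.g. RTX 4070 Ti class

print(xbox360_mem_bw / xbox360_cpu_link)   # ~2x gap in 2005
print(modern_gddr_bw / pcie5_x16)          # ~8x gap today, worse on higher-end cards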


15 minutes ago, leadeater said:

Unified is fine; it's how it's done that matters. Apple's M1/M2 is one example of it being feasible, the consoles another, the Steam Deck and others like it, etc. Where unified works currently is when the CPU and GPU share the same memory controller connecting to the same memory, and are either within the same die or on the same package with a very high speed die-to-die interconnect.

 

What you are talking about is very similar to the Xbox 360. That had unified memory, and the GPU package had the memory controller and was connected directly to the memory. There was an interconnect to the FSB for the CPU, and the CPU had to access memory through the GPU memory controller. The bandwidth between the GPU and CPU was 10GB/s. While we can do much faster interconnects today, the difference between those and memory bandwidth is now many, many times greater.

 

So if you want to do as you suggest, the GPU will likely have to be the central part of the system and the CPU the remote one, rather than the other way round, which is more what you are talking about. The CPU would still end up greatly bandwidth- and latency-starved though.

Hmm, I see, so basically this would work, but the CPU and GPU would need to share the memory controller and possibly be on the same die (though the latter doesn't seem to be absolutely necessary)?

 

If that's the case it seems unlikely now, but in the future AMD, Intel, maybe Nvidia could pull something like this off?


3 hours ago, leadeater said:

That sort of bandwidth is possible with LPDDR5 at low capacities. Apple is already offering 400GB/s and 800GB/s options in the M1 generation, the first one being only 4 packages on a 512-bit bus.

I neglected LPDDR, possibly because I've never been exposed to it. Trying to read up on it now, the way it is specified doesn't make sense to me.

 

Looking at the M1 offerings as an example, I'd summarise that the technology reminds me a bit of HBM, in the sense that it is dense and high-bandwidth. Where it differs from HBM is that LPDDR is narrow(er) and faster, whereas HBM is fat and slow. So LPDDR should be preferable in CPU use cases, where access times may be more important than on a GPU. Still, in real-world examples of LPDDR and HBM, both are often physically located very close to where they are being used. I can only assume their promised speeds won't be realisable at a distance. GDDR might stretch a bit further, and DDR further still.
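One way to see the "fat and slow versus narrow and fast" distinction is that the same bandwidth can come from very different shapes. The per-pin rates below (HBM2e at 3.2 Gbps/pin, LPDDR5 at 6400 MT/s) are just representative grades picked for the comparison:

# Same bandwidth, two shapes: one very wide, slower-per-pin HBM stack versus many
# narrow, fast LPDDR5 channels.

hbm2e_stack    = 1024 * 3.2e9 / 8 / 1e9    # 1024-bit stack @ 3.2 Gbps/pin -> ~410 GB/s
lpddr5_channel = 16 * 6400e6 / 8 / 1e9     # 16-bit channel @ 6400 MT/s    -> 12.8 GB/s

print(hbm2e_stack / lpddr5_channel)        # ~32 LPDDR5 channels (512 bits total) to match

Which, not coincidentally, is the same 512-bit arrangement mentioned above for the M1.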

 

3 hours ago, Mark Kaine said:

No, no, 64GB of unified GDDR6X on the board (or wherever it fits best, honestly; the point is that it needs to be upgradable and unified). Think big = )

You're dreaming, at least I can't see this happening in near generations, if ever. I still don't think you fully grasp the challenges this wish list comes with. There will have to be tradeoffs and you're unlikely to see it all in the way you've described. You want performance, upgradability. Can it be technically done? Maybe. But not in something that makes any sense at a consumer level.

 

IMO, if x86 were to go to high-performance unified memory, I see it happening like the existing products from Apple and the consoles. Core parts (CPU, GPU, RAM) will be integrated. Zero upgradability. Maybe the lower-tier expandable RAM described previously could give some hope for expansion. User-replaceable high-performance memory in a unified memory system just isn't going to happen.


13 minutes ago, porina said:

IMO, if x86 were to go to high-performance unified memory, I see it happening like the existing products from Apple and the consoles. Core parts (CPU, GPU, RAM) will be integrated. Zero upgradability. Maybe the lower-tier expandable RAM described previously could give some hope for expansion. User-replaceable high-performance memory in a unified memory system just isn't going to happen.

What I expect, at some point, is Intel to go "hey, we can get M2/M3 performance too" and do exactly what Apple did, but with a chiplet design that pairs the CPU with an iGPU that is a full-size dGPU die, with the system memory and GPU memory on top of the chip. So instead of a crappy iGPU you end up with a full-fledged dGPU die sitting next to the CPU die on its own interconnect.

 

That said, Intel has never, EVER, shown reasonable performance from an iGPU, and I doubt they would ever be able to sell such a chip to anyone except Sony/Microsoft for a future console. Microsoft was already burned once by Intel with the original Xbox, so I don't see that happening, and both Sony and Microsoft are a "no thanks" on Nvidia (now it's Nintendo's turn to get burned by Nvidia).

 

So maybe in a laptop design that might happen, but I don't see a desktop ever going in this direction. It's more likely they would utilize a multi-CPU platform, but instead put the GPU into the second socket with multiple DIMM channels to widen the bandwidth. Expensive to design, but probably the closest we would see on a desktop without completely abandoning expandable RAM. If we were to abandon RAM expandability, both the CPU and GPU would have to have the RAM on top of the package, and I just don't see that being viable with the space-heater TDPs of Intel, AMD and Nvidia chips.


52 minutes ago, Kisai said:

What I expect, at some point, is Intel to go "hey, we can get M2/M3 performance too" and do exactly what Apple did, but with a chiplet design that pairs the CPU with an iGPU that is a full-size dGPU die, with the system memory and GPU memory on top of the chip. So instead of a crappy iGPU you end up with a full-fledged dGPU die sitting next to the CPU die on its own interconnect.

The upcoming Meteor Lake is going in that direction. The memory will remain external, but Intel will have the flexibility to scale core and iGPU dies separately. I'm not aware of any moves away from the existing DDR approach yet, though.

 

54 minutes ago, Kisai said:

That said, Intel has never, EVER, shown reasonable performance from an iGPU, and I doubt they would ever be able to sell such a chip to anyone except Sony/Microsoft for a future console.

Mobile offerings since Tiger Lake haven't been that different from the AMD equivalents, at least the ones in the DDR4 era anyway. To step up a level it does require moving away from system DDR.

 

54 minutes ago, Kisai said:

So maybe in a laptop design that might happen, but I don't see a desktop ever going in this direction. It's more likely they would utilize a multi-CPU platform, but instead put the GPU into the second socket with multiple DIMM channels to widen the bandwidth.

Socket-to-socket bandwidth sucks, the same problem AMD has had since they went chiplet with Infinity Fabric. It's why 3D V-Cache exists.

 

54 minutes ago, Kisai said:

If we were to abandon RAM expandability, both the CPU and GPU would have to have the RAM on top of the package, and I just don't see that being viable with the space-heater TDPs of Intel, AMD and Nvidia chips.

Mainly 2D tile structures are already being used by both AMD and Intel. The third dimension can be reserved for less power-limiting uses.


 

8 minutes ago, porina said:

Socket-to-socket bandwidth sucks, the same problem AMD has had since they went chiplet with Infinity Fabric. It's why 3D V-Cache exists.

That's just one option, and the core problem is always latency. Maybe someday we'll get those optical nanotube chips.

 

8 minutes ago, porina said:

Mainly 2D tile structures are already being used by both AMD and Intel. The third dimension can be reserved for less power-limiting uses.

Considering that the operating temperatures of Intel/AMD/Nvidia CPUs and GPUs are at or above 95°C while RAM chips have a maximum operating temperature of 85°C, those CPU/GPU TDPs need to come down a LOT.


18 hours ago, AluminiumTech said:

That would please the 4 people with RTX 4080s but what about everybody else?

 

4090s sold something like 160,000 in a month; I'd think 4080s would have sold at least half that in double the time. I.e., "4" vs 80,000: I think that is quite the exaggeration.


16 hours ago, ewitte said:

4090s sold something like 160,000 in a month; I'd think 4080s would have sold at least half that in double the time. I.e., "4" vs 80,000: I think that is quite the exaggeration.

A) They sold so badly that most of them are still sitting on shelves unsold, or were bought and returned, and B) my point still stands: RTX 4080 owners are not a large portion of PC gamers.


On 1/21/2023 at 4:33 AM, AluminiumTech said:

A) They sold so badly that most of them are still sitting on shelves unsold, or were bought and returned, and B) my point still stands: RTX 4080 owners are not a large portion of PC gamers.

That is why I divided the 4090 sales by 4. The 4090 has been selling through pretty quickly. It's not OOS for long, but they aren't just sitting on the shelves.


16 hours ago, ewitte said:

That is why I divided the 4090 sales by 4. The 4090 has been selling through pretty quickly. It's not OOS for long, but they aren't just sitting on the shelves.

They literally are, though. The few people who buy them are returning them.

