
I am interested in what an alternative future for gaming could have looked like if multi-GPU set-ups hadn't been abandoned in favour of RTX-enabled cards and machine-learning solutions such as upscaling and frame generation. However, I am not especially technically minded and want to get in contact with people who have in-depth knowledge of the hardware and software side, and I could use some advice on where is the best place to try and meet them.

 

As someone who doesn't like wasting old hardware when upgrading, and who is aware that on the Linux side of things there has been great progress getting RTX-only games running on non-RTX cards, I suspect there is a user base with the same reluctance to pursue new hardware. It also suggests that Nvidia's own interests motivated dropping support for things like SLI, as it clearly isn't the case that you need the newer cards' RTX chipsets to have those features. They are just perhaps a bit more performant in that regard.

 

As an example: Could you have SLI and have one GPU handling rasterisation, whilst the other handles ray-tracing? The emphasis on ray-tracing wasn't as strong at the time, so maybe that was never considered, but as a layman it doesn't seem especially implausible to consider it. The approach to SLI, or what was on offer at the time in terms of support for the feature, didn't seem to make the most of the extra hardware.

 

What I am curious to know (if anyone reading this happens to be a bit of a mad scientist in either hardware design or programming) is: what would your craziest (but still possible) approach be to making it possible for the end user to set up their own gaming environment? The PC is, after all, modular in its design, so if you could create a new piece of hardware and have it supported by software specifically programmed to enable it, what do you reckon the limits might be? Or perhaps it doesn't even need new hardware, but instead just new thinking and application of programming.

 

I have my own idea for something but lack the know-how to tell whether it is feasible, either technically or in terms of programming it. But given that we have seen SLI between two different GPU architectures (which wasn't possible initially), RTX features on non-RTX cards, and multi-GPU being used for machine learning and AI training while being dropped for gaming, it would seem that a whole avenue of progression was taken off the table by Nvidia focusing on the AI element.

 

If you have some thoughts on what that progression could still be, or can point me to people who do, I'd really like to hear from you, as I believe there could be a market for a potential solution. Whether that's likely is what I intend to find out.

 


14 minutes ago, not ken kutaragi said:

As someone who doesn't like wasting old hardware when upgrading

There are 3 very easy solutions to solve this:

  1. Sell the old hardware. You might not need it, but likely someone else does.
  2. Repurpose it for a new PC. Say, get rid of your gramps' aging celeron that takes 5min to boot due to not even having an SSD.
  3. Use it for distributed computing.

Personally, I suggest you go down the 3rd option.

16 minutes ago, not ken kutaragi said:

Could you have SLI and have one GPU handling rasterisation, whilst the other handles ray-tracing?

Newer APIs like DX12 actually support mixed rendering; you could very much use an Nvidia + Intel GPU setup to render a game...

 

...as long as said game implements the feature, which no game does.
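For anyone wondering what "the game has to implement it" means in practice: under D3D12's explicit multi-adapter model, the application itself enumerates every GPU, creates a device per adapter, and is then responsible for all cross-GPU copies and synchronization. Here's a minimal enumeration sketch (Windows/C++, error handling mostly omitted, purely illustrative) showing where that work starts:

```cpp
#include <d3d12.h>
#include <dxgi.h>
#include <wrl/client.h>
#include <cstdio>
#include <cwchar>

#pragma comment(lib, "d3d12.lib")
#pragma comment(lib, "dxgi.lib")

using Microsoft::WRL::ComPtr;

int main() {
    // With D3D12 "explicit multi-adapter", the application (not the driver)
    // enumerates the GPUs and decides what work runs on which device.
    ComPtr<IDXGIFactory1> factory;
    if (FAILED(CreateDXGIFactory1(IID_PPV_ARGS(&factory)))) return 1;

    ComPtr<IDXGIAdapter1> adapter;
    for (UINT i = 0;
         factory->EnumAdapters1(i, &adapter) != DXGI_ERROR_NOT_FOUND; ++i) {
        DXGI_ADAPTER_DESC1 desc;
        adapter->GetDesc1(&desc);
        if (desc.Flags & DXGI_ADAPTER_FLAG_SOFTWARE) continue; // skip WARP

        // One independent D3D12 device per physical GPU. Sharing resources
        // and synchronizing work between devices is entirely the app's job.
        ComPtr<ID3D12Device> device;
        if (SUCCEEDED(D3D12CreateDevice(adapter.Get(), D3D_FEATURE_LEVEL_11_0,
                                        IID_PPV_ARGS(&device)))) {
            wprintf(L"Created D3D12 device on adapter %u: %s\n",
                    i, desc.Description);
        }
    }
    return 0;
}
```

Everything interesting, deciding what each device renders, sharing resources, keeping fences in sync, sits on top of this, and that is exactly the part no shipping game bothers with.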

Want to help researchers improve the lives of millions of people with just your computer? Then join World Community Grid distributed computing, and start helping the world solve its most difficult problems!

 


Right now the best use is LSFG: frame generation and video output run on the second GPU while the game renders on the primary GPU. This gives you very low latency (lower FG latency than NVIDIA MFG with Anti-Lag).

Works quite well actually.


@Imakuni I appreciate the response, but it's not really what I'm getting at. Yes, I could sell old hardware, but what I'm asking is: to those "in the know" (which I'm not), what could we be doing to repurpose it?

 

EDIT: On the Nvidia side of things they are fairly reticent about providing open drivers etc., which is why Linux seems to have far more development where the individual can achieve flexibility. On that OS you have RTX games running on non-RTX hardware, so maybe one day someone will write an interpretation layer that does SLI / CrossFire / a hybrid of AMD and Nvidia where any game gets its commands split up between GPUs.

 

It doesn't exist yet, but the point is asking whether it's possible now that there have been developments in DirectX 12 etc. You won't see game developers doing it, but that's not the same as saying it isn't doable; they just don't have the financial imperative to explore it.

 

@WereCat Lossless Scaling is a good example of someone not affiliated with graphics card manufacturers or "big-name" developers using multi-GPU to achieve a better measure of performance.

 

If they can have two GPUs communicating in this way (which wasn't how SLI was ever implemented, at least to my understanding of how Nvidia offered it within the development environment), then what is still left on the table for this kind of set-up?

 

Their implementation was either: one card handles a frame while the second handles the next, which isn't the doubling of performance you'd expect in terms of polygons / texturing but instead a kind of alternating duplication. The other two options were: a) each card draws part of the screen to make a whole, or b) one handles the anti-aliasing side of it (which is maybe what Lossless Scaling taps into?).
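To make those modes concrete, here's a tiny toy sketch (no real graphics API, just my own illustrative work-assignment functions) of how alternate frame rendering (AFR) and split frame rendering (SFR) divide work between two GPUs:

```cpp
#include <cstdio>

// Toy model: two GPUs, identified as 0 and 1. Nothing is actually rendered;
// the point is only how work gets assigned under each classic multi-GPU mode.

// Alternate Frame Rendering: GPUs take turns, one whole frame each.
int afr_gpu_for_frame(int frame_index) {
    return frame_index % 2; // even frames -> GPU 0, odd frames -> GPU 1
}

// Split Frame Rendering: each GPU draws part of every frame.
// Here we naively split the screen in half by scanline.
int sfr_gpu_for_scanline(int y, int screen_height) {
    return (y < screen_height / 2) ? 0 : 1; // top half -> GPU 0, bottom -> GPU 1
}

int main() {
    const int height = 1080;
    for (int frame = 0; frame < 4; ++frame) {
        printf("AFR: frame %d rendered entirely by GPU %d\n",
               frame, afr_gpu_for_frame(frame));
    }
    printf("SFR: scanline 100 -> GPU %d, scanline 900 -> GPU %d\n",
           sfr_gpu_for_scanline(100, height), sfr_gpu_for_scanline(900, height));
    return 0;
}
```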

 

The example I gave was one card handling the graphics and a second handling ray-tracing, but since that isn't likely to be implemented by developers who are now focused on one GPU handling it all, it seems like it's going to take a bedroom coder to consider it.

 

 


1 hour ago, not ken kutaragi said:

@Imakuni I appreciate the response, but it's not really what I'm getting at. Yes, I could sell old hardware, but what I'm asking is: to those "in the know" (which I'm not), what could we be doing to repurpose it?

V

3 hours ago, Imakuni said:
  • Repurpose it for a new PC. Say, get rid of your gramps' aging celeron that takes 5min to boot due to not even having an SSD.
  • Use it for distributed computing.


 


23 hours ago, not ken kutaragi said:

The example I gave was one card handling the graphics and a second handling ray-tracing, but since that isn't likely to be implemented by developers who are now focused on one GPU handling it all, it seems like it's going to take a bedroom coder to consider it.

That would only be beneficial if raster and RT computation could be done independently in parallel, with little to no crosstalk, rather than interleaved. Otherwise latency and bandwidth limitations of the PCIe bus would eat up any potential benefit.
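To put rough numbers on that (my own assumptions, not measurements): suppose the raster GPU had to ship a 1080p G-buffer to the RT GPU every frame over PCIe 4.0 x16, which peaks at roughly 32 GB/s:

```cpp
#include <cstdio>

int main() {
    // Assumptions (illustrative only): a 1920x1080 G-buffer with four
    // 16-byte-per-pixel render targets shipped across PCIe every frame.
    const double pixels          = 1920.0 * 1080.0;
    const double bytes_per_px    = 4 * 16;              // 4 RTs, 16 B each (assumed)
    const double bytes_per_frame = pixels * bytes_per_px;

    const double pcie_bytes_per_s = 32.0e9;             // ~PCIe 4.0 x16 peak
    const double transfer_ms = bytes_per_frame / pcie_bytes_per_s * 1000.0;

    const double frame_budget_ms = 1000.0 / 60.0;       // ms per frame at 60 fps

    printf("G-buffer size per frame: %.1f MB\n", bytes_per_frame / 1.0e6);
    printf("Transfer time at 32 GB/s: %.2f ms (of a %.2f ms frame budget)\n",
           transfer_ms, frame_budget_ms);
    return 0;
}
```

Even at theoretical peak bandwidth that works out to around 4 ms, a quarter of a 60 fps frame budget, gone on the transfer alone, before any synchronization overhead.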

 

Even with LS using two GPUs, I would question the practical use in terms of power to performance. How much performance do you gain, and how much additional power draw and heat output does that require?



The new major technological leap in 3D graphics computation is 3D Gaussian Splatting rather than current rasterization techniques. I believe it certainly lends itself well to distributed workloads, but I don't think it will lead to multi-GPU setups. It's more advantageous to simply add more processing power and VRAM onto a single card instead.


1 hour ago, Hideki Ryuga said:

I believe it certainly lends itself well to distributed workloads,

So do raster and RT, provided you have no issues with frame pacing.

 

GPUs are nothing but massively parallel processors optimized for graphics related operations (which also makes them suitable for other things that benefit from parallel computation).

 

Scaling graphics across multiple separate cards is held back by the need to keep them in sync. Multi-GPU is mainly useful when operations don't need to run in lock step.

 

Even GPU chiplets suffer from latency between their interconnects and they are much closer together than multiple separate cards.



11 hours ago, Eigenvektor said:

So do raster and RT, provided you have no issues with frame pacing.

 

GPUs are nothing but massively parallel processors optimized for graphics related operations (which also makes them suitable for other things that benefit from parallel computation).

 

Scaling graphics across multiple separate cards is held back by the need to keep them in sync. Multi-GPU is mainly useful when operations don't need to run in lock step.

 

Even GPU chiplets suffer from latency between their interconnects and they are much closer together than multiple separate cards.

Except that you could solve the signal integrity/signal distance problem with better engineering. It would be very hard to convince me that the actual physical distance between cards is a real problem when the signals travel at at least 0.7c. The solution may be more expensive but certainly possible.
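For what it's worth, a quick back-of-envelope check (my own numbers, purely illustrative) supports the idea that raw propagation delay over card-to-card distances is negligible next to a frame budget; the sync and bandwidth costs discussed elsewhere in the thread are a separate matter:

```cpp
#include <cstdio>

int main() {
    // Assumptions (illustrative): signal velocity ~0.7c over a ~10 cm
    // card-to-card trace/bridge, compared against a 60 fps frame budget.
    const double c          = 3.0e8;          // m/s
    const double velocity   = 0.7 * c;        // effective signal speed
    const double distance_m = 0.10;           // 10 cm between GPUs (assumed)

    const double delay_ns = distance_m / velocity * 1.0e9;
    const double frame_ms = 1000.0 / 60.0;
    const double ratio    = (frame_ms * 1.0e6) / delay_ns;  // both in ns

    printf("One-way propagation delay: %.3f ns\n", delay_ns);
    printf("Frame budget at 60 fps: %.2f ms, roughly %.0f million times longer\n",
           frame_ms, ratio / 1.0e6);
    return 0;
}
```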

 

It's simply not economical for companies to do R&D on.


@Eigenvektor Questions that you asked are the reason why I wanted to speak to people with both engineering and programming knowledge. I understand there are things to consider, such as the heat generated by extra hardware and the power consumption.

 

Giving an example: I was still gaming with a 970 and an i5 4460. Up to the release of Doom Eternal I had a set-up that could deliver fairly close to a 60 fps 1080p experience. Fast forward to Doom The Dark Ages and I can't run it at all due to the RTX requirements.

 

Over on Linux they have the same game running on roughly equivalent AMD hardware that doesn't support RTX. In general you can see the action still hovering between 40 and 50 fps, and it's only the RTX side of it that starts dragging that framerate down. Fundamentally the engine "evolution" doesn't come down to more polygons or higher texture detail etc., but to changes in lighting / reflections / 3D sound being handled by dedicated hardware.

 

With my old set-up of an integrated GPU and a discrete GPU, you always had some amount of potential power never being used. If the focus was on more polygons and better framerates, then wouldn't it be nice for anyone who has both types of GPU (as outdated as that has become) to be able to use every piece of hardware in their system?

 

This is assuming that the CPU / GPU are separate when using an integrated chip, as I'm not in the know as to how it performs. But with 128 MB of VRAM... I'm guessing it's not much utilised (or even at all), even though it's pretty capable of putting out a couple of million polygons. Whatever it's actually capable of, it's all something that could enhance the discrete GPU... in theory.


@Hideki Ryuga This is slightly off-topic, but just out of curiosity: I understand that NURBS aren't how GPUs display graphics, but if your models were designed around NURBS and then turned into polygons through tessellation, would you ever need LODs?

 

Since NURBS are all mathematical data, I wondered if they'd just scale up at any distance and then it becomes a question of how much detail they have when tessellated. Fair enough if you don't know, it's just that you mentioned other graphics techniques.


Yes, it's theoretically possible to use another GPU to do different parts of the graphics pipeline.

 

No, it's not really practical due to many different issues. Someone made a similar topic a couple of months ago and it was discussed at length:

 

tldr;

- having 2 GPUs in your system is annoying and hard due to physical constraints, driver issues, and power supply requirements, so it doesn't make any sense for the great majority of folks

- keeping things in sync across the pipeline is a pain and ends up decreasing performance too much. Games are really latency sensitive, so it's hard to deliver a good experience this way

 

Multi-GPU systems are still widely used for compute; there are no issues or gotchas in this regard.



16 hours ago, Hideki Ryuga said:

Except that you could solve the signal integrity/signal distance problem with better engineering. It would be very hard to convince me that the actual physical distance between cards is a real problem when the signals travel at at least 0.7c. The solution may be more expensive but certainly possible.

I wouldn't discount distance entirely. While data moves at a significant fraction of light speed, as you said, that speed is effectively a constant. As such any latency introduced by signal propagation time scales purely based on distance.

 

But as you said, it likely isn't distance alone. Moving components closer together (from separate cards to chiplets) apparently still has latency issues. Though my suspicion is that we're looking at different causes that simply result in the same symptoms.

 

While chiplets on the same card are closer together, you now have to deal with shared resources (e.g. VRAM) which requires additional coordination. Effective work distribution and resulting core utilization is probably also more complex than it already is within a single chip.

 

16 hours ago, Hideki Ryuga said:

It's simply not economical for companies to do R&D on.

As far as I know, both AMD and Nvidia are looking into GPU chiplets because it promises lower cost (smaller chips, better yields) than single monolithic chips. So I would assume it is going to be economical to invest in it long term.

 

5 hours ago, not ken kutaragi said:

@Eigenvektor Questions that you asked are the reason why I wanted to speak to people with both engineering and programming knowledge.

Just to be clear, the first paragraph was a statement, not a question. RT and raster are both graphics and they cannot happen independently from one another. Splitting this work across two cards would require extensive communication between them.

 

5 hours ago, not ken kutaragi said:

Fast forward to Doom The Dark Ages and I can't run it at all due to the RTX requirements.

RT requirement, not RTX requirement. RTX is simply Nvidia's branding for their implementation of hardware accelerated ray tracing. They did not invent RT, nor are they the first company to build dedicated hardware for it. They were simply the first to have mainstream commercial success with it.

 

5 hours ago, not ken kutaragi said:

Over on Linux they have the same game running on roughly equivalent AMD hardware that doesn't support RTX. In general you can see the action still hovering between 40 and 50 fps, and it's only the RTX side of it that starts dragging that framerate down. Fundamentally the engine "evolution" doesn't come down to more polygons or higher texture detail etc., but to changes in lighting / reflections / 3D sound being handled by dedicated hardware.

Can you provide a source? Are you sure you're not confusing RTX and RT? AMD cards do have hardware RT acceleration since RDNA 2, it just isn't as performant as Nvidia's implementation. If they are doing this on RDNA 1 or older, I suspect this is some driver level software emulation running on shaders or similar.

 

Though I'm not sure how this ties into the whole multi-GPU topic.

 

5 hours ago, not ken kutaragi said:

This is assuming that the CPU / GPU are separate when using an integrated chip, as I'm not in the know as to how it performs. But with 128 MB of VRAM... I'm guessing it's not much utilised (or even at all), even though it's pretty capable of putting out a couple of million polygons. Whatever it's actually capable of, it's all something that could enhance the discrete GPU... in theory.

No, it really couldn't. Just calculating an arbitrary number of polygons isn't enough if you want to do graphics with a high enough and consistent frame rate. There's a reason multi-GPU generally requires two identical GPUs.

 

If your GPUs take turns rendering a frame (alternate frame rendering, AFR), one card can't be significantly more powerful than the other. That would result in wild swings in frame time. Even with identical GPUs and dedicated hardware (SLI bridges), frame pacing issues (micro stutters) are often a problem.
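A tiny simulation (toy numbers of my own, not a benchmark) makes this easy to see: model AFR with a 10 ms card paired with a 25 ms card, let each start its next frame as soon as it's free, and require frames to be presented in order:

```cpp
#include <algorithm>
#include <cstdio>

int main() {
    // Toy AFR model (my own assumptions, not a benchmark):
    // GPU 0 needs 10 ms per frame, GPU 1 needs 25 ms. Each GPU starts its
    // next frame as soon as it's free, but frames must be presented in order.
    const double render_ms[2] = {10.0, 25.0};

    double gpu_free[2]  = {0.0, 0.0};  // time at which each GPU becomes idle
    double prev_present = 0.0;

    for (int frame = 0; frame < 8; ++frame) {
        int gpu        = frame % 2;                       // AFR: take turns
        double finish  = gpu_free[gpu] + render_ms[gpu];
        double present = std::max(finish, prev_present);  // in-order presents
        printf("frame %d on GPU %d: frame time %5.1f ms\n",
               frame, gpu, present - prev_present);
        gpu_free[gpu] = finish;
        prev_present  = present;
    }
    // Frame times quickly settle into a swing between ~0 ms and ~25 ms:
    // average throughput looks fine, but the pacing is terrible (micro stutter).
    return 0;
}
```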

 

If each GPU renders some portion of the frame (split frame rendering, SFR), distribution of work between them becomes more complex the bigger the divide in their performance is. In the easiest case, each one renders half of the image. But if the work required for one half (e.g. mostly empty sky) is much easier than the other (e.g. complex scenery), one of them will likely be done much earlier, resulting in periods where one card sits idle waiting for the other to finish, effectively wasting available hardware resources.

 

This gets even more complex if you want to distribute load dynamically based on performance and scene complexity, because you effectively need to predict how long each card is going to take rendering its portion of the next frame. This prediction is unlikely to ever be perfect, so you'll constantly run into stalls where one card has to wait on the other, again wasting resources.
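As a sketch of what that dynamic balancing might look like (entirely hypothetical, not how any real driver or engine does it): nudge the SFR split each frame based on how long each card took on the previous one. It can only react after the fact, so a sudden change in scene complexity still leaves one card stalled:

```cpp
#include <algorithm>

// Hypothetical SFR load balancer: 'split' is the fraction of the screen
// given to GPU 0 (the rest goes to GPU 1). After each frame, nudge the
// split toward whichever card finished earlier. It only reacts to the
// *previous* frame, so a sudden change in scene complexity still stalls
// one card while the other idles.
double rebalance_split(double split, double gpu0_ms, double gpu1_ms) {
    // Rough per-fraction cost observed on each card last frame.
    double cost0 = gpu0_ms / split;
    double cost1 = gpu1_ms / (1.0 - split);

    // The split that would have made both cards finish at the same time.
    double ideal = cost1 / (cost0 + cost1);

    // Move only partway toward the estimate to avoid oscillating.
    double next = split + 0.5 * (ideal - split);
    return std::clamp(next, 0.1, 0.9);
}
```

For example, starting at a 50/50 split with GPU 0 finishing in 8 ms and GPU 1 in 16 ms, repeated calls drift the split toward giving GPU 0 about two thirds of the screen.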

 

Furthermore, each card requires access to the same data. So both need the same amount of VRAM, but it's not shared between them. Two cards with 8 GB each do not mean you now have 16 GB in total; you still only have 8 GB of usable VRAM. Which also means an iGPU with 128 MB is useless in this scenario, because it would constantly exceed its buffer, resulting in additional roundtrips to move data in and out of its dedicated memory pool.

 

There are scenarios where this kind of distribution is trivial (e.g. offline ray tracing), because each card can work at its own pace. If a card only contributes 1% of the work, you're still 1% faster. But that does not work for anything where results need to arrive at a constant pace and future output depends on data that isn't available yet (e.g. user input).



22 hours ago, not ken kutaragi said:

@Hideki Ryuga This is slightly off-topic, but just out of curiosity: I understand that NURBS aren't how GPUs display graphics, but if your models were designed around NURBS and then turned into polygons through tessellation, would you ever need LODs?

 

Since NURBS are all mathematical data, I wondered if they'd just scale up at any distance and then it becomes a question of how much detail they have when tessellated. Fair enough if you don't know, it's just that you mentioned other graphics techniques.

I don't think it would erase the need for LODs, as you can always reduce a more complex NURB into a less complex NURB that could be approximated at sufficient distance, thereby freeing up processing power for other tasks.
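In practice that amounts to the same decision an LOD system makes, just continuous rather than picking between a few pre-built meshes. A minimal, hypothetical sketch (made-up numbers) of distance-based tessellation selection:

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical: choose how many segments to tessellate a curved patch into,
// based on its distance from the camera. Closer patches get more triangles.
// Detail roughly halves every time the distance doubles (assumed falloff).
int tessellation_segments(double distance_m) {
    const int    max_segments = 64;   // detail when the patch is right in front
    const int    min_segments = 2;    // far away, a patch is nearly flat
    const double near_m       = 1.0;  // assumed "full detail" distance

    double level = max_segments / std::max(distance_m / near_m, 1.0);
    return std::clamp(static_cast<int>(std::lround(level)),
                      min_segments, max_segments);
}
```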

