
Discussing Myths: GeForce RTX. The Whole Thing

Mira Yurizaki



In this entry: GeForce RTX! And not just one thing but the whole thing, or at least as much as I can remember. As with the rest of the Discussing Myths series, the idea is to take assertions or claims that people have made and poke at them, presenting an argument against each one.


RTX is only about ray tracing

Perhaps the "RT" in "RTX" makes it confusing, but RTX is the name for the feature set that covers what Turing can do. In addition to hardware-accelerated ray tracing, it also includes DLSS.


DXR was something only NVIDIA collaborated with Microsoft to create

AMD contributed as well, even stating that they were "working closely with Microsoft": https://hexus.net/tech/news/graphics/116354-amd-nvidia-working-closely-microsoft-dxr-api/


Ray tracing hardware takes up a lot of space on the die

The first thing to note is we don't really have anything resembling a die shot of Turing other than this:



Note that this is from NVIDIA themselves. But if we were to try to map it to the block diagram:



And pretend that the die shot sort of resembles it, then the RT core is probably the part highlighted in red:



That's not a whole lot of space. Some quick and dirty math comparing the red highlighted area to the entire image puts it at about 3% of the space.


BUT, the thing to remember is that nobody has really verified whether the die shot NVIDIA provided is the actual Turing die.


RTX is pointless because ray tracing can be done "in software"

By "in software", people mean that dedicated hardware for ray tracing isn't necessary. Of course ray tracing can be done without hardware acceleration. Anything can be done without hardware acceleration. But that misses the point of "acceleration": RTX can do ray tracing faster. There is one caveat, though: the application has to explicitly give the "hint" that it wants to use hardware acceleration. Otherwise, to the GPU, it looks like any other generic workload.


So far, I've yet to see a direct comparison of the same application doing ray tracing on both hardware-accelerated RT and software RT. From what I can find, it's usually one or the other. What's most damning is that AMD, for whatever reason, won't even enable the DXR fallback layer on their cards, even though there's nothing that prevents those cards from running it.


As an aside, one of the foundations of modern GPUs today, a feature called hardware transform and lighting (hardware T&L), met with similar resistance. 3dfx infamously said that hardware T&L was not necessary as long as you had a fast enough CPU. And for the most part, at least for games at the time, they were right:


However, once games started taking advantage of hardware T&L, 3dfx's offerings no longer looked all that great: https://www.anandtech.com/show/580/6


DLSS is a super sampler/image quality improver

To poke at the first point, it's not a super sampler. Tom's Hardware did an analysis of what's going on behind the scenes and found that DLSS renders internally at a lower resolution, then upscales the result. This is not the definition of super sampling, which is to take more sample points than needed for a given pixel. The only "super sampling" part about it is that the so-called "ground truth" images used to train the AI are super sampled.


However, calling it an image quality improver is also, I would argue, not correct. Something that attempts to improve image quality should, I would argue, output at the same or a lower resolution than the original source image. Since DLSS is an upscaler, it fails to meet that criterion.
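To make the distinction between super sampling and upscaling concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `scene` function is a made-up hard diagonal edge, and `upscale_nearest` is a plain nearest-neighbour upscaler standing in for the upscaling step. It is not how DLSS itself works (DLSS uses a trained neural network); it only shows where the scene samples come from in each approach.

```python
def scene(x, y):
    """Hypothetical continuous 'scene': a hard diagonal edge (1.0 where y > x)."""
    return 1.0 if y > x else 0.0


def render(width, height, samples_per_axis=1):
    """Render the unit square to a width x height grid.

    Each pixel averages samples_per_axis**2 evenly spaced scene samples.
    samples_per_axis > 1 is super sampling: more samples than output pixels.
    """
    n = samples_per_axis
    image = []
    for py in range(height):
        row = []
        for px in range(width):
            total = 0.0
            for sy in range(n):
                for sx in range(n):
                    x = (px + (sx + 0.5) / n) / width
                    y = (py + (sy + 0.5) / n) / height
                    total += scene(x, y)
            row.append(total / (n * n))
        image.append(row)
    return image


def upscale_nearest(image, factor):
    """Enlarge an image by repeating pixels. No new scene samples are taken."""
    out = []
    for row in image:
        wide = [value for value in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


# Both results are 4x4 images, but they are built from very different data.
supersampled = render(4, 4, samples_per_axis=4)  # 16 scene samples per output pixel
upscaled = upscale_nearest(render(2, 2), 2)      # 1 scene sample per 4 output pixels
```

In the super sampled image the pixels along the edge get in-between values because they average many samples. The upscaled image only ever contains the values of the four samples it started with, so any extra detail has to be guessed by the upscaler.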



1 Comment

So while trying to figure out what exactly a circuit that could accelerate ray tracing looks like, in order to respond to your points about die area, I came across this article: What you need to know about ray tracing and NVIDIA's Turing architecture.

That article states:


"The crux of the matter is something called BVH traversal, short for Bounding Volume Hierarchy. This is basically a method for optimizing intersection calculations, where objects are bounded by larger, simpler volumes." 

"NVIDIA's solution is to have the Turing RT cores handle all the BVH traversal and ray-triangle intersection testing, which saves the SMs from spending thousands of instruction slots per ray.

The RT cores comprises of two specialized units. The first carries out the bounding box tests, while the second performs ray-triangle intersection tests and reports on whether it's a hit or not back to the SM. This frees up the SM to do other graphics or compute work. "
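For anyone wondering what those two tests look like when done "in software", here is a minimal Python sketch. It uses the textbook slab test for the bounding box and the Möller-Trumbore algorithm for the triangle. These are assumptions on my part: the article does not say which algorithms the RT cores implement, and all function names here are made up for illustration.

```python
def ray_hits_box(origin, direction, box_min, box_max):
    """Slab test: does the ray origin + t*direction (t >= 0) hit the axis-aligned box?"""
    t_near, t_far = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray is parallel to this pair of planes: it must start between them.
            if o < lo or o > hi:
                return False
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_hits_triangle(origin, direction, v0, v1, v2, eps=1e-9):
    """Möller-Trumbore test: return the hit distance t, or None on a miss."""
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:
        return None  # ray is parallel to the triangle's plane
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv  # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return t if t > eps else None


# A ray fired down the +z axis at a box and at a triangle lying in the z = 0 plane.
box_hit = ray_hits_box((0, 0, -5), (0, 0, 1), (-1, -1, -1), (1, 1, 1))
tri_hit = ray_hits_triangle((0, 0, -5), (0, 0, 1), (-1, -1, 0), (1, -1, 0), (0, 1, 0))
```

Even in this stripped-down form, each triangle test is a few dozen multiplies and adds plus several branches, and a real scene needs many box and triangle tests per ray. That gives a feel for the "thousands of instruction slots per ray" the article mentions.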

So, that narrowed down my parameters. I started a search for "bounding volume hierarchy traversal circuit" and came across an Intel patent filed in 2012: Graphics tiling architecture with bounding volume hierarchies


In part, the patent's abstract states:


" In some embodiments, tile lists may be avoided by storing the geometry of a scene in a bounding volume hierarchy (BVH). For each tile, the bounding volume hierarchy is traversed. The traversals continued only into children nodes that overlap with the frustum on the tile. By relaxing the ordering constraint of rendering primitives, the BVH is traversed such that nodes that are closer to the viewer are traversed first, increasing the occlusion culling efficiency in some embodiments. "

That sounds awfully similar to some official statements that have been made about how the RT Cores work. Reading this patent may be worthwhile as an introduction to the subject.

Additionally, this newer Samsung patent is more in-depth and even more similar to how NVIDIA has claimed RT cores work: https://patentimages.storage.googleapis.com/ac/e4/e1/ad9e4d9b32502a/US10049488.pdf


"A method of traversing an acceleration structure (AS) in a ray tracing system includes obtaining information about child nodes of a target node included in the AS; determining whether each of the child nodes intersects a ray based on the obtained information, determining a next target node among at least one child node that intersects the ray; and performing an operation corresponding to a type of the determined next target node."

Sounds even more similar to the claimed "BVH search and report an intersection hit to the SM".
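As a rough illustration of the traversal both patents describe, here is a minimal Python sketch of a stack-based BVH walk. It is only my reading of the quoted abstracts, not the patented method or NVIDIA's hardware: the `Node` class, the `box_entry` and `traverse` names, and the example scene are all hypothetical. For each target node it checks which children the ray intersects, visits the nearer ones first, and does something different depending on the node type (internal nodes are descended into, leaf nodes hand back their primitives for the exact triangle tests).

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Vec = Tuple[float, float, float]


@dataclass
class Node:
    box_min: Vec
    box_max: Vec
    children: List["Node"] = field(default_factory=list)  # set on internal nodes
    primitives: List[str] = field(default_factory=list)   # set on leaf nodes


def box_entry(origin: Vec, direction: Vec, node: Node) -> Optional[float]:
    """Slab test. Returns the distance at which the ray enters the box, or None."""
    t_near, t_far = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, node.box_min, node.box_max):
        if d == 0.0:
            if o < lo or o > hi:
                return None
            continue
        t0, t1 = sorted(((lo - o) / d, (hi - o) / d))
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return None
    return t_near


def traverse(root: Node, origin: Vec, direction: Vec) -> List[str]:
    """Collect primitives from every leaf whose box the ray enters."""
    candidates: List[str] = []
    stack = [root] if box_entry(origin, direction, root) is not None else []
    while stack:
        node = stack.pop()
        if not node.children:
            # Leaf: these primitives would go on to exact ray-triangle tests.
            candidates.extend(node.primitives)
            continue
        hits = []
        for child in node.children:
            t = box_entry(origin, direction, child)
            if t is not None:
                hits.append((t, child))
        # Push the farthest child first so the nearest one is popped next.
        for _, child in sorted(hits, key=lambda h: h[0], reverse=True):
            stack.append(child)
    return candidates


# Hypothetical scene: two boxes along the ray's path and one far off to the side.
near = Node((-1, -1, 1), (1, 1, 2), primitives=["near_tri"])
far = Node((-1, -1, 5), (1, 1, 6), primitives=["far_tri"])
side = Node((10, 10, 1), (11, 11, 2), primitives=["side_tri"])
root = Node((-1, -1, 1), (11, 11, 6), children=[far, side, near])

found = traverse(root, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
```

The box off to the side is rejected with a single box test and none of its contents are ever looked at, which is the whole point of the hierarchy.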
