Programming - GPU intensive things?

Hi P

What kind of programming tasks would benefit from a powerful GPU?

 

GPU parallelism comes to mind, anything else?

 

 


Programming games and physics engines.


 


1 hour ago, Hi P said:

What kind of programming tasks would benefit from a powerful GPU?

 

GPU parallelism comes to mind, anything else?

Anything where you have a massive amount of independent data pieces and you need to perform the same operation on all of them.
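
To make that concrete, here's a minimal CUDA sketch (the kernel name and the operation it applies are just made up for illustration): every thread takes one element of a large array, and all threads run the exact same operation on their element.

#include <cuda_runtime.h>

// Each thread handles one element; every thread runs the same operation.
__global__ void scale_and_offset(const float* in, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = 2.0f * in[i] + 1.0f;  // same simple operation on every element
}

// Launch with enough threads to cover all n elements, e.g.:
//   int threads = 256;
//   int blocks  = (n + threads - 1) / threads;
//   scale_and_offset<<<blocks, threads>>>(d_in, d_out, n);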


@Mira Yurizaki is bang on. You want highly parallel operations that can be applied across a large data set, e.g. image analyses or rendering.

 

There's a reason GPUs are made the way they are. You want to push as many chunks of independent data through simultaneously and as fast as you can. If your algorithms fit that description then they're probably a good fit.

 

 

 

 



1 hour ago, trag1c said:

@Mira Yurizaki E.g. image analyses 

Could you please elaborate a bit on this? It caught my attention because it sounds relatively simple. Is it, though?

 

How does it work?


32 minutes ago, Hi P said:

Could you please elaborate a bit on this? It caught my attention because it sounds relatively simple. Is it, though?

 

How does it work?

Let's take JPEG compression as an example. You break the image up into 8x8 blocks (independent data sets) and perform two matrix operations on each of them to get the final result. The two matrix operations are the same for all of the 8x8 blocks that the image is composed of. So if we have a UHD image (3840x2160), breaking it up into 8x8 blocks gives us about 129,600 blocks that all need the same two operations performed on them.
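
As a rough CUDA sketch of that structure (not a real JPEG encoder; dct8x8 and quantize8x8 are hypothetical placeholders for the two matrix operations), you could map one thread block of 8x8 threads onto each 8x8 pixel block:

// Placeholder transforms: a real encoder would implement the 8x8 DCT and
// quantization (the "two matrix operations") here.
__device__ void dct8x8(float block[8][8])      { /* ... */ }
__device__ void quantize8x8(float block[8][8]) { /* ... */ }

// One CUDA block of 8x8 threads per 8x8 pixel block; a 3840x2160 image
// gives a 480x270 grid, i.e. 129,600 blocks all doing the same work.
__global__ void jpeg_blocks(const float* image, float* out, int width)
{
    __shared__ float block[8][8];

    int x = blockIdx.x * 8 + threadIdx.x;  // pixel coordinates
    int y = blockIdx.y * 8 + threadIdx.y;

    block[threadIdx.y][threadIdx.x] = image[y * width + x];  // each thread loads one pixel
    __syncthreads();

    if (threadIdx.x == 0 && threadIdx.y == 0) {  // one thread per block applies the transforms
        dct8x8(block);
        quantize8x8(block);
    }
    __syncthreads();

    out[y * width + x] = block[threadIdx.y][threadIdx.x];
}

// Launch: dim3 grid(width / 8, height / 8), threads(8, 8);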


Machine learning, especially models with a deep neural network architecture, can gain a lot of performance from a good GPU. The reason is what @Mira Yurizaki just said: GPUs are designed to perform better than CPUs when it comes to many large matrix operations.
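
To see why that maps so well, here's a toy CUDA sketch of a dense matrix multiply, which is the core operation behind most neural network layers: one thread computes one output element, and every thread does the same dot-product work. (Real frameworks hand this to tuned libraries like cuBLAS rather than a naive kernel, but the shape of the problem is the same.)

// Naive matrix multiply C = A * B (all N x N, row-major):
// one thread per output element.
__global__ void matmul(const float* A, const float* B, float* C, int N)
{
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < N && col < N) {
        float sum = 0.0f;
        for (int k = 0; k < N; ++k)
            sum += A[row * N + k] * B[k * N + col];
        C[row * N + col] = sum;
    }
}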


  • 3 weeks later...
On 6/19/2019 at 4:29 PM, Hi P said:

What kind of programming tasks would benefit from a powerful GPU?

 

GPU parallelism comes to mind, anything else?

 

 

Graph theoretic algorithms. The number of simple computations grows quickly (some of these problems are NP-hard). You need fast processing of simple computations in parallel.


On 7/8/2019 at 7:57 PM, Eigencentrality said:

Graph theoretic algorithms. The number of simple computations grows quickly (some of these problems are NP-hard). You need fast processing of simple computations in parallel.

Are you sure? With graph theoretic algorithms you might need non-uniform control flow, which is a big no-no on GPUs.

 

Which is something others have brushed past, but it's hecka important. For efficiency's sake, you need every invocation* to have the same control flow, because every invocation will take every path.

 

* An invocation is like a thread, but per work-item (in a graphical setting, these may be vertices or pixels, in GPGPU, these can be anything)
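
Here's a small CUDA sketch of what that looks like in practice (path_a and path_b are hypothetical stand-ins for expensive per-element work):

// Hypothetical per-element work for each branch.
__device__ float path_a(float x) { return x * x; }
__device__ float path_b(float x) { return 1.0f / (1.0f + x * x); }

// If neighbouring threads in the same warp disagree on the condition,
// the warp is serialized and effectively pays for both paths.
__global__ void divergent(const float* in, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    if (in[i] > 0.0f)
        out[i] = path_a(in[i]);
    else
        out[i] = path_b(in[i]);
}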


5 minutes ago, Fourthdwarf said:

Are you sure? With graph theoretic algorithms you might need non-uniform control flow, which is a big no-no on GPUs.

 

Which is something others have brushed past, but it's hecka important. For efficiency's sake, you need every invocation* to have the same control flow, because every invocation will take every path.

 

* An invocation is like a thread, but per work-item (in a graphical setting, these may be vertices or pixels, in GPGPU, these can be anything)

GPUs can still be faster for graph algorithms. In a sense, ray tracing is also a graph traversal algorithm (directed acyclic graph) and GPUs do pretty well there (compared to similarly priced CPUs).



32 minutes ago, mathijs727 said:

GPUs can still be faster for graph algorithms. In a sense, ray tracing is also a graph traversal algorithm (directed acyclic graph) and GPUs do pretty well there (compared to similarly priced CPUs).

I'm unfamiliar with any implementation of ray tracing that works purely by operating on an acyclic graph. I can see the argument that you produce a DAG in 3D space, but that's not a graph problem. RTX does use octrees, but it has specific hardware to accelerate those octrees, and this only accelerates ray tracing (by culling objects) AFAIK. BSP trees are also used to similar effect. But in both cases you have acyclic graphs with relatively few edges, as opposed to well-connected graphs with cycles, which may cause issues.

 

Also, it's only relatively recently that GPUs have outperformed CPUs in raytracing, partly because detecting intersections is difficult without breaking uniform control flow.

 

So yeah, some graph algorithms do work well on GPUs, but add cycles, backtracking, and other common graph algorithm issues or techniques, and you have something possibly much less suited to GPUs. GPUs might find a small advantage here, but nothing near what other kinds of problems can have.


On 7/12/2019 at 11:14 AM, Fourthdwarf said:

I'm unfamiliar with any implementation of ray tracing that works purely by operating on an acyclic graph. I can see the argument that you produce a DAG in 3D space, but that's not a graph problem. RTX does use octrees, but it has specific hardware to accelerate those octrees, and this only accelerates ray tracing (by culling objects) AFAIK. BSP trees are also used to similar effect. But in both cases you have acyclic graphs with relatively few edges, as opposed to well-connected graphs with cycles, which may cause issues.

 

Also, it's only relatively recently that GPUs have outperformed CPUs in raytracing, partly because detecting intersections is difficult without breaking uniform control flow.

 

So yeah, some graph algorithms do work well on GPUs, but add cycles, backtracking, and other common graph algorithm issues or techniques, and you have something possibly much less suited to GPUs. GPUs might find a small advantage here, but nothing near what other kinds of problems can have.

RTX does not use octrees; it uses Bounding Volume Hierarchies (BVHs), which have been the most popular acceleration structure in ray tracing for years. For simple scenes the BVH is a tree, hence ray traversal = tree traversal. However, when instancing comes into play a BVH node can have multiple parents, so it turns into a DAG structure.

 

Also, GPUs have been outperforming (similarly priced) CPUs for years, so I wouldn't call it something recent (even before RTX, GPUs were already much faster).

 

Ray traversal also requires backtracking (most commonly using a traversal stack), so that's not an argument. The only real difference between ray tracing and some other graph traversal applications is the amount of computation that has to be done at each visited node (ray / bounding box intersections in the case of ray tracing).

And graph traversal itself isn't that branch heavy either. You basically have the same operation (visiting a node) repeated in a while loop. Sure, selecting the next child node contains some branches, but those are one-liners. For example, in the case of ray tracing: if the left child is closer, then push the right child to the stack first; otherwise push the left child first. Computing which child is closest (and whether it is hit at all) is computationally intensive and not very branch heavy.

A bigger issue with ray tracing is the lack of memory coherency, which reduces the practical memory bandwidth on the GPU (having to load a separate cache line for each thread, plus the ith thread not always accessing the i*4th byte in a cache line).
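
As a rough sketch of what that traversal loop looks like in device code (the Node layout, Ray struct and the intersection / shading helpers are hypothetical placeholders; only the shape of the loop matters here):

struct Ray  { float ox, oy, oz, dx, dy, dz; };
struct Node { int left, right; bool isLeaf; /* bounds, primitive range, ... */ };

// Placeholders for the computationally heavy, branch-light parts.
__device__ float hitDistance(const Node& n, const Ray& r) { return 0.0f; }  // ray / bounding box test
__device__ void  visitLeaf(const Node& n, const Ray& r)   { }               // intersect primitives in a leaf

// One while loop, a small stack for backtracking, and a one-liner branch
// that decides which child to visit first.
__device__ void traverse(const Node* nodes, int root, const Ray& ray)
{
    int stack[64];
    int top = 0;
    stack[top++] = root;

    while (top > 0) {                        // same operation repeated per visited node
        const Node& n = nodes[stack[--top]];
        if (n.isLeaf) {
            visitLeaf(n, ray);
            continue;
        }
        float dl = hitDistance(nodes[n.left],  ray);
        float dr = hitDistance(nodes[n.right], ray);
        if (dl < dr) { stack[top++] = n.right; stack[top++] = n.left;  }  // closer child is popped first
        else         { stack[top++] = n.left;  stack[top++] = n.right; }
    }
}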

 

Nvidia themselves also promote their GPUs as being much faster at graph analysis than CPUs:

https://devblogs.nvidia.com/gpus-graph-predictive-analytics/



Basically, GPUs help if you have large datasets and the operations you need to perform on them can be expressed in terms of linear algebra / matrix operations. For example, in machine learning and optimization problems you are often solving for the inverse of a matrix or matrices, and GPUs are very good at this. The same goes for signal processing, where many fast Fourier transforms and other transforms are being performed; this is very useful in real-time feedback control and computer vision. Astronomy data that needs heavy processing also generally benefits from GPUs.
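
For the FFT case in particular, here's a minimal sketch of how that usually looks with cuFFT: a whole batch of independent 1D transforms submitted to the GPU in one call (the wrapper function itself is just illustrative).

#include <cufft.h>

// Run `batch` independent 1D FFTs of length `n` on the GPU in one call.
// d_signals points to device memory holding n * batch complex samples.
void batched_fft(cufftComplex* d_signals, int n, int batch)
{
    cufftHandle plan;
    cufftPlan1d(&plan, n, CUFFT_C2C, batch);                  // plan `batch` transforms of length n
    cufftExecC2C(plan, d_signals, d_signals, CUFFT_FORWARD);  // in-place forward transforms
    cufftDestroy(plan);
}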

 

Other good examples are FDTD (Finite Difference Time Domain) and FEA (Finite Element Analysis) type codes, which also benefit from the parallelism that GPUs offer.

