Hi P

Programming - GPU intensive things?


Posted · Original Poster

What kind of programming tasks would benefit from a powerful GPU?

 

GPU parallelism comes to mind, anything else?

 

 


Programming games and physics engines.



Scientific workloads.

 

Btw, you can only use a GPU effectively if you can perform the same operation on many items at the same time and read/write them to RAM in a cache-friendly layout. If not, use a CPU.


@Mira Yurizaki is bang on. You want highly parallel operations that can be applied across a large data set, e.g. image analysis or rendering.

 

There's a reason GPUs are made the way they are. You want to push as many chunks of independent data as you can, simultaneously and as fast as you can. If your algorithms fit that description then it's probably a good fit.

 

 

 

 


Posted · Original Poster
1 hour ago, trag1c said:

@Mira Yurizaki E.g. image analysis

Could you please elaborate a bit on this? It caught my attention because it sounds relatively simple, but is it really?

 

How does it work?

32 minutes ago, Hi P said:

Could you please elaborate a bit on this? It caught my attention because it sounds relatively simple, but is it really?

 

How does it work?

Let's take JPEG compression for example. You break the image up into 8x8 blocks (independent data sets) and perform two matrix operations on each of them to get the final result. The two matrix operations are the same for every 8x8 block the image is composed of. So if we have a UHD image (3840x2160), breaking it up into 8x8 blocks gives us 129,600 blocks that all need the same two operations performed on them.
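To make that concrete, here's a hedged NumPy sketch of the per-block step (on a GPU you'd launch one thread group per block; the tiny 16x16 "image" and the block-splitting helper here are just for illustration):

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II basis matrix (the transform JPEG uses per block).
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0, :] *= 1 / np.sqrt(2)
    return C * np.sqrt(2 / n)

# Toy "image": a real 3840x2160 frame would give 129,600 blocks; 16x16 -> 4.
img = np.arange(16 * 16, dtype=np.float64).reshape(16, 16)

# Split into 8x8 blocks: shape (n_blocks, 8, 8).
blocks = img.reshape(2, 8, 2, 8).swapaxes(1, 2).reshape(-1, 8, 8)

# The same two matrix multiplications (C @ B @ C.T) applied to every block;
# this is exactly the identical per-block work a GPU runs in parallel.
C = dct_matrix()
coeffs = np.einsum('ij,bjk,lk->bil', C, blocks, C)

print(coeffs.shape)  # (4, 8, 8)
```

Every block gets the same two multiplications, no block depends on any other, which is why this maps so well onto thousands of GPU threads.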


Machine learning, especially models with deep neural network architectures, can gain a lot of performance from a good GPU. The reason is what @Mira Yurizaki just said: a GPU is designed to perform better than a CPU when it comes to many large matrix operations.

On 6/19/2019 at 4:29 PM, Hi P said:

What kind of programming tasks would benefit from a powerful GPU?

 

GPU parallelism comes to mind, anything else?

 

 

Graph theoretic algorithms. The number of simple computations grows quickly (some of these problems are NP-hard), so you need fast parallel processing of simple computations.

On 7/8/2019 at 7:57 PM, Eigencentrality said:

Graph theoretic algorithms. The number of simple computations grows quickly (some of these problems are NP-hard), so you need fast parallel processing of simple computations.

Are you sure? With graph theoretic algorithms you might need non-uniform control flow, which is a big no-no on GPUs.

 

Which is something others have brushed past, but it's hecka important. For efficiency's sake, you need every invocation* to have the same control flow, because when invocations diverge, every invocation ends up paying for every path.

 

* An invocation is like a thread, but per work-item (in a graphical setting, these may be vertices or pixels, in GPGPU, these can be anything)
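A hedged NumPy analogy for why divergence hurts: a diverging warp effectively evaluates both sides of a branch for every invocation and then masks one out, which is what `np.where` does here (the branch bodies are made up for illustration):

```python
import numpy as np

x = np.array([-2.0, 3.0, -0.5, 4.0])

# On a CPU you might branch per element:
#   y = sqrt(x) if x >= 0 else x * x
# A warp with divergent invocations behaves like the code below:
# BOTH paths run for every element, then a mask selects one result each.
then_path = np.sqrt(np.maximum(x, 0.0))  # "then" side, computed for all
else_path = x * x                        # "else" side, computed for all
y = np.where(x >= 0, then_path, else_path)

print(y)
```

So two divergent paths cost roughly the sum of both paths' work, not the maximum, which is why uniform control flow matters so much.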

5 minutes ago, Fourthdwarf said:

Are you sure? With graph theoretic algorithms you might need non-uniform control flow, which is a big no-no on GPUs.

 

Which is something others have brushed past, but it's hecka important. For efficiency's sake, you need every invocation* to have the same control flow, because when invocations diverge, every invocation ends up paying for every path.

 

* An invocation is like a thread, but per work-item (in a graphical setting, these may be vertices or pixels, in GPGPU, these can be anything)

GPUs can still be faster for graph algorithms. In a sense, ray tracing is also a graph traversal algorithm (directed acyclic graph) and GPUs do pretty well there (compared to similarly priced CPUs).


32 minutes ago, mathijs727 said:

GPUs can still be faster for graph algorithms. In a sense, ray tracing is also a graph traversal algorithm (directed acyclic graph) and GPUs do pretty well there (compared to similarly priced CPUs).

I'm unfamiliar with any implementation of ray tracing that works purely by operating on an acyclic graph. I can see the argument that you produce a DAG in 3D space, but that's not a graph problem. RTX does use octrees, but it has specific hardware to accelerate those octrees, and this only accelerates raytracing (by culling objects) AFAIK. And BSP is also used to similar effect. But in both cases, you have acyclic graphs with relatively few edges, as opposed to well-connected graphs with cycles, which may cause issues.

 

Also, it's only relatively recently that GPUs have outperformed CPUs in raytracing, partly because detecting intersections is difficult without breaking uniform control flow.

 

So yeah, some graph algorithms do work well on GPUs, but add cycles, backtracking, and other common graph algorithm issues or techniques, and you have something possibly much less suited to GPUs. GPUs might find a small advantage here, but nothing near what other kinds of problems can have.

On 7/12/2019 at 11:14 AM, Fourthdwarf said:

I'm unfamiliar with any implementation of ray tracing that works purely by operating on an acyclic graph. I can see the argument that you produce a DAG in 3D space, but that's not a graph problem. RTX does use octrees, but it has specific hardware to accelerate those octrees, and this only accelerates raytracing (by culling objects) AFAIK. And BSP is also used to similar effect. But in both cases, you have acyclic graphs with relatively few edges, as opposed to well-connected graphs with cycles, which may cause issues.

 

Also, it's only relatively recently that GPUs have outperformed CPUs in raytracing, partly because detecting intersections is difficult without breaking uniform control flow.

 

So yeah, some graph algorithms do work well on GPUs, but add cycles, backtracking, and other common graph algorithm issues or techniques, and you have something possibly much less suited to GPUs. GPUs might find a small advantage here, but nothing near what other kinds of problems can have.

RTX does not use octrees; it uses Bounding Volume Hierarchies (BVHs), which have been the most popular acceleration structure in ray tracing for years. For simple scenes the BVH is a tree, hence ray traversal = tree traversal. However, when instancing comes into play a BVH node can have multiple parents, so it turns into a DAG structure.

 

Also, GPUs have been outperforming (similarly priced) CPUs for years so I wouldn't call it something recent (before RTX GPUs were already much faster).

 

Ray traversal also requires backtracking (most commonly using a traversal stack), so that's not an argument. The only real difference between ray tracing and some other graph traversal applications is the amount of computation that has to be done at each visited node (ray/bounding-box intersections in the case of ray tracing).

And graph traversal itself isn't that branch heavy either: you basically have the same operation (visiting a node) repeated in a while loop. Sure, selecting the next child node contains some branches, but those are one-liners. For example, in the case of ray tracing: if the left child is closer, then push the right child to the stack first; otherwise push the left child first. Computing which child is closest (and whether it is hit at all) is computationally intensive but not very branch heavy.

A bigger issue with ray tracing is the lack of memory coherency, which reduces the practical memory bandwidth on the GPU (having to load a cache line for each thread, plus the ith thread not always accessing the i*4th byte of a cache line).
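The "while loop + traversal stack" shape described above can be sketched in a few lines of plain Python. This is only a toy: the tree and the "which child is closer" table are made-up stand-ins for a real BVH and real ray/box intersection tests, but the control-flow skeleton is the same one a GPU traversal kernel runs per ray:

```python
# Minimal sketch of stack-based traversal of a binary hierarchy: pop a
# node, "visit" it, push children with the far child first so the near
# child is processed next. Toy data, not a real BVH.

tree = {
    'root': ('left', 'right'),
    'left': ('ll', 'lr'),
    'right': None, 'll': None, 'lr': None,   # leaves
}
near_is_first_child = {'root': True, 'left': False}  # stand-in for distance test

def traverse(root):
    order = []
    stack = [root]
    while stack:                  # same operation repeated: visit a node
        node = stack.pop()
        order.append(node)
        children = tree[node]
        if children:
            a, b = children
            # one-liner branch: push the far child first, near child on top
            stack += [b, a] if near_is_first_child.get(node, True) else [a, b]
    return order

print(traverse('root'))
```

Note the only branches are the loop condition, the leaf test, and the one-line push-order choice, which supports the point that the per-node intersection math, not the control flow, dominates the work.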

 

Nvidia themselves also promote their GPUs as being much faster at graph analysis than CPUs:

https://devblogs.nvidia.com/gpus-graph-predictive-analytics/



Basically, if you have large datasets and the operations you need to perform on them can be expressed in terms of linear algebra / matrix operations, a GPU will help. For example, in machine learning and optimization problems you are often solving for the inverse of a matrix (or matrices), and GPUs are very good at this. The same goes for signal processing, where many, many fast Fourier transforms and other transforms are performed; this is very useful in real-time feedback control and computer vision. Astronomy data that needs heavy processing also generally benefits from GPUs.
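The signal-processing case is easy to picture as a batch of independent transforms. A hedged NumPy sketch (on a GPU a library like cuFFT runs the same batched shape; the sizes here are arbitrary):

```python
import numpy as np

# 1,000 independent signals of 1,024 samples each: the same FFT applied
# to every row, which is exactly the batched, data-parallel workload
# GPUs are built for.
rng = np.random.default_rng(0)
signals = rng.standard_normal((1000, 1024))

spectra = np.fft.rfft(signals, axis=1)   # one FFT per row, all independent

print(spectra.shape)  # (1000, 513)
```

Each row's transform touches only that row, so all 1,000 FFTs can run concurrently with no coordination between them.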

 

Another good example: FDTD (Finite Difference Time Domain) and FEA (Finite Element Analysis) type codes also benefit from the parallelism that GPUs offer.
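FDTD is a nice fit because every grid point applies the same neighbour stencil each time step. A hedged 1D sketch in NumPy (normalized units, made-up grid size and source, fields pinned to zero at the ends):

```python
import numpy as np

# Toy 1D FDTD (Yee-style) update: E and H live on staggered grids and
# each step applies the same finite-difference stencil at every point,
# so every update is embarrassingly parallel across the grid.
n = 200
E = np.zeros(n)
H = np.zeros(n - 1)

for step in range(100):
    H += np.diff(E) * 0.5                # H update: same stencil everywhere
    E[1:-1] += np.diff(H) * 0.5          # E update (ends held at 0)
    E[n // 2] += np.exp(-((step - 30) / 10.0) ** 2)  # soft Gaussian source

print(E.shape)
```

In 3D the grids get large (millions of cells), but the per-cell update stays identical, which is exactly the shape of work a GPU eats up.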

