Beginner to GPU compute programming

porina

I have an idea for a GPU compute task. The difficulty is that I'm totally out of date with modern programming options, and with how a GPU actually looks from a programming perspective. The only language I've used in anger is vanilla C. No ++. No #. I have looked at game engine programming and it feels like I'm missing a lot from modern languages, which, if anything, should make things a lot easier once I get used to them.

 

So, what are the easiest options to get started? Any pointers welcome, as long as they're not like pointers in C.

  • Windows environment
  • Only needs to run on Nvidia GPUs/CUDA.
  • I don't mind learning another language beyond C. It looks like CUDA can work with C, C++, Python and more, but I think those will be the main possibilities for me. Learning Python sounds more attractive to me than remembering how to use C. 
  • Application will be console/command line only. I don't need/want to worry about a GUI.



The limited experience I have with GPU programming is through OpenCV, which is a computer vision / linear algebra library. You can construct cv::cuda::GpuMat objects to create matrices in VRAM, and operations on these objects run on the GPU.
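
For a flavour of what that looks like in practice, here is a minimal sketch through OpenCV's Python bindings. This is an assumption-heavy example: it needs an OpenCV build compiled with CUDA support (the standard pip wheels don't include it), and the image size and operations are just placeholders.

```python
import cv2
import numpy as np

# A test image in system memory
host_img = np.random.randint(0, 256, (1080, 1920, 3), dtype=np.uint8)

# Upload to VRAM; operations on the GpuMat then run on the GPU
gpu_img = cv2.cuda_GpuMat()
gpu_img.upload(host_img)

gpu_gray = cv2.cuda.cvtColor(gpu_img, cv2.COLOR_BGR2GRAY)  # colour conversion on the GPU
gpu_small = cv2.cuda.resize(gpu_gray, (960, 540))          # resize on the GPU

# Copy the result back to system memory as a NumPy array
result = gpu_small.download()
print(result.shape)  # (540, 960)
```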



CUDA is made for C or C++. There was another contender, AleaGPU, but it died a while ago; it was just the CUDA libs wrapped for C# so they could be used in normal day-to-day business apps, websites, etc. It was hella fast, especially for large data and looping functions. The team unfortunately stopped updating the core, so it lost support for newer cards.

 

I never tried the C version, but I have used the C++ and C# versions successfully quite a lot.


3 hours ago, Franck said:

CUDA is made for C or C++. There was another contender, AleaGPU, but it died a while ago; it was just the CUDA libs wrapped for C# so they could be used in normal day-to-day business apps, websites, etc. It was hella fast, especially for large data and looping functions. The team unfortunately stopped updating the core, so it lost support for newer cards.

 

I never tried the C version, but I have used the C++ and C# versions successfully quite a lot.

Nvidia has their own Python wrapper and endorses libraries such as Numba:

https://developer.nvidia.com/cuda-python

 

So using C, C++, Fortran or Python is pretty much a personal decision; CUDA itself should work with any of those without problems.
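
To make the Python route concrete, here is a minimal Numba sketch: a hand-rolled vector add running as a CUDA kernel. It assumes the numba and numpy packages and a working CUDA driver; the kernel and variable names are made up for illustration.

```python
import numpy as np
from numba import cuda

@cuda.jit
def add_arrays(a, b, out):
    i = cuda.grid(1)      # absolute thread index
    if i < out.size:      # guard against threads past the end of the array
        out[i] = a[i] + b[i]

n = 1_000_000
a = np.arange(n, dtype=np.float32)
b = 2 * a
out = np.zeros_like(a)

threads_per_block = 256
blocks = (n + threads_per_block - 1) // threads_per_block

# Host arrays are copied to the GPU for the launch and copied back afterwards
add_arrays[blocks, threads_per_block](a, b, out)

print(out[:5])  # [ 0.  3.  6.  9. 12.]
```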



  • 4 weeks later...
On 7/4/2023 at 6:06 PM, igormp said:

Nvidia has their own Python wrapper and endorses libraries such as Numba:

https://developer.nvidia.com/cuda-python

 

So using C, C++, Fortran or Python is pretty much a personal decision; CUDA itself should work with any of those without problems.

Python for low-level hardware programming? Isn't it a bit too abstract for proper learning and development in the nitty-gritty parts of the subject?



4 hours ago, Timme said:

Python for low-level hardware programming?

It's just glue; you code the CUDA kernels separately, and the same goes for other aspects of GPU computing.
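
As a concrete example of that split (a sketch assuming the CuPy package, matching your installed CUDA version): the kernel below is ordinary CUDA C, and Python only handles allocation, the launch configuration, and reading back the result.

```python
import cupy as cp

# Hand-written CUDA C kernel, compiled at runtime by CuPy
kernel_src = r'''
extern "C" __global__
void scale(const float* x, float* y, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        y[i] = x[i] * factor;
    }
}
'''
scale = cp.RawKernel(kernel_src, 'scale')

n = 1 << 20
x = cp.arange(n, dtype=cp.float32)   # allocated in VRAM
y = cp.empty_like(x)

threads = 256
blocks = (n + threads - 1) // threads
scale((blocks,), (threads,), (x, y, cp.float32(2.0), cp.int32(n)))

print(y[:4])  # [0. 2. 4. 6.]
```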


4 hours ago, Timme said:

Python for low-level hardware programming? Isn't it a bit too abstract for proper learning and development in the nitty-gritty parts of the subject?

Well, OP wants to accomplish a task, not necessarily learn how to do proper GPU programming or low-level hardware programming (at which point you could argue that only asm/shaders apply).

25 minutes ago, riklaunim said:

It's just glue; you code the CUDA kernels separately, and the same goes for other aspects of GPU computing.

You don't need to hand-write those CUDA kernels, though.
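
For instance (a small sketch, again assuming CuPy; Numba's higher-level decorators work similarly), ordinary NumPy-style array expressions already run on the GPU without writing any kernel by hand:

```python
import cupy as cp

x = cp.linspace(0, 1, 10_000_000, dtype=cp.float32)  # array lives in VRAM
y = cp.sin(x) ** 2 + cp.cos(x) ** 2                  # each operation runs as a GPU kernel generated by CuPy

print(float(y.mean()))  # ~1.0
```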



3 minutes ago, igormp said:

Well, OP wants to accomplish a task, not necessarily learn how to do proper GPU programming or low-level hardware programming (at which point you could argue that only asm/shaders apply).

You don't need to hand-write those CUDA kernels, though.

Well, then I guess to learn game dev I can just learn how to use the Unity suite? Who cares about optimization and actually understanding what you're doing, right? Why not just use ChatGPT as a side tool? Because why not, it'll work. Somehow.

I wasn't talking about drivers, but rather about all that stuff that happens under the hood. And C is quite enough.   



Looks like I forgot to reply to this thread. I'm still reading up and getting used to the concepts involved, and there are more of them than I initially thought. The understanding required to interface with the GPU doesn't seem any different whether I use some version of C or Python, but my gut feeling is that Python is more friendly as a modern language. Not having to waste so much time on low-level stuff is a big plus.



I would consider having a look at the Julia programming language. It has quite decent support for NVIDIA, Intel and AMD GPUs and is used a lot for scientific computing. The tooling in the language exposes a bunch of different programming paradigms and is generally nice to work with. Managing software dependencies is easier than in other languages. One disadvantage is that in Julia, when a function is called for the first time, a lot of code gets compiled. Moving compilation so close to run time gives you a lot of flexibility, but it is painful if you only want to do something once and then close the program.


11 minutes ago, Johann-Tobias xn--schg-noa said:

I would consider having a look at the Julia programming language.

I haven't even heard of that one. From a quick skim it sounds like it has interesting features. A reservation is that I may be less likely to find help than for more well-known languages.



3 minutes ago, porina said:

A reservation is that I may be less likely to find help than for more well-known languages.

That is a good observation and a valid concern. One thing I would hold against that is that, since you compile at run time and all the different GPU targets are just LLVM back-ends, you actually get better insight into what is going on. You can dump the assembly of a GPU function you just ran, and which ended up being slower than expected, right in the REPL and compare. Julia is rough around the edges, but all the cool stuff in Julia is built in Julia itself, and since Julia doesn't ship as a binary you get deep insight into how things happen and work, should you want to or need to.


On 7/29/2023 at 5:06 PM, porina said:

Python is more friendly as a modern language.

Python is not a modern language. I stopped using it about 25 years ago.


1 hour ago, Franck said:

Python is not a modern language. I stopped using it about 25 years ago.

I'm coming from C. It's about 20 years younger.



  • 3 weeks later...
On 7/31/2023 at 5:45 AM, Franck said:

Python is not a modern language. I stopped using it about 25 years ago.

Python 2 is not a modern language; Python 3+ definitely is.


  • 2 weeks later...

Obsidian: GPU Programming in Haskell

https://svenssonjoel.github.io/writing/dccpaper_obsidian.pdf

 

GPU programming in Haskell

https://bobkonf.de/2015/slides/thielemann.pdf

 

Parallel and Concurrent Programming in Haskell

https://www.oreilly.com/library/view/parallel-and-concurrent/9781449335939/

 

If you were to ask yourself why Haskell and not Python:

[Benchmark chart: tornado = Python, warp = Haskell]

 

Another fact is that good Haskell programmers are on average 2.5 times more productive than the best Python programmers.

 

Some other great Python alternatives:

https://juliagpu.org/

https://blogs.oracle.com/javamagazine/post/programming-the-gpu-in-java

https://github.com/takagi/cl-cuda


