Compute Units in GPU

Cysiu

Hello, I was wondering: what are the Compute Units in a GPU?

Example: the RTX 3080 has 68 Compute Units.

Do they have any special meaning for gaming, or do they play a part in graphics card efficiency?

(I'd also appreciate some theory.)

A Compute Unit (Nvidia calls it a Streaming Multiprocessor, or SM) is just a group of CUDA cores. In the 3080's case, each SM contains 128 CUDA cores, so 68 × 128 = 8704.

The 3090 works the same way: it has 82 SMs enabled, which means 10496 CUDA cores.

 

EDIT: The full GA102 die has 84 SMs, so both the 3090 and the 3080 are cut-down versions of the same GPU.
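
To make the arithmetic above concrete, here is a minimal Python sketch. The SM counts and the 128 cores per SM are the figures quoted in this post; the names `CORES_PER_SM`, `ENABLED_SMS` and `total_cuda_cores` are made up for illustration only.

```python
# CUDA cores per SM for the GA102 cards discussed in this post.
CORES_PER_SM = 128

# Enabled SM counts as quoted in this thread.
ENABLED_SMS = {
    "RTX 3080": 68,
    "RTX 3090": 82,
    "Full GA102": 84,
}


def total_cuda_cores(sm_count: int, cores_per_sm: int = CORES_PER_SM) -> int:
    """Total CUDA cores = enabled SMs x cores per SM."""
    return sm_count * cores_per_sm


for name, sms in ENABLED_SMS.items():
    print(f"{name}: {sms} SMs x {CORES_PER_SM} = {total_cuda_cores(sms)} CUDA cores")
```

Running it gives 8704 for the 3080, 10496 for the 3090 and 10752 for the fully enabled die.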


It's basically what the name says: a cluster of CUDA cores.

The RTX 3080 has 8704 CUDA cores and 68 compute units,

so you divide the total core count by the number of compute units.

 

That gives 128, so one compute unit has 128 cores. IIRC, a Vega iGPU has 64 cores per compute unit.

So yeah, it's basically another way of stating the specification.
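
The division described above can be sketched in a few lines of Python. `cores_per_unit` is a hypothetical helper name, and the check for a whole-number result is my own assumption that every compute unit on a card holds the same number of cores.

```python
def cores_per_unit(total_cores: int, compute_units: int) -> int:
    """Divide the total core count by the number of compute units."""
    per_unit, remainder = divmod(total_cores, compute_units)
    if remainder:
        # Assumption: all units on one card are the same size.
        raise ValueError("core count is not a whole multiple of the unit count")
    return per_unit


# RTX 3080 figures from this thread: 8704 cores across 68 compute units.
print(cores_per_unit(8704, 68))  # 128
```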


More compute units means more parallel computing, and therefore higher performance. It also brings more power draw, more memory and PCIe bandwidth usage, etc. Efficiency depends on architecture, process node, frequency, and voltage; the number of compute units is not a factor.


53 minutes ago, Cysiu said:

Do they have any special meaning for gaming, or do they play a part in graphics card efficiency?

It's just a way of logically grouping things together on a GPU. Your GPU has a certain number of CUDA cores or Stream Processors (SPs). Every n of these SPs form a logical unit: a Compute Unit (CU) or Streaming Multiprocessor (SM). Each unit may then be combined with, or share access to, other components such as Texture Mapping Units (TMUs) or Render Output Units (ROPs).

 

For us users, these technical specs are somewhat meaningless. A card with more CUs is almost certainly going to be faster than a card of the same generation with fewer CUs. But you can't compare them across generations and/or vendors. The number of SPs/ROPs/TMUs per CU may change, their performance and efficiency may change, and so on.

 

At the end of the day you'll want to look at benchmarks to see how the performance of these cards differs, because double the core count doesn't always translate to double the performance.
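
As a rough illustration of why the spec sheet alone isn't enough, this small Python sketch computes the on-paper core-count ratio between the RTX 3090 and RTX 3080, using the core counts quoted earlier in the thread. `core_ratio` is a hypothetical helper, and the result only describes the difference in core count. It ignores clocks, memory bandwidth and everything else, so it is not a performance prediction.

```python
def core_ratio(cores_a: int, cores_b: int) -> float:
    """How many times more cores card A has than card B (on paper only)."""
    return cores_a / cores_b


# Core counts quoted earlier in this thread.
RTX_3090_CORES = 10496
RTX_3080_CORES = 8704

ratio = core_ratio(RTX_3090_CORES, RTX_3080_CORES)
print(f"The 3090 has {(ratio - 1) * 100:.1f}% more CUDA cores than the 3080")
```

That works out to roughly 20.6% more cores. How much of that shows up as extra frames is something only benchmarks can tell you.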
