# How do you work out how many calculations per second a GPU can do?

Posted · Original PosterOP

I'm just curious whether there is an easy way to calculate how many calculations per second a GPU can do. Even better, if it's possible, would be how to calculate how many of those calculations per second can actually be utilized.

| Cooler Master HAF 912 Plus | MSI P67A-G45 | i5 2500K @ 4.2GHz |  Coolermaster Hyper 212 Plus | EVGA GTX 670 FTW Edition 2GB | 8GB (2X4GB) Mushkin Blackline @ 1600MHz | 256GB OCZ Vertex 4 SSD | 1TB Western Digital Caviar Green | Corsair 600CX V2 | Windows 7 64-bit |

##### Share on other sites

Just a guess but I'd say it's the same with CPUs in terms of how many FLOPs it can perform.

In the case of a GTX680 (http://en.wikipedia.org/wiki/GeForce_600_Series#GeForce_600_.286xx.29_series) it can perform 3090.4 GFLOPS
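For what it's worth, that 3090.4 figure looks like it comes from a simple formula: shader cores × core clock × FLOPs per core per cycle. Here is a rough sketch of that arithmetic, assuming the GTX 680's 1536 cores, its 1006 MHz base clock, and 2 FLOPs per core per cycle (one fused multiply-add). The function name `peak_gflops` is made up for illustration, and the result is a theoretical peak, not what real code will reach.

```python
def peak_gflops(cores, clock_mhz, flops_per_cycle=2):
    """Theoretical peak single-precision throughput in GFLOPS.

    flops_per_cycle=2 assumes each core retires one fused
    multiply-add (counted as 2 FLOPs) per clock cycle.
    """
    # cores * MHz gives millions of core-cycles per second;
    # dividing by 1000 converts MFLOPS to GFLOPS.
    return cores * clock_mhz * flops_per_cycle / 1000.0


# Assumed GTX 680 specs: 1536 shader cores, 1006 MHz base clock.
gtx680 = peak_gflops(cores=1536, clock_mhz=1006)
print(f"GTX 680 theoretical peak: {gtx680:.1f} GFLOPS")
```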

“The value of a college education is not the learning of many facts but the training of the mind to think”

##### Share on other sites

1MHz = 1,000,000 calculations, so 1260MHz = 1,260,000,000 calculations

Available from 3pm to Midnight Eastern Time (GMT-5). (>'-')> <('-'<) ^(' - ')^ <('-'<) (>'-')> You can't stop the kirby dance.

4770k | Gigabyte GTX 970 Mini | Lian Li PC-TU100B | MSI Z87I 2x8GB G.Skill Sniper | Noctua NH-L9i Silverstone Strider 450W SFX | Windows 10 | 2x 250GB 840 Evo Rad 0 1x 1TB WD 2.5" | 25% gaming, 25% CAD and rendering, and 50% web browsing.
##### Share on other sites

> 1MHz = 1,000,000 calculations, so 1260MHz = 1,260,000,000 calculations

Depends actually: http://en.wikipedia.org/wiki/FLOPS#Computing

##### Share on other sites
Posted · Original PosterOP

> 1MHz = 1,000,000 calculations, so 1260MHz = 1,260,000,000 calculations

Problem with that is there's a core clock and a memory clock, and from what I've read the average can do 4 billion calculations a second or 4GHz, but no GPU's core clock is clocked to that high a frequency.

##### Share on other sites
Posted · Original PosterOP

> Just a guess but I'd say it's the same with CPUs in terms of how many FLOPs it can perform.
>
> In the case of a GTX680 (http://en.wikipedia.org/wiki/GeForce_600_Series#GeForce_600_.286xx.29_series) it can perform 3090.4 GFLOPS

My question would be: is it measurable in anything simpler than FLOPS? I was planning to use it in an algorithm for a program, and in order to do that I need to be able to work out how long it will take to do all the calculations to find x.

##### Share on other sites

> Problem with that is there's a core clock and a memory clock, and from what I've read the average can do 4 billion calculations a second or 4GHz, but no GPU's core clock is clocked to that high a frequency.

It was just a general assumption.

##### Share on other sites

> 1MHz = 1,000,000 calculations, so 1260MHz = 1,260,000,000 calculations

Hz is not a unit for calculations, it's a generic unit for frequency. It describes the operating frequency of the GPU cores, but there are thousands of cores, and together they do a very large number of calculations on each cycle.

##### Share on other sites

> My question would be: is it measurable in anything simpler than FLOPS? I was planning to use it in an algorithm for a program, and in order to do that I need to be able to work out how long it will take to do all the calculations to find x.

Not really xD That is the number of floating point operations it performs per second...

##### Share on other sites

> My question would be: is it measurable in anything simpler than FLOPS? I was planning to use it in an algorithm for a program, and in order to do that I need to be able to work out how long it will take to do all the calculations to find x.

Well, if the instructions in that calculation all take as long as a FLOP, then you can calculate the time with the FLOPs from the specs.

On the other hand, you could just write the code for the calculation and do something like this:

```
// pseudocode
double start = getMilliSeconds();                 // current time in milliseconds
doCalculation();                                  // solve for x
double stop = getMilliSeconds();
double timeToPerformCalculation = stop - start;   // in milliseconds
```

to find the time. If you are using MATLAB, you can put 'tic' before doCalculation() and 'toc' after it; this will print the elapsed time to the console.

Keep in mind that this time is not set in stone unless you have hardware-level access to the cores (which you absolutely don't). The time it takes to perform your calculation also depends on how the OS your application is running on schedules it.
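If you want something you can actually run, here is a minimal Python sketch of the same idea. `do_calculation` is only a hypothetical stand-in for the real work, and `time_ms` is a helper name invented for this example. Because of the scheduling jitter mentioned above, it repeats the measurement a few times and keeps the fastest run.

```python
import time


def do_calculation():
    """Hypothetical stand-in for the real work (solve for x)."""
    return sum(i * i for i in range(1_000_000))


def time_ms(func, repeats=5):
    """Run func several times and return the fastest run in milliseconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()   # high-resolution monotonic clock
        func()
        stop = time.perf_counter()
        best = min(best, (stop - start) * 1000.0)
    return best


elapsed_ms = time_ms(do_calculation)
print(f"fastest of 5 runs: {elapsed_ms:.2f} ms")
```

Taking the minimum rather than the average is a deliberate choice here: the fastest run is the one least disturbed by whatever else the OS was doing at the time.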

##### Share on other sites
Posted · Original PosterOP

> Not really xD That is the number of floating point operations it performs per second...

The thing I'm finding difficult to come to terms with is: what is a floating point operation?

##### Share on other sites

> The thing I'm finding difficult to come to terms with is: what is a floating point operation?

A FLOP is one single operation on a floating point number. Floating point numbers are a way of representing decimal numbers in binary.
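To make "representing decimal numbers in binary" a bit more concrete, here is a small sketch that pulls apart the bits of a 32-bit float into its sign, exponent and mantissa fields (the standard IEEE 754 single-precision layout). The helper name `float32_fields` is made up for this example.

```python
import struct


def float32_fields(value):
    """Split a number, stored as a 32-bit float, into (sign, exponent, mantissa)."""
    # Pack as a big-endian float32, then reinterpret the same 4 bytes as an integer.
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, stored with a bias of 127
    mantissa = bits & 0x7FFFFF        # 23 bits of fraction
    return sign, exponent, mantissa


sign, exponent, mantissa = float32_fields(0.15625)
print(sign, exponent - 127, bin(mantissa))

# Not every decimal fraction has an exact binary representation:
print(0.1 + 0.2 == 0.3)
```

Adding or multiplying two numbers stored like this is one floating point operation, and FLOPS counts how many of those the hardware can do per second.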

##### Share on other sites
Posted · Original PosterOP

> Well, if the instructions in that calculation all take as long as a FLOP, then you can calculate the time with the FLOPs from the specs.
>
> On the other hand, you could just write the code for the calculation and do something like this:
>
> ```
> // pseudocode
> double start = getMilliSeconds();                 // current time in milliseconds
> doCalculation();                                  // solve for x
> double stop = getMilliSeconds();
> double timeToPerformCalculation = stop - start;   // in milliseconds
> ```
>
> to find the time. If you are using Matlab, you can use 'tic' before doCalculation() and 'toc' after, this will print out the time needed to the console.
>
> Keep in mind that this time is not set in stone, unless you have hardware level access to the cores (which you absolutely don't). The time it takes to perform your calculation is based on the scheduling algorithm used by the OS your application is running on.

Problem is I don't plan to do the calculations. To be specific, I want to try and create a hybrid of a system such as howsecureismypassword.net. I don't plan to run the calculation myself to work out how long it will take, as that can be pretty long...
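For an estimator along those lines, the usual back-of-the-envelope approach is to divide the number of possible passwords by an assumed guess rate, rather than running anything. A rough sketch follows; the function name and the 1e9 guesses per second are pure assumptions for illustration. A real guess rate depends on the hash being attacked and is not the same thing as a GPU's FLOPS figure.

```python
def seconds_to_exhaust(charset_size, length, guesses_per_second):
    """Worst-case time to try every password of the given length."""
    keyspace = charset_size ** length   # number of possible passwords
    return keyspace / guesses_per_second


# Assumption: 8 lowercase letters, attacker manages 1e9 guesses per second.
secs = seconds_to_exhaust(charset_size=26, length=8, guesses_per_second=1e9)
print(f"worst case: {secs:.1f} s, average case: {secs / 2:.1f} s")
```

On average an attacker finds the password after searching half the keyspace, which is why the sketch prints both numbers.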

##### Share on other sites
Posted · Original PosterOP

> A FLOP is one single operation on a floating point number. Floating point numbers are a way of representing decimal numbers in binary.

Ahh I can understand that! Thanks for the help!

##### Share on other sites

> Problem is I don't plan to do the calculations. To be specific, I want to try and create a hybrid of a system such as howsecureismypassword.net. I don't plan to run the calculation myself to work out how long it will take, as that can be pretty long...

Why is it so important that you know how long the calculation takes to execute?

##### Share on other sites

> Problem with that is there's a core clock and a memory clock, and from what I've read the average can do 4 billion calculations a second or 4GHz, but no GPU's core clock is clocked to that high a frequency.

I think you mean 4 million transistors?

I am pretty sure this is WRONG, but try this: multiply cores by speed (speed as in 1 MHz = 1,000,000 calculations, times cores).

Yeah....... This is soo wrong :P

Awesome profile picture huh? Please double check anything I say to make sure it is correct, because I am not always correct.

Intel i5 4670k @4.1ghz - EVGA GTX 760  - Corsair 8 GB Vengeance @1866 - Gigabyte Z87 UD3H - Corsair H100i - Samsung 120 GB EVO + Seagate 2 TB 7200rpm - Corsair TX850M - Fractal Arc Midi R2 - Logitech K120 Keyboard (\$10) - ASUS VS239H-P 23-Inch Full HD LED IPS Monitor + BenQ T91W TN 900p 19" - HP PhotoSmart C4680 - Genius SP-S110 \$12) - Rosewill N900PCE - Blue Snowball - Perixx MX-2000II- Audio Technica ATH-M30

##### Share on other sites

How about working it backwards? In order for the GPU to kick out 60 Hz (or fps) at 1080p, it must do the following:

1920 x 1080= 2073600

Now multiply that by 3 since that is the number of sub pixels per pixel

=6220800

now multiply by 60 for 60fps

=373248000

So that is the number of bits the gpu must kick out if it is running 60 fps 1080p

Again this is probably wrong :P

##### Share on other sites

> I think you mean 4 million transistors?
>
> I am pretty sure this is WRONG, but try this: multiply cores by speed (speed as in 1 MHz = 1,000,000 calculations, times cores).
>
> Yeah....... This is soo wrong :P

There's something like instructions per cycle, otherwise an i3 wouldn't destroy an FX-9590 in gaming.

So... If Jesus had the gold, would he buy himself out instead of waiting 3 days for the respawn?

][ GPU: GeForce 9600GT 512mb DDR3 ][ Motherboard: Gigabyte GA-MA770T-UD3P ][ RAM: 2x4GB Kingston 1333MHz CL9 DDR3 ][ HDD: Western Digital Green 2TB ][ PSU: Chieftec 500AB A ][ Case: No-name without airflow or dust filters Budget saved for an upgrade so far: 2400PLN (600€) - Initial 2800PLN (700€) Upgraded already: CPU

##### Share on other sites

> There's something like instructions per cycle, otherwise an i3 wouldn't destroy an FX-9590 in gaming.

Something like that?

##### Share on other sites

> How about working it backwards? In order for the GPU to kick out 60 Hz (or fps) at 1080p, it must do the following:
>
> 1920 x 1080 = 2073600
>
> Now multiply that by 3 since that is the number of sub pixels per pixel
>
> = 6220800
>
> now multiply by 60 for 60fps
>
> = 373248000
>
> So that is the number of bits the gpu must kick out if it is running 60 fps 1080p
>
> Again this is probably wrong :P

:D

##### Share on other sites

:D

Okay, A simple "Wrong" would have done just fine

##### Share on other sites

> Okay, A simple "Wrong" would have done just fine

Didn't mean to offend, I just like this movie and that exact quote. :D

##### Share on other sites

All right, seeing that there are a couple of people quite confused about what FLOPS actually are and how you can calculate them, please allow me to shed some light.

Consider that you have a single-core CPU running at 2GHz (2,000,000,000Hz). 2GHz means that the clock on this CPU will do a 0-to-1 transition (tick) two billion (giga) times per second (Hz).

Now, it might be easy to think that 2GHz = 2GFLOPS. The truth, however, is a bit more complicated. A CPU can't do an instruction (in this case a calculation) in just one tick; there are multiple steps involved to make it happen. For example: you need to fetch the instruction, decode it, fetch the operands, perform the operation, store the result, ... Now, let's assume our 2GHz single-core CPU needs 5 ticks to perform one instruction cycle. This means that our actual FLOPS will be: (2Gticks/s) / (5ticks/cycle) = 400Mcycles/s. Considering an instruction taking one cycle to execute, this gives us 400MFLOPS.

Now, consider that we have a dual-core CPU of the same architecture and clock speed. Each core would be able to perform 400MFLOPS, as calculated above. The total performance results in 2 * ((2Gticks/s) / (5ticks/cycle)) = 800MFLOPS. This is the same formula as in the Wikipedia article mentioned by @T.Vengeance in his post above.

If we apply the same logic to a GPU running at 1GHz, with 1000 cores and 5 ticks/cycle, this gives us 1000cores * 1GHz / (5ticks/cycle) = 200GFLOPS. This is why a GPU is so much more powerful than a CPU when it comes to pure number crunching: each core runs slower and is way less complex than a CPU core, but you have 1000 cores that are all working together.
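The three examples above all use the same formula, so here they are as a quick Python sketch. The function name `model_flops` is made up for illustration, and the 5 ticks per instruction cycle is only the simplifying assumption from this post, not a property of any real chip.

```python
def model_flops(cores, clock_hz, ticks_per_cycle):
    """Simplified model: FLOPS = cores * clock / ticks per instruction cycle."""
    return cores * clock_hz / ticks_per_cycle


single_core = model_flops(cores=1, clock_hz=2e9, ticks_per_cycle=5)
dual_core = model_flops(cores=2, clock_hz=2e9, ticks_per_cycle=5)
gpu = model_flops(cores=1000, clock_hz=1e9, ticks_per_cycle=5)

print(f"single-core CPU: {single_core / 1e6:.0f} MFLOPS")
print(f"dual-core CPU:   {dual_core / 1e6:.0f} MFLOPS")
print(f"1000-core GPU:   {gpu / 1e9:.0f} GFLOPS")
```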

##### Share on other sites

:D

I love that fragment :D

##### Share on other sites

> How about working it backwards? In order for the GPU to kick out 60 Hz (or fps) at 1080p, it must do the following:
>
> 1920 x 1080 = 2073600
>
> Now multiply that by 3 since that is the number of sub pixels per pixel
>
> = 6220800
>
> now multiply by 60 for 60fps
>
> = 373248000
>
> So that is the number of bits the gpu must kick out if it is running 60 fps 1080p
>
> Again this is probably wrong :P

Actually, there's quite a bit of sense to that. The number you calculated is in bytes rather than bits, though: every pixel is set with 3 bytes (one for every sub-pixel), so the number of bits is actually 8 times higher.

So the actual data rate is 1920*1080*60*3*8 = 2,985,984,000 bits/s, or almost 3Gb/s, or about 373MB/s.
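Same arithmetic as a few lines of Python, in case anyone wants to plug in another resolution or refresh rate. It assumes 3 bytes per pixel (one byte per sub-pixel) and ignores any blanking or protocol overhead on the actual video link.

```python
width, height = 1920, 1080   # 1080p
fps = 60
bytes_per_pixel = 3          # one byte per sub-pixel (R, G, B)

bytes_per_second = width * height * bytes_per_pixel * fps
bits_per_second = bytes_per_second * 8

print(f"{bytes_per_second / 1e6:.0f} MB/s")   # megabytes per second
print(f"{bits_per_second / 1e9:.2f} Gb/s")    # gigabits per second
```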
