
How do you work out how many calculations per second a GPU can do?

LibeRatioNZzXx
Solved by MG2R


I'm just curious whether there is an easy way to calculate how many calculations per second a GPU can do. If possible, even better would be a way to work out how many of those calculations per second can actually be utilized.

 

| Cooler Master HAF 912 Plus | MSI P67A-G45 | i5 2500K @ 4.2GHz |  Coolermaster Hyper 212 Plus | EVGA GTX 670 FTW Edition 2GB | 8GB (2X4GB) Mushkin Blackline @ 1600MHz | 256GB OCZ Vertex 4 SSD | 1TB Western Digital Caviar Green | Corsair 600CX V2 | Windows 7 64-bit |


Just a guess, but I'd say it's the same as with CPUs: it's measured by how many FLOPS it can perform.

 

In the case of a GTX 680 (http://en.wikipedia.org/wiki/GeForce_600_Series#GeForce_600_.286xx.29_series), it can perform 3090.4 GFLOPS.
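For what it's worth, that figure presumably comes from cores × clock × FLOPs per clock: the GTX 680 has 1536 CUDA cores at a 1006 MHz base clock, and counting 2 floating-point operations per core per clock (a multiply-add), 1536 × 1,006,000,000 × 2 ≈ 3090.4 GFLOPS.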

“The value of a college education is not the learning of many facts but the training of the mind to think”

 


1MHz = 1,000,000 calculations, so 1260MHz = 1,260,000,000 calculations

Available from 3pm to Midnight Eastern Time (GMT-5). (>'-')> <('-'<) ^(' - ')^ <('-'<) (>'-')> You can't stop the kirby dance. 

4770k | Gigabyte GTX 970 Mini | Lian Li PC-TU100B | MSI Z87I 2x8GB G.Skill Sniper | Noctua NH-L9i Silverstone Strider 450W SFX | Windows 10 | 2x 250GB 840 Evo RAID 0 1x 1TB WD 2.5" | 25% gaming, 25% CAD and rendering, and 50% web browsing.

1MHz = 1,000,000 calculations, so 1260MHz = 1,260,000,000 calculations

The problem with that is there's a core clock and a memory clock, and from what I've read the average GPU can do 4 billion calculations a second, i.e. 4 GHz, but no GPU's core clock runs at that high a frequency.

| Cooler Master HAF 912 Plus | MSI P67A-G45 | i5 2500K @ 4.2GHz |  Coolermaster Hyper 212 Plus | EVGA GTX 670 FTW Edition 2GB | 8GB (2X4GB) Mushkin Blackline @ 1600MHz | 256GB OCZ Vertex 4 SSD | 1TB Western Digital Caviar Green | Corsair 600CX V2 | Windows 7 64-bit |


Just a guess, but I'd say it's the same as with CPUs: it's measured by how many FLOPS it can perform.

 

In the case of a GTX 680 (http://en.wikipedia.org/wiki/GeForce_600_Series#GeForce_600_.286xx.29_series), it can perform 3090.4 GFLOPS.

My question would be: is it measurable in anything simpler than FLOPS? I was planning to use it in an algorithm for a program, and in order to do that I need to be able to work out how long it will take to do all the calculations to find x.

| Cooler Master HAF 912 Plus | MSI P67A-G45 | i5 2500K @ 4.2GHz |  Coolermaster Hyper 212 Plus | EVGA GTX 670 FTW Edition 2GB | 8GB (2X4GB) Mushkin Blackline @ 1600MHz | 256GB OCZ Vertex 4 SSD | 1TB Western Digital Caviar Green | Corsair 600CX V2 | Windows 7 64-bit |


 

 

The problem with that is there's a core clock and a memory clock, and from what I've read the average GPU can do 4 billion calculations a second, i.e. 4 GHz, but no GPU's core clock runs at that high a frequency.

It was just a general assumption.

Available from 3pm to Midnight Eastern Time (GMT-5). (>'-')> <('-'<) ^(' - ')^ <('-'<) (>'-')> You can't stop the kirby dance. 

4770k | Gigabyte GTX 970 Mini | Lian Li PC-TU100B | MSI Z87I 2x8GB G.Skill Sniper | Noctua NH-L9i Silverstone Strider 450W SFX | Windows 10 | 2x 250GB 840 Evo RAID 0 1x 1TB WD 2.5" | 25% gaming, 25% CAD and rendering, and 50% web browsing.

1MHz = 1,000,000 calculations, so 1260MHz = 1,260,000,000 calculations

 

Hz is not a unit for calculations; it's a generic unit for frequency. It describes the operating frequency of the GPU cores, but there are thousands of cores, and they perform a very large number of calculations each cycle.


My question would be: is it measurable in anything simpler than FLOPS? I was planning to use it in an algorithm for a program, and in order to do that I need to be able to work out how long it will take to do all the calculations to find x.

 

Not really xD That is the number of floating point operations it performs per second...


My question would be: is it measurable in anything simpler than FLOPS? I was planning to use it in an algorithm for a program, and in order to do that I need to be able to work out how long it will take to do all the calculations to find x.

Well, if the instructions in that calculation all take as long as a FLOP, then you can calculate the time from the FLOPS figure in the specs.
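For example (made-up numbers): if solving for x takes about 3 × 10^12 floating-point operations and the card is rated at 3000 GFLOPS (3 × 10^12 per second), the best case is roughly one second; memory access and scheduling will make the real figure worse.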

On the other hand, you could just write the code for the calculation and do something like this:

// pseudocode
double start = getMilliSeconds();                 // current time in milliseconds
doCalculation();                                  // solve for x
double stop = getMilliSeconds();
double timeToPerformCalculation = stop - start;   // in milliseconds

to find the time. If you are using Matlab, you can use 'tic' before doCalculation() and 'toc' after; this will print the time taken to the console.

 

Keep in mind that this time is not set in stone, unless you have hardware-level access to the cores (which you absolutely don't). The time it takes to perform your calculation depends on the scheduling algorithm used by the OS your application is running on.
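If you want something concrete to run, here's a minimal sketch of that pseudocode in C. It assumes a POSIX system for clock_gettime(), and doCalculation() here is just a dummy loop standing in for whatever actually solves for x:

// Minimal timing sketch in C (assumes POSIX clock_gettime()).
#include <stdio.h>
#include <time.h>

// Current time in milliseconds, read from a monotonic clock.
static double getMilliSeconds(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1.0e6;
}

// Dummy stand-in for the real work ("solve for x").
static void doCalculation(void) {
    volatile double x = 0.0;               // volatile keeps the loop from being optimized away
    for (long i = 1; i <= 10000000L; i++)
        x += 1.0 / (double)i;              // some floating-point busywork
}

int main(void) {
    double start = getMilliSeconds();
    doCalculation();
    double stop = getMilliSeconds();
    printf("timeToPerformCalculation: %.3f ms\n", stop - start);
    return 0;
}

Run it a few times and take the best result; as noted above, the OS scheduler will add noise to any single measurement.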


Not really xD That is the number of floating point operations it performs per second...

The thing I'm finding difficult to come to terms with is: what is a floating point operation?

| Cooler Master HAF 912 Plus | MSI P67A-G45 | i5 2500K @ 4.2GHz |  Coolermaster Hyper 212 Plus | EVGA GTX 670 FTW Edition 2GB | 8GB (2X4GB) Mushkin Blackline @ 1600MHz | 256GB OCZ Vertex 4 SSD | 1TB Western Digital Caviar Green | Corsair 600CX V2 | Windows 7 64-bit |


The thing I'm finding difficult to come to terms with is: what is a floating point operation?

A FLOP is one single operation on a floating point number. Floating point numbers are a way of representing decimal numbers in binary.


Well, if the instructions in that calculation all take as long as a FLOP, then you can calculate the time from the FLOPS figure in the specs.

On the other hand, you could just write the code for the calculation and do something like this:

// pseudocode
double start = getMilliSeconds();                 // current time in milliseconds
doCalculation();                                  // solve for x
double stop = getMilliSeconds();
double timeToPerformCalculation = stop - start;   // in milliseconds

to find the time. If you are using Matlab, you can use 'tic' before doCalculation() and 'toc' after; this will print the time taken to the console.

 

Keep in mind that this time is not set in stone, unless you have hardware-level access to the cores (which you absolutely don't). The time it takes to perform your calculation depends on the scheduling algorithm used by the OS your application is running on.

Problem is, I don't plan to do the calculations myself. To be specific, I want to try to create a hybrid of a system such as howsecureismypassword.net, and I don't plan to run the calculation to work out how long it will take, as that can take pretty long...

| Cooler Master HAF 912 Plus | MSI P67A-G45 | i5 2500K @ 4.2GHz |  Coolermaster Hyper 212 Plus | EVGA GTX 670 FTW Edition 2GB | 8GB (2X4GB) Mushkin Blackline @ 1600MHz | 256GB OCZ Vertex 4 SSD | 1TB Western Digital Caviar Green | Corsair 600CX V2 | Windows 7 64-bit |


A FLOP is one single operation on a floating point number. Floating point numbers are a way of representing decimal numbers in binary.

Ahh I can understand that! Thanks for the help!

| Cooler Master HAF 912 Plus | MSI P67A-G45 | i5 2500K @ 4.2GHz |  Coolermaster Hyper 212 Plus | EVGA GTX 670 FTW Edition 2GB | 8GB (2X4GB) Mushkin Blackline @ 1600MHz | 256GB OCZ Vertex 4 SSD | 1TB Western Digital Caviar Green | Corsair 600CX V2 | Windows 7 64-bit |


Problem is, I don't plan to do the calculations myself. To be specific, I want to try to create a hybrid of a system such as howsecureismypassword.net, and I don't plan to run the calculation to work out how long it will take, as that can take pretty long...

Why is it so important that you know how long the calculation takes to execute?


The problem with that is there's a core clock and a memory clock, and from what I've read the average GPU can do 4 billion calculations a second, i.e. 4 GHz, but no GPU's core clock runs at that high a frequency.

I think you mean 4 million transistors?

I am pretty sure this is WRONG, but try this: multiply cores by speed (speed as in 1 MHz = 1,000,000 calculations, times the number of cores).

Yeah....... This is soo wrong :P

Double check everything, I am usually wrong.


How about working it backwards: in order for the GPU to kick out 60 Hz (60 fps) at 1080p, it must do the following.

1920 x 1080 = 2,073,600

Now multiply that by 3, since that is the number of sub-pixels per pixel:

= 6,220,800

Now multiply by 60 for 60 fps:

= 373,248,000

So that is the number of bits the GPU must kick out if it is running 60 fps at 1080p.

Again this is probably wrong :P

Double check everything, I am usually wrong.


I think you mean 4 million transistors?

I am pretty sure this is WRONG, but try this: multiply cores by speed (speed as in 1 MHz = 1,000,000 calculations, times the number of cores).

Yeah....... This is soo wrong :P

 

There's something like instructions per cycle (IPC); otherwise an i3 wouldn't destroy an FX-9590 in gaming.

So... If Jesus had the gold, would he buy himself out instead of waiting 3 days for the respawn?

CPU: Phenom II x6 1045t ][ GPU: GeForce 9600GT 512mb DDR3 ][ Motherboard: Gigabyte GA-MA770T-UD3P ][ RAM: 2x4GB Kingston 1333MHz CL9 DDR3 ][ HDD: Western Digital Green 2TB ][ PSU: Chieftec 500AB A ][ Case: No-name without airflow or dust filters Budget saved for an upgrade so far: 2400PLN (600€) - Initial 2800PLN (700€) Upgraded already: CPU


There's something like instructions per cycle (IPC); otherwise an i3 wouldn't destroy an FX-9590 in gaming.

Something like that?

Double check everything, I am usually wrong.


How about working it backwards: in order for the GPU to kick out 60 Hz (60 fps) at 1080p, it must do the following.

1920 x 1080 = 2,073,600

Now multiply that by 3, since that is the number of sub-pixels per pixel:

= 6,220,800

Now multiply by 60 for 60 fps:

= 373,248,000

So that is the number of bits the GPU must kick out if it is running 60 fps at 1080p.

Again this is probably wrong :P

 

 

:D

So... If Jesus had the gold, would he buy himself out instead of waiting 3 days for the respawn?

CPU: Phenom II x6 1045t ][ GPU: GeForce 9600GT 512mb DDR3 ][ Motherboard: Gigabyte GA-MA770T-UD3P ][ RAM: 2x4GB Kingston 1333MHz CL9 DDR3 ][ HDD: Western Digital Green 2TB ][ PSU: Chieftec 500AB A ][ Case: No-name without airflow or dust filters Budget saved for an upgrade so far: 2400PLN (600€) - Initial 2800PLN (700€) Upgraded already: CPU


 

:D

Okay, a simple "Wrong" would have done just fine :)

Double check everything, I am usually wrong.


Okay, a simple "Wrong" would have done just fine :)

 

Didn't mean to offend, I just like this movie and that exact quote. :D

So... If Jesus had the gold, would he buy himself out instead of waiting 3 days for the respawn?

CPU: Phenom II x6 1045t ][ GPU: GeForce 9600GT 512mb DDR3 ][ Motherboard: Gigabyte GA-MA770T-UD3P ][ RAM: 2x4GB Kingston 1333MHz CL9 DDR3 ][ HDD: Western Digital Green 2TB ][ PSU: Chieftec 500AB A ][ Case: No-name without airflow or dust filters Budget saved for an upgrade so far: 2400PLN (600€) - Initial 2800PLN (700€) Upgraded already: CPU


Alright, seeing that a couple of people are quite confused about what FLOPS actually means and how you can calculate it, please allow me to shed some light.

 

Consider that you have a single-core CPU running at 2 GHz (2,000,000,000 Hz). 2 GHz means that the clock on this CPU will do a 0-to-1 transition (a tick) two billion (giga) times per second (Hz).

Now, it might be easy to think that 2 GHz = 2 GFLOPS. The truth, however, is a bit more complicated. A CPU can't do an instruction (in this case, a calculation) in just one tick; there are multiple steps involved in making it happen. For example: you need to fetch the instruction, decode it, fetch the operands, perform the operation, store the result,... Now, let's assume our 2 GHz single-core CPU needs 5 ticks to perform one instruction cycle. This means our actual rate will be (2 Gticks/s) / (5 ticks/cycle) = 400 Mcycles/s. Assuming an instruction takes one cycle to execute, this gives us 400 MFLOPS.

 

Now, consider that we have a dual-core CPU of the same architecture and clock speed. Each core would be able to perform 400 MFLOPS, as calculated above. The total performance comes to 2 × ((2 Gticks/s) / (5 ticks/cycle)) = 800 MFLOPS. This is the same formula as in the Wikipedia article mentioned by @T.Vengeance in his post above.

 

If we apply the same logic to a GPU running at 1 GHz, with 1000 cores and 5 ticks/cycle, this gives us 1000 cores × 1 GHz / (5 ticks/cycle) = 200 GFLOPS. This is why a GPU is so much more powerful than a CPU when it comes to pure number crunching: each core runs slower and is way less complex than a CPU core, but you have 1000 cores all working together.
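To put that formula in code, here's a small C sketch. The core counts, clock speeds, and ticks-per-cycle below are the made-up example numbers from this post, not real hardware specs:

// Theoretical peak from the formula above:
//   FLOPS = cores * clock (ticks/s) / (ticks per instruction cycle),
// assuming one floating-point operation per instruction cycle.
#include <stdio.h>

static double theoretical_flops(double cores, double clock_hz, double ticks_per_cycle) {
    return cores * clock_hz / ticks_per_cycle;
}

int main(void) {
    printf("1-core CPU @ 2 GHz:    %.0f MFLOPS\n", theoretical_flops(1.0, 2e9, 5.0) / 1e6);    // 400
    printf("2-core CPU @ 2 GHz:    %.0f MFLOPS\n", theoretical_flops(2.0, 2e9, 5.0) / 1e6);    // 800
    printf("1000-core GPU @ 1 GHz: %.0f GFLOPS\n", theoretical_flops(1000.0, 1e9, 5.0) / 1e9); // 200
    return 0;
}

Real chips complicate this with pipelining, SIMD, and fused multiply-add (which is why spec sheets often count 2 FLOPs per core per clock), but the cores × rate idea is the same.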


How about working it backwards: in order for the GPU to kick out 60 Hz (60 fps) at 1080p, it must do the following.

1920 x 1080 = 2,073,600

Now multiply that by 3, since that is the number of sub-pixels per pixel:

= 6,220,800

Now multiply by 60 for 60 fps:

= 373,248,000

So that is the number of bits the GPU must kick out if it is running 60 fps at 1080p.

Again this is probably wrong :P

Actually, there's quite a bit of sense to that, although the number of bits is actually 8 times higher, because every pixel is set with 3 bytes (one byte for every sub-pixel).

 

So the actual data rate is 1920 × 1080 × 60 × 3 × 8 = 2,985,984,000 bits per second, or almost 3 Gb/s (about 373 MB/s).
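Here's the same sum as a tiny C sketch, in case anyone wants to plug in other resolutions or refresh rates:

// Raw frame-buffer data rate: width * height * fps * 3 bytes per pixel (one per sub-pixel).
#include <stdio.h>

int main(void) {
    long width = 1920, height = 1080, fps = 60, bytes_per_pixel = 3;
    long bytes_per_second = width * height * fps * bytes_per_pixel;   // 373,248,000
    printf("%ld bytes/s = about %.0f MB/s = about %.2f Gb/s\n",
           bytes_per_second,
           bytes_per_second / 1e6,
           bytes_per_second * 8.0 / 1e9);   // 8.0: convert to bits in double to avoid overflow
    return 0;
}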

