Is there a way to make this code even faster?(C++)

pipnina · November 4, 2016

I'm making a prime number generator, my lecturer has said that the fastest code gets £20 (GBP) of his own money (With the catch that he's also in the contest).

I think my code is pretty quick, but wondered if there were any further ways to optimize it.

    //Determining if a number is prime
    for(int primeSearch=3; primeSearch<COUNT_TO; primeSearch+=2){  //Because we skip even numbers, we know we don't need to test the prime candidates by dividing them by two or ten.
        bool primeFound = false;
        for(unsigned int vectorScan=0; vectorScan<=primeVector.size(); vectorScan++){                   //For every prime number in the vector:
            if(primeSearch%primeVector[vectorScan]==0 && primeSearch!=primeVector[vectorScan]){break;}  //Make sure that the candidate is not divisible by any of the previous primes.
            if(primeVector[vectorScan]>(primeSearch/2)){                                                //If the prime number isn't disproved by the time we've started dividing by-
                primeFound=true;                                                                        //-more than half of the candidate's value, we assume it's prime.
                break;
            }
        }
        if(primeFound == true){
            primeVector.push_back(primeSearch);                                                        //Add the prime number to the list
        }
    }

Currently it executes between 100 and 200 milliseconds when generating prime numbers below a value of 100'000 (In release mode, on a 6700k)

Thanks

~pip

Cryosec · November 4, 2016

I'm thinking multithreading, but I can't help code-wise in this case

Nineshadow · November 4, 2016

Sieve of Eratosthenes. O(n loglog n). There are many ways to improve the efficiency afterwards.

espurritado · November 4, 2016

when you get to the point where you say it's prime here:

58 minutes ago, pipnina said:

if(primeVector[vectorScan]>(primeSearch/2))

you shouldn't compare to primeSearch/2, instead compare to the square root of primeSearch.

Also

55 minutes ago, Cryosec said:

I'm thinking multithreading, but I can't help code-wise in this case

multi-threading wouldn't be an option in this case since you are dependent of the previous execution of the loop in order to execute the next one, if you use multi-threading you might skip a prime, or wrongly categorize a number as prime, specially early on.

Cryosec · November 4, 2016

4 minutes ago, espurritado said:

multi-threading wouldn't be an option in this case since you are dependent of the previous execution of the loop in order to execute the next one, if you use multi-threading you might skip a prime, or wrongly categorize a number as prime, specially early on.

I've learned something new today

pipnina · November 4, 2016

5 minutes ago, espurritado said:

when you get to the point where you say it's prime here:

you shouldn't compare to primeSearch/2, instead compare to the square root of primeSearch.

Also

multi-threading wouldn't be an option in this case since you are dependent of the previous execution of the loop in order to execute the next one, if you use multi-threading you might skip a prime, or wrongly categorize a number as prime, specially early on.

Thanks for that one! I was just trying to tweak that primeSearch/2 value.

Nineshadow · November 4, 2016

30 minutes ago, gabrielcarvfer said:

@pipnina he asked for real primes? When calculating big primes (pseudoprimes), people usually use Fermat or Miller-Rabin/Solavay-Strassen to check for primality of big numbers.

You can use multithreading if checking for real primes, but not in a general way, as espurritado just mentioned, because there is data dependency. You could, for example, use a single thread to find early primes, and then use multiple thread to check for later primes, as the density of primes drops really fast as the number increases. The only complication there would be making a thread that is holding a higher value than other threads until all lower value threads finish, so that you can guarantee the results.

If using Fermat, or other primality test alike, you can have tons of threads calculating pseudoprimes.

I think he's just generating all prime numbers in a set range. If that is the case, I'd rather use the sieve of Eratosthenes. It's simple not only in concept but also in implementation. Read about it here : https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes .

#include <bits/stdc++.h>
static const int N=0x989680;
bool p[N+1];
int main() {

    std::fill(std::begin(p)+2,std::end(p),1);
    for(int i=2;i<=N;i++)
        if(p[i])
            for(int k=2;k*i<=N;++k)
                p[k*i]=0;
}

This generates all prime numbers smaller than 10'000'000 in around 120 milliseconds for me. For a range of up to 100'000, it does it in about 0.5 milliseconds. I'm working on an i7 4710HQ.

The algorithm has a complexity of O(nloglogn), but memory is O(n).

Anyway, you can further optimize it in pretty obvious manners, although they are mostly unnecessary for what you're trying to do, I think.

Firstly, any even number with the exception of 2 isn't prime, that's pretty obvious.

One technique often used in algorithmic problems is splitting the workload in segments of sqrt n. This reduces memory usage to O(sqrt n) and increases performance if we choose the sizes of our segments accordingly, that is if they can fit into the CPU's cache memory, and not RAM.

Another technique we could use here is called wheel factorization, which is basically used to skip multiples of small primes. You can, for example, use a modulo 210 wheel to skip multiples of 2,3,5 and 7. Read more about it here : https://en.wikipedia.org/wiki/Wheel_factorization

Parallelization is yet another thing that could improve the execution time.

wanderingfool2 · November 5, 2016

7 hours ago, pipnina said:

Thanks for that one! I was just trying to tweak that primeSearch/2 value.

It's been a while since programming for me, so excuse me if I mess up my thoughts a bit.

While it is true that sqrt(primeSearch) > p(i) (where p(i) is some prime) is a good way to detect that you don't need to do any more calculations, remember not to put sqrt(primSearch) into the inner for loop.

Consider the following

int z, i;
for(z=0; z < 10000; z++) {
    for(i=0; sqrt(z) > i; ++i) { //sqrt(z) gets evaluated every single time for i, despite it effectively being static
  		//Some code here
    }
}
  
//Now consider the following
int z, i, sqt;
for(z=0; z < 10000; z++) {
    sqt = (int) sqrt(z);
    for(i=0; sqt > i; ++i) { //Only evaluated once
  		//Some code here
    }
}

Actually, someone please correct me if I am wrong, but doing a nested for loop with for(int i....is that more, less, or equivalent efficiency to declaring the variables outside of the loop?

mariushm · November 5, 2016

Well you don't say the exact way the problem is stated so depending on restrictions you could be very clever.

For example, if the user will only ever enter a number between 0 and 100k to test your app, you could just "cache" all the prime numbers in an array in your application... the list of these prime numbers if 58 KB in nice formatted html page: http://primos.mat.br/primeiros_10000_primos.txt but you could just define a constant array of numbers and type each number in its hexadecimal notation ex 0x71EF for 29167

The lookup in the array is going to be faster than determining if the number is prime or not.

For higher numbers, you could look into OpenCL ... basically testing multiple numbers in parallel to see if they're prime or not, by getting all those hundreds of compute units to work.

Cellebyte · November 5, 2016

Using bitshifts it is a lot faster then your Implementation. it is the implementation of Sieve of Eratosthenes:

On 4.11.2016 at 8:48 PM, Nineshadow said:
I'd rather use the sieve of Eratosthenes. It's simple not only in concept but also in implementation. Read about it here : https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes .
#include <bits/stdc++.h>
static const int N=0x989680;
bool p[N+1];
int main() {

    std::fill(std::begin(p)+2,std::end(p),1);
    for(int i=2;i<=N;i++)
        if(p[i])
            for(int k=2;k*i<=N;++k)
                p[k*i]=0;
}

straight_stewie · November 6, 2016

On 11/4/2016 at 0:17 PM, pipnina said:

I'm making a prime number generator, my lecturer has said that the fastest code gets £20 (GBP) of his own money (With the catch that he's also in the contest).

I think my code is pretty quick, but wondered if there were any further ways to optimize it.


    //Determining if a number is prime
    for(int primeSearch=3; primeSearch<COUNT_TO; primeSearch+=2){  //Because we skip even numbers, we know we don't need to test the prime candidates by dividing them by two or ten.
        bool primeFound = false;
        for(unsigned int vectorScan=0; vectorScan<=primeVector.size(); vectorScan++){                   //For every prime number in the vector:
            if(primeSearch%primeVector[vectorScan]==0 && primeSearch!=primeVector[vectorScan]){break;}  //Make sure that the candidate is not divisible by any of the previous primes.
            if(primeVector[vectorScan]>(primeSearch/2)){                                                //If the prime number isn't disproved by the time we've started dividing by-
                primeFound=true;                                                                        //-more than half of the candidate's value, we assume it's prime.
                break;
            }
        }
        if(primeFound == true){
            primeVector.push_back(primeSearch);                                                        //Add the prime number to the list
        }
    }

Currently it executes between 100 and 200 milliseconds when generating prime numbers below a value of 100'000 (In release mode, on a 6700k)

Thanks

~pip

This is essentially somewhat optimized trial division. Another optimization that you can make here, which I think that @espurritado mentioned was that you only need to check up to the square root of the maximum number in your range. This is because that, for every number n, there is a number F₁ < √n such that there is a corresponding multiple of F₁, √n < F₂< n.

This is a really complicated way of saying: For every factor of a number below it's square root, there is exactly one and only one corresponding factor above that numbers square root. There are no factors on either side of a numbers square root that do not have a corresponding factor on the other side of the root. The factor higher than the square root may be calculated as follows: F₂= n / F₁ Where F₁is the smaller factor.

Beyond that, it's generally considered that the fastest way to generate primes is a Sieve of Atkin, but good luck with that. The next best way is a Sieve of Eratosthenes.

Sign In

Is there a way to make this code even faster?(C++)

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Link to comment

Share on other sites

Link to post

Share on other sites

Create an account or sign in to comment

Create an account

Sign in

Featured Topics

Topics

Latest From Linus Tech Tips:

I don’t want to clean this.

Latest From Tech Quickie:

Ethernet Is Named After Something Really Dumb (and other tech stories)

Latest From TechLinked:

I Never Thought I’d See This…

Latest From GameLinked:

Will Assassin's Creed Shadows be good? - Gamers 2 Gamers

Latest From ShortCircuit:

I think y’all overlooked something… - Pixel 8a

Latest From Mac Address:

Why did you buy an Apple Vision Pro?

Latest From Channel Super Fun:

I Swapped the CEO's Assistant For a Day!