12 Replies - 2396 Views - Last Post: 20 November 2012 - 08:33 PM Rate Topic: -----

#1 kiwi_steve  Icon User is offline

  • D.I.C Head

Reputation: 31
  • View blog
  • Posts: 109
  • Joined: 26-September 09

Prime test for the sieve of eratosthenes

Posted 19 November 2012 - 09:51 PM

I'm just tinkering with this method of making a list of prime numbers (as part of working on a SPOJ problem). I'm making a list of numbers, then setting the ones that aren't prime to zero (since removing them would be very time consuming on a list structure), then making a new list of the non-zero items to return when its done.

To check if the number is prime, I'm using the following sub-function:

def check_prime(num):
    count = 0
    for test in range(3, num, 2):
        if not test % num:
            count += 1
    return count == 0


which I also rewrote as a simple list comprehension:

def check_prime2(num):
    test = [x for x in range(3, num, 2) if not x % num]
    return len(test) == 0


And in testing the difference in speed of the two functions is negligible - which is interesting because I was expecting the LC method to be a little faster.

I've never written a sieve before, so I could be doing it all wrong - I'm still tinkering with it. The main code is here:

import math

def primes(max):
    limit = int(math.ceil(math.sqrt(max)))
    sieve = [x for x in range(3, max + 1, 2)]
    #print sieve
    for num in range(3, limit, 2):
        if sieve[(num - 3) / 2] and check_prime(num):  # If its prime, knock out all multiples
                for i in range(num * 2, max + 1, num):
                    if i % 2:
                        sieve[(i - 3) / 2] = 0
    final = [x for x in sieve if x > 0]
    return final



It doesn't bother making the first list (sieve) with any even numbers in it, and in its current state it also doesn't add the '2' to the start (this gets missed to avoid having any even numbers in the list at all - still thinking of a neat way to add this before the LC creates the list otherwise it needs to shift the entire list just to stick it in there). At this point its more about timing it and speeding it up rather than returning a correct list.

Yes, I could Google it - I'm more interested in opinions and ideas than someone elses code. Can anyone poke holes in how I've done it? Constructive criticism would be great, cheers

Steve

Is This A Good Question/Topic? 0
  • +

Replies To: Prime test for the sieve of eratosthenes

#2 jon.kiparsky  Icon User is online

  • Pancakes!
  • member icon


Reputation: 7752
  • View blog
  • Posts: 13,110
  • Joined: 19-March 11

Re: Prime test for the sieve of eratosthenes

Posted 19 November 2012 - 10:24 PM

Here's one thing I notice.


def check_prime(num):
    count = 0
    for test in range(3, num, 2):
        if not test % num:
            count += 1  # When this happens, you know you can return False, right?
    return count == 0   # so if you get to here, then, return True




def check_prime2(num):
    test = [x for x in range(3, num, 2) if not x % num]
    return len(test) == 0



Interesting use of a listcomp here, but it's a little weird since you're not actually interested in the list. You might also think of using a filter here, but I would think that the for loop offers you a real advantage with the early exit. I imagine you could define a higher-ordered "exists(l,p)" function that would return true if it discovers am element of list l which has property p, but it would probably look a lot like your for loop. (if I'm wrong, of course, I'd love to know about it)

Quote

And in testing the difference in speed of the two functions is negligible - which is interesting because I was expecting the LC method to be a little faster.


I would actually expect that implementing an early exit might make the for variant marginally faster. How high did you test?

There's an inversion of the sieve that you can use, where you build the list of primes as you discover them, and then you only test with numbers in the list of primes you've already found. You might play around with that, just as a fun variant.
Was This Post Helpful? 2
  • +
  • -

#3 kiwi_steve  Icon User is offline

  • D.I.C Head

Reputation: 31
  • View blog
  • Posts: 109
  • Joined: 26-September 09

Re: Prime test for the sieve of eratosthenes

Posted 19 November 2012 - 11:04 PM

View Postjon.kiparsky, on 20 November 2012 - 05:24 PM, said:

if not test % num:
            count += 1  # When this happens, you know you can return False, right?


ARGH! Of course... I totally missed that. Thanks for pointing it out :)

Quote

There's an inversion of the sieve that you can use, where you build the list of primes as you discover them, and then you only test with numbers in the list of primes you've already found. You might play around with that, just as a fun variant.


I'll look into that once I get this one flying :)

Cheers
Was This Post Helpful? 0
  • +
  • -

#4 kiwi_steve  Icon User is offline

  • D.I.C Head

Reputation: 31
  • View blog
  • Posts: 109
  • Joined: 26-September 09

Re: Prime test for the sieve of eratosthenes

Posted 20 November 2012 - 12:18 AM

I'm testing using time.time(). On 20 iterations of primes(10000000) the average time for check_prime() is 6.744 seconds. Modified to return early the time increased to 6.761 seconds! The list comprehension took 6.773 seconds.

Overall I don't see any noticeable difference, and I'm wondering if the interpreter has already worked out it can return early - either that or the function isn't being called often enough to make that much difference. I'm guessing that the higher and higher it gets, the more the first half of the test (which is checking to see if the value has already been zeroed) is tripping that test out. Even just taking out the multiples of 2, 3 and 5 reduces the size of the list by around about 75%:

len([x for x in range(10000000) if (x % 2 and x % 3 and x % 5)])
2666666


I'll do some more tests and timings to see whats happening there...

Cheers

Steve

This post has been edited by kiwi_steve: 20 November 2012 - 12:23 AM

Was This Post Helpful? 0
  • +
  • -

#5 sepp2k  Icon User is offline

  • D.I.C Lover
  • member icon

Reputation: 2117
  • View blog
  • Posts: 3,242
  • Joined: 21-June 11

Re: Prime test for the sieve of eratosthenes

Posted 20 November 2012 - 02:35 AM

View Postkiwi_steve, on 20 November 2012 - 05:51 AM, said:

To check if the number is prime, I'm using the following sub-function:


If you're using a function to check each number for primality individually, you're not implementing a sieve. The Sieve of Eratosthenes works by crossing out (i.e. setting to False (or 0, I guess, in your case)) all multiples of the current prime and then continuing with the next number that isn't crossed out.

View Postjon.kiparsky, on 20 November 2012 - 06:24 AM, said:

Interesting use of a listcomp here, but it's a little weird since you're not actually interested in the list. You might also think of using a filter here, but I would think that the for loop offers you a real advantage with the early exit. I imagine you could define a higher-ordered "exists(l,p)" function that would return true if it discovers am element of list l which has property p, but it would probably look a lot like your for loop. (if I'm wrong, of course, I'd love to know about it)


Actually such a function already kind of exists (though technically it only iterates over an iterable and checks whether any element is True, but it's used much in the same way you're envisioning). It's called any. However in this case I would rather use all (which is the obvious counter-part). That would look like this:

return all(num % x > 0 for x in range(3, num, 2))


Btw: The original test was wrong: x % num will never be 0 when x is a number smaller than num. That's also why returning early makes no difference: The test only ever succeeds at the end, so there's no opportunity to return early. (Which just goes to show that optimizing for speed before "optimizing" for correctness, doesn't necessarily produce meaningful results.)

This post has been edited by sepp2k: 20 November 2012 - 02:37 AM

Was This Post Helpful? 1
  • +
  • -

#6 kiwi_steve  Icon User is offline

  • D.I.C Head

Reputation: 31
  • View blog
  • Posts: 109
  • Joined: 26-September 09

Re: Prime test for the sieve of eratosthenes

Posted 20 November 2012 - 03:45 AM

View Postsepp2k, on 20 November 2012 - 09:35 PM, said:

View Postkiwi_steve, on 20 November 2012 - 05:51 AM, said:

To check if the number is prime, I'm using the following sub-function:


If you're using a function to check each number for primality individually, you're not implementing a sieve. The Sieve of Eratosthenes works by crossing out (i.e. setting to False (or 0, I guess, in your case)) all multiples of the current prime and then continuing with the next number that isn't crossed out.


Yes, which is why I only test for primality if the number hasn't already been zeroed. Its the first test in line 8 of the main function.

Quote

Btw: The original test was wrong: x % num will never be 0 when x is a number smaller than num. That's also why returning early makes no difference: The test only ever succeeds at the end, so there's no opportunity to return early. (Which just goes to show that optimizing for speed before "optimizing" for correctness, doesn't necessarily produce meaningful results.)


Oops :whistling: yes, got that around the wrong way... I'll rewrite that in the morning when my headache goes away... I've managed to pick up a cold right on the start of summer which is making my head hurt a little so I think I'll try to get some sleep before working on this any more...

Cheers

Steve
Was This Post Helpful? 0
  • +
  • -

#7 sepp2k  Icon User is offline

  • D.I.C Lover
  • member icon

Reputation: 2117
  • View blog
  • Posts: 3,242
  • Joined: 21-June 11

Re: Prime test for the sieve of eratosthenes

Posted 20 November 2012 - 04:24 AM

View Postkiwi_steve, on 20 November 2012 - 11:45 AM, said:

View Postsepp2k, on 20 November 2012 - 09:35 PM, said:

If you're using a function to check each number for primality individually, you're not implementing a sieve. The Sieve of Eratosthenes works by crossing out (i.e. setting to False (or 0, I guess, in your case)) all multiples of the current prime and then continuing with the next number that isn't crossed out.


Yes, which is why I only test for primality if the number hasn't already been zeroed. Its the first test in line 8 of the main function.


But you shouldn't test for primality at all. If a number is reached and has not yet been crossed out, it is definitely a prime - no test necessary. That's why the fact that your primality test is broken and always returns True didn't break your program: It is only called for numbers that are prime anyway.

This post has been edited by sepp2k: 20 November 2012 - 04:25 AM

Was This Post Helpful? 3
  • +
  • -

#8 Nallo  Icon User is offline

  • D.I.C Regular
  • member icon

Reputation: 163
  • View blog
  • Posts: 255
  • Joined: 19-July 09

Re: Prime test for the sieve of eratosthenes

Posted 20 November 2012 - 07:03 AM

If it is about speed. A few points:

filter None is faster than listcomprehension with a condition:
Not
...
    sieve[(i - 3) / 2] = 0
final = [x for x in sieve if x > 0]


Faster:
...
    sieve[(i - 3) / 2] = 0
final = filter(None, sieve)



Edited, made a bad comment before.
Edit2: OOps. Jon.Kiparsky already made the argument about filter. Though it was for another part of the program.

This post has been edited by Nallo: 20 November 2012 - 07:59 AM

Was This Post Helpful? 2
  • +
  • -

#9 Nallo  Icon User is offline

  • D.I.C Regular
  • member icon

Reputation: 163
  • View blog
  • Posts: 255
  • Joined: 19-July 09

Re: Prime test for the sieve of eratosthenes

Posted 20 November 2012 - 07:40 AM

I don't like this part:
if sieve[(num - 3) / 2]:  # If its prime, knock out all multiples
        for i in range(num * 2, max + 1, num):
            if i % 2:
                sieve[(i - 3) / 2] = 0


The loop and index computation feels like a real time hog. You have to do it as you decided to have only the odd numbers in your sieve.

But if your sieve started with all numbers including 0 you simply could use slice assignment. Slice assignemet is fast:
sieve = range(max + 1)
...
#some code
...
if sieve[num]:
    number_of_num_multiples = max((len(sieve) - 1)/ num - 1, 0)
    sieve[num * 2::num] = [None] * number_of_num_multiples


Feels to me that taking out the multiples of 2 at the start costs you more than you gain.
Was This Post Helpful? 1
  • +
  • -

#10 kiwi_steve  Icon User is offline

  • D.I.C Head

Reputation: 31
  • View blog
  • Posts: 109
  • Joined: 26-September 09

Re: Prime test for the sieve of eratosthenes

Posted 20 November 2012 - 12:32 PM

View Postsepp2k, on 20 November 2012 - 11:24 PM, said:

View Postkiwi_steve, on 20 November 2012 - 11:45 AM, said:

View Postsepp2k, on 20 November 2012 - 09:35 PM, said:

If you're using a function to check each number for primality individually, you're not implementing a sieve. The Sieve of Eratosthenes works by crossing out (i.e. setting to False (or 0, I guess, in your case)) all multiples of the current prime and then continuing with the next number that isn't crossed out.


Yes, which is why I only test for primality if the number hasn't already been zeroed. Its the first test in line 8 of the main function.


But you shouldn't test for primality at all. If a number is reached and has not yet been crossed out, it is definitely a prime - no test necessary. That's why the fact that your primality test is broken and always returns True didn't break your program: It is only called for numbers that are prime anyway.


:stupid: I misread the pseudocode algorithm I used to first understand the sieve. I thought it meant I had to test the first number I came to that hadn't been removed in order to ensure it was prime... I can see now the futility of doing that (at least in low-order numbers - I'll play with that some more to get it clear in my own mind). Thanks for patiently pointing that out until I finally got it :)

View PostNallo, on 21 November 2012 - 02:03 AM, said:

If it is about speed. A few points:

filter None is faster than listcomprehension with a condition


I've never used filter before... it almost seems like cheating :) but then we're not using C, so I'll give it a whirl...

Steve
Was This Post Helpful? 0
  • +
  • -

#11 jon.kiparsky  Icon User is online

  • Pancakes!
  • member icon


Reputation: 7752
  • View blog
  • Posts: 13,110
  • Joined: 19-March 11

Re: Prime test for the sieve of eratosthenes

Posted 20 November 2012 - 12:35 PM

Python's no fun unless you use the higher-order stuff. Go for it.
Was This Post Helpful? 0
  • +
  • -

#12 Nallo  Icon User is offline

  • D.I.C Regular
  • member icon

Reputation: 163
  • View blog
  • Posts: 255
  • Joined: 19-July 09

Re: Prime test for the sieve of eratosthenes

Posted 20 November 2012 - 05:35 PM

View Postkiwi_steve, on 20 November 2012 - 07:32 PM, said:

:stupid:/>/>


Nah, you are not. You have to start somewhere and improve :smile2:/>/>

A few years ago I played around with prime number generators and not all efforts (read: very few) were beautiful and/or efficient :withstupid:/>/>/>

This post has been edited by Nallo: 20 November 2012 - 05:36 PM

Was This Post Helpful? 0
  • +
  • -

#13 kiwi_steve  Icon User is offline

  • D.I.C Head

Reputation: 31
  • View blog
  • Posts: 109
  • Joined: 26-September 09

Re: Prime test for the sieve of eratosthenes

Posted 20 November 2012 - 08:33 PM

View PostNallo, on 21 November 2012 - 12:35 PM, said:

View Postkiwi_steve, on 20 November 2012 - 07:32 PM, said:

:stupid:


Nah, you are not. You have to start somewhere and improve :smile2:

Which would be all good if I was just starting - but it was a simple case of RTFM (or not, as the case may be) :turned:
Was This Post Helpful? 0
  • +
  • -

Page 1 of 1