Subscribe to ...considered harmful

## How to think about finding primes

Finding primes is a common exercise in programming, and for good reason: it's a good exercise that combines mathematical thinking with programmatic thinking. The exercise comes in various forms, most commonly "find the nth prime" or "find the first n primes". Also common is "find all primes less than n". It turns out that these are all basically the same problem.

For a new programmer, particularly a new programmer not used to doing advanced math, this can be a tricky problem, and it's one of the most common questions I see here at DIC. So in this post, I'll try to give an overview of the mathematical thinking which underlies the problem. There will be no actual code here - that's your job, as the programmer - but I hope that working through the math will make it very obvious how the code needs to work. The math here is pretty much low-level number theory. I assume no knowledge of the theorems of number theory, just some basic arithmetic. I leave two problems as exercises. Depending on where you are with your math, these may be difficult or not so very much so, but they're meant to be solvable, so don't give up if you find they're tricky.

In particular, I'm going to consider the workings of a function that we can call isPrime(n). This function should return true if n is prime, and false otherwise. Here's how I think of it: By definition, a number n is prime if it has no divisors d such that 1 < p < n. We wish to test this by testing all relevant potential divisors d to see if they are actually divisors of n. The simplest way to do this is the exhaustive search, as suggested in this pseudocode:

```for each p in the range 2..n-1 (inclusive)
if n % p == 0, return False, since we have found a divisor of n, namely p, such that 1< d<p.
if no p causes us to return False, then return True, since we have tried all potential divisors and none were actually divisors

```

This turns out to be inefficient: we have to test all numbers from 2 through n-1, and we know that many of them can be rejected from consideration out of hand. For example, if n == 100, we know that no number in the range 50 .. 99 is a possible contender, since 2 * 50 = 100, and there is no integer less than 2. (so no divisor for 100 can exist in the range 50 < p < 100) More than this, we know that when we find a divisor p for n, we have also implicitly found a second divisor n = q/p. This is what it means to be a divisor. And we can pair these up in this way: we can generate a list [(p1, q1), (p2, q2)...] such that all pairs (p, q) where pq = n are represented. If we consider such a list, sorted such that the ps are in ascending order, we find there's a pattern to the qs: namely, they are in descending order.

If you look at this for a little while, you'll see that there's a "multiplicative midpoint" (m) for any number n, such that any pair of divisors (p,q) obey this inequality: p <=m, q >=m. Can you find that midpoint m? If so, you'll have found the limit for your exhaustive search: there is no need to test divisors greater than n, since any such divisor will be a q, and matched with another divisor p which you will have already found.

So all of this implies that your loop can be bounded as

```while p <= m:
```

I'll leave it to you to discover what m must be, because it's fun to find things like that and I don't want to spoil the fun.

Another inefficiency in the naive algorithm is that we test a lot of numbers we don't need to test, even in that range 1<p <=m

I claim that this theorem is true:

for any number n, if r divides n, and p divides r, then p divides n.

For example, if 6 divides 12 and 2 divides 6, then 2 divides 12.

This implies that there's a particular set of numbers (P) that we can select our candidate divisors p from, and this will be sufficient to establish primality of n. I'll let you consider what that set P is - the answer is surprisingly obvious!

### 0 Comments On This Entry

There are no Trackbacks for this entry

S M T W T F S
12
3456789
10111213141516
17181920 21 2223
24252627282930
31