Revision as of 10:32, 18 July 2017

This page describes a segmented prime number sieve (i.e., a sieve that can look for prime numbers over an arbitrary interval $$ [M,N] $$ ). This is related to Project Euler/501. Also see Sieve of Eratosthenes.

Theory

Segmented Prime Sieve

The idea behind a segmented prime number sieve is to search for prime numbers over an arbitrary range $$ (M,N) $$ . To properly implement the segmented sieve, we restrict $M > \sqrt{N}$ , although we generate primes less than $\sqrt{N}$ in the process so these can be saved out if this is not the case.

The algorithm is straightforward, but for one component. What we do is, we split the interval $$ [M,N] $$ into sub-segments of size S, so that we search for prime numbers on the intervals $[M, M+S-1], [M+S, M+2S-1], \dots, [N-S, N]$ . We then generate a list of prime numbers between $$ 2 $$ and $\sqrt{N}$ using a regular Sieve of Eratosthenes algorithm. Once we have that list, we can iterate through each sub-segment, and for each sub-segment we eliminate any numbers that happens to be a multiple of that prime, using a bit vector technique similar to the Sieve of Eratosthenes. For example, if we are running a segmented sieve for the interval $$ [1000, 1100] $$ , we have:

$\sqrt{1100} \approx 33.166$

So we run a Sieve of Eratosthenes to find all primes up to the nearest floored integer, 33. Now, for this list of numbers, we want to eliminate any numbers between $$ [1000,1100] $$ that are multiples of 2, then any that are multiples of 3, then any that are multiples of 5, and so on.

Let's keep things simple and consider a single sub-segment of size 100 that covers the entire segment. Then after crossing out multiples of 2, 3, and 5, we have:

After running through this process with each prime in the range $[2,\sqrt{N}]$ (optionally stopping when you reach $\sqrt{M+S}$ or the upper boundary of the segment, if the segment is particularly large), the primes can be collected.

The Subtlety: Starting Points

The algorithm described above is great, but for one thing: how can we determine, quickly, where to begin crossing out factors?

In other words, let's suppose we're on the step where we want to cross out factors of 7 in our sub-segment. We want to start at some multiple of 7 in the bit vector; then we walk 7 elements and cross out the number at that position; and we repeat that process until we get to the endof the sub-segment.

But finding the start position takes care. If we start at the first number 1,000, we eliminate $1,000; 1,007; 1,014; \dots$ , which are not multiples of 7. In fact, we need to start at 1,001, which is the first multiple of 7 in our sub-segment: $7 \times 143 = 1,001$ .

To generalize this, we can use a formula which forms the core of the segmented sieve algorithm:

	protected static int qInit(long left, int p) {
		// first term here is because modOfThis must be positive.
		long modOfThis = (left/2)*p - (left+1+p)/2;
		int result = (int)(modOfThis%((long)p));
		return result;
	}

	protected static int qReset(int p, int q, int delta) {
		int result = (q-delta)%p;
		int j = 1;
		while(result < 0) { 
			result += j*p;
		}
		return result;
	}

Code

PrimeGenerator Class

The following class implements the segmented Sieve of Eratosthenes, together with the segmented Sieve of Eratosthenes, to create a PrimeGenerator object that continually generates new prime numbers:

import java.math.BigDecimal;
import java.util.List;
import java.util.Queue;
import java.util.LinkedList;
import java.util.ArrayList;

/** Prime number generator class 
 * implemented using Longs.
 *
 * Uses a segmented sieve of Eratosthenes
 * algorithm.
 */
public class PrimeGeneratorLong { 

	/**
	 * Prime Generator class implementation.
	 */

	private Queue<Long> q; // The queue stores the prime numbers this generator has found
	private long max;
	private long lo, hi;
	private int rootmax;

	private int cycle;

	public PrimeGeneratorLong(long max) {
		this.max = max;
		this.rootmax = (int)(Math.sqrt(this.max));

		// Making rootmax even makes life easier.
		if(rootmax%2!=0) {
			rootmax++;
		}

		this.q = new LinkedList<Long>();

		this.lo = 1;
		this.hi = 1;

		// Look ahead. This enables hasNext() check.
		repopulateQueue();
	}

	public void reset() {
		this.q = new LinkedList<Long>();
		this.lo = 1;
		this.hi = 1;
		repopulateQueue();
	}

	public Long next() { 
		Long result = q.remove();

		// Look ahead. This enables hasNext() check.
		if(q.isEmpty()) {
			repopulateQueue();
		}

		return result;
	}



	/** Repopulate queue with prime number starting at lo and going to hi. */
	public void repopulateQueue() { 

		int ACCORDION = (int)(Math.round(Math.sqrt(rootmax)));

		// Value of ACCORDION*sqrt(MAX) is the stride 
		// of our segmented Sieve of Eratosthenes algorithm.
		if(lo==1) {
			this.cycle = 1;
			// First time through,
			// use normal Sieve of Eratosthenes.

			// Add everything up to sqrt(MAX).
			// This ensures we don't call segmented Sieve of Eratosthenes 
			// when low < sqrt(hi), because we already begin 
			// with low above sqrt(MAX).
			q.addAll( primeSieve(rootmax) );

			// set new lo and hi.
			// lo must be even. 
			// only need to do this once.
			lo = rootmax;
			hi = rootmax + ACCORDION*rootmax;

		} else {
			this.cycle++;

			// Every other time through,
			// use segmented Sieve of Eratosthenes.
			q.addAll( primesRange(lo, hi) );
			if(q.isEmpty()) {
				// This is only a problem with small integers.
				// But it's still a problem.
				hi *= 2;
				q.addAll( primesRange(lo, hi) );
			}

			lo = hi;
			hi = hi + ACCORDION*rootmax;
		}
	}




	/** Find all prime numbers between left and right as longs. 
	 * */
	public static List<Long> primesRange(long left, long right) { 
		if(left%2!=0 || right%2!=0) { 
			throw new IllegalArgumentException("Error calling primesRange(): left and right must be even.");
		}

		// Return this
		ArrayList<Long> result = new ArrayList<Long>();

		// Ensure that we get precisely the correct range of primes.
		int delta;
		delta = (int)(right-left);
		delta /= 2;
		delta += 1;

		boolean[] sieve = new boolean[delta-1];

		// could make these longs... if sqrt(right) < 10^18, which, obviously.
		// could make these ints... if sqrt(right) < 10^9, which... also has to be.
		// these compromises are needed to keep the class working.
		// 
		// ...let future self deal with those and hack away at this
		// when we actually get there. For now, just get this done.
		//
		List<Integer> Ps = primeSieveInt( (int)(Math.sqrt(right)) );

		// Skip 2
		Ps.remove(Ps.indexOf(2));

		// Create Qs starting index vector
		List<Integer> Qs = new ArrayList<Integer>();
		for(int i=0; i<Ps.size(); i++ ) {
			Qs.add( qInit(left, Ps.get(i)) );
		}

		// Loop over each cluster
		while(left<right){

			// Assume everybody prime
			for(int k=0; k<(delta-1); k++) { 
				sieve[k] = true;
			}

			// Cross out multiple of each prime
			for(int i=0; i<Ps.size(); i++) {
				int p = Ps.get(i);
				int q = Qs.get(i);

				for(int j=q; j<(delta-1); j+=p) { 
					sieve[j] = false;
				}
			}

			// Reset for next cluster
			for(int k=0; k<Ps.size(); k++ ) {
				Qs.set(k, qReset(Ps.get(k), Qs.get(k), delta) );
			}

			// Have knocked out non-primes, so gather the primes
			long t = left + 1;
			for(int i=0; i<delta-1; i++) { 
				// t ends at right
				if(sieve[i]) { 
					result.add( new Long(t) );
				}
				t += 2;
			}
			left += 2*delta; 
		}
		return result;
	}

	protected static int qInit(long left, int p) {
		// modOfThis must be positive.
		//long modOfThis = (((left-1)*(p-1))/2)-1;
		long modOfThis = (left/2)*p - (left+1+p)/2;
		int result = (int)(modOfThis%((long)p));
		return result;
	}


	protected static int qReset(int p, int q, int delta) {
		int result = (q-delta)%p;
		int j = 1;
		while(result < 0) { 
			result += j*p;
		}
		return result;
	}




	/** Find all prime numbers below n.
	 *
	 * Return them as Longs.
	 */
	public static List<Long> primeSieve(int n) { 

        // initially assume all integers are prime 
        boolean[] isPrime = new boolean[n+1];
        for (int i = 2; i <= n; i++) {
            isPrime[i] = true;
        }

        // mark non-primes <= n using Sieve of Eratosthenes
        for (int factor = 2; factor*factor <= n; factor++) {

            // if factor is prime, then mark multiples of factor as nonprime
            // suffices to consider mutiples factor, factor+1, ...,  n/factor
            if (isPrime[factor]) {
                for (int j = factor; factor*j <= n; j++) {
                    isPrime[factor*j] = false;
                }
            }
        }

		// Add primes to list
		ArrayList<Long> primes = new ArrayList<>();
		int nprimes = 0;
		for(int k=2; k<=n; k++) {
			if(isPrime[k]) { 
				primes.add( new Long(k) );
				nprimes++;
			}
		}

		return primes;
	}



	/** Find all prime numbers below n.
	 *
	 * Return then as Integers.
	 */
	public static List<Integer> primeSieveInt(int n) { 

        // initially assume all integers are prime 
        boolean[] isPrime = new boolean[n+1];
        for (int i = 2; i <= n; i++) {
            isPrime[i] = true;
        }

        // mark non-primes <= n using Sieve of Eratosthenes
        for (int factor = 2; factor*factor <= n; factor++) {

            // if factor is prime, then mark multiples of factor as nonprime
            // suffices to consider mutiples factor, factor+1, ...,  n/factor
            if (isPrime[factor]) {
                for (int j = factor; factor*j <= n; j++) {
                    isPrime[factor*j] = false;
                }
            }
        }

		// Add primes to list
		ArrayList<Integer> primes = new ArrayList<>();
		int nprimes = 0;
		for(int k=2; k<=n; k++) {
			if(isPrime[k]) { 
				primes.add( new Integer(k) );
				nprimes++;
			}
		}

		return primes;
	}




	/** The prime generator is agnostic to the maximum;
	 * we use hasNext to determine when to stop. 
	 * Change this to true for an infinite prime generator.
	 * */
	public boolean hasNext() { 
		// This is possible because we always ensure there is something in the queue
		// before we finish operations on the queue.
		if(q.peek().compareTo(max)>0){ 
			return false;
		} else {
			return true;
		}
	}

}

Flags

Segmented Sieve: Difference between revisions

From charlesreid1