From charlesreid1

Revision as of 18:33, 24 March 2019 by Admin (talk | contribs) (→‎Redux)

Breakdown

This was an extremely interesting problem. When I first read through the problem, the solution came to me almost immediately - if we want to know how many divisors a number has, it relates to its prime factorization, and if we want a number with eight divisors, we want a number whose prime factors can be rearranged in exactly 4 distinct ways (leading to 8 pairings, or 8 factors).

This is confirmed by examining the examples the problem gives. Several integers that are below 100 and have exactly 8 factors were given. The prime factorizations of each number followed certain patterns. These patterns are:

$ p_1 p_2 p_3 $

and

$ p_A^3 p_B $

Why these patterns? It's because these are the two kinds of groupings of objects that lead to exactly four ways of partitioning the objects. The first is if we have 3 distinct objects, which can be grouped in 4 distinct ways:

A B C |
A B | C
A | B C
A C | B

and if we have 3 identical items and one unique item, these can also be arranged in exactly four ways:

s s s T | 
s s s | T
s s | s T
s | s s T

The problem statement gives two cases where this happens, and leaves you stumped when your code gives you 179 numbers with exactly 8 divisors, instead of the information given in the problem, which is 180. I struggled over this for a while. I thought it might be a bug in my code, but typically these kinds of bugs are symmetric in some way, so I would be leaving out two or more numbers with 8 divisors.

The next thing I checked was the form of the numbers I was generating, which was either the product of 3 distinct primes:

$ p_1 p_2 p_3 $

and

$ p_A^3 p_B $

I was generating every combination of these numbers for every prime below the maximum, in this case 1000, and things seemed to be working out okay, except that I was missing one. When I ran it on the case of a 1000000 maximum, I was off from the information given in the problem by four. The challenge, of course, is that when I spot checked my list of 179 numbers, all of them had exactly 8 divisors - it wasn't that I was generating too many numbers. It was that some number not included on my list had 8 divisors, and I had to find it.

This was a cryptic problem. I tried investigating whether the primes I was using could, in fact, be composite numbers. So, if I substitute 4 for pA and 6 for pB, does that result in a number with 8 factors? The answer, of course, is no. These definitely need to be primes. What if pA and pB, or p1 and p2 and p3, are not distinct primes? This led to more numbers with fewer or more than 8 factors.

I tried pA = 2 and pB = 2, which led to 2^4 or 16, which did not have 8 factors. I tried p1 = 4 and p2 = 2 and p3 = 4, totaling 2^5 or 32, which also did not have 8 factors. Then I tried pA = 4 and pB = 2, for a total of 2^7 or 128, and to my surprise, 128 had 8 factors. Here was the missing prime! 128 was not in the list of primes taking the two forms given above. However, the procedure for finding 2^7 and the rule for it are a bit different. The rule relates to how we can use the prime numbers to construct new numbers, as we saw in Problem 500, the prior problem.

(To give it away: I was missing a prime of the form $ p_i^7 $. More details below.)

Generating composites from primes

Let's go back to that problem for a moment. There, we were trying to construct numbers with a specified number of divisors. To do that, we constructed a table with the prime numbers, with columns arranged in a particular way: n, n^2, n^4, n^8, and so on. Then, to construct new numbers, we found new combinations of numbers from each column. The reason we have only powers of 2 for these primes is, any number can be expressed as the sum of powers of 2 - i.e., can be represented in binary. if we had an odd power of n, say n^5, we could write in terms of powers of 2 like: (n)(n^4). Likewise for n^13, we could write it as (n)(n^4)(n^8).

That is, any integer power of n can be written as a binary number, and that binary number yields the order of selection of the primes from the table.

Generating composites with 8 divisors

Back to our problem at hand. We are looking for a third combination of prime numbers that will give us numbers with 8 divisors. If, instead of using primes in the above expressions, we instead use even powers of primes, we get a whole new class of numbers with 8 divisors, which do not show up until at least 128.

For example, a number with 8 divisors can be formed by letting the primes be distinct powers of the prime 2:

$ \begin{align} p_1 &=& 2 \\ p_2 &=& 2^2 \\ p_3 &=& 2^4 \end{align} $

That forms the candidate number:

$ c = 2^{4+2+1} = 128 $

which, indeed, has 8 divisors:

WolframAlpha 128 8factors.png

We were only off by 1 (our program predicted 179 numbers with 8 factors below 1,000, instead of the problem statement's 180) below 1,000 because the next such candidate number is:

$ c = 3^{4+2+1} = 2187 $

which has 8 divisors:

WolframAlpha 2187 8factors.png

and here are the values of the seventh power of the first few primes:

2		128
3		2187
5		78125
7		823543
11		19487171
13		62748517
17		410338673
19		893871739

which all have 8 divisors:

WolframAlpha 893871739 8factors.png

All of which is to say, in the process of generating numbers with 8 divisors, we need to include numbers of the form $ p_r^7 $.

Procedure

Now, the procedure is as follows:

  • Generate various combinations of three primes $ p_1, p_2, p_3 $ whose product is less than the maximum; those numbers are numbers with 8 divisors of the form $ p_1 p_2 p_3 $. The range for these numbers should be all prime numbers from 2 to max/6.
  • Generate combinations of two primes $ p_A p_B $, and generate the numbers $ p_A^3 p_B $ and $ p_A p_B^3 $. Keep the products that are less than the maximum; those numbers each have 8 divisors. The range for these numbers should be all prime numbers from 0 to the square root of max.
  • Generate primes $ p_r $ and corresponding numbers $ p_r^7 $. Keep results that are less than the maximum; those numbers each have 8 divisors. The range for these numbers should be from 0 to the seventh root of max (square root of square root, if you're lazy).

Once we do that, we're all set. Indeed, this procedure gets me correct answers for all numbers up to 1000 that have 8 divisors, and also gives me the correct answer for all numbers up to 1,000,000 that have 8 factors. It's also blazing fast, finding the correct number of composites up to 1 million with 8 divisors and returning it in less than half a second:

$ javac Problem501.java && time java Problem501
Finished with first part. On to second part...
Finished with second part. On to third part...

Number of integers with exactly 8 divisors:
224427

real	0m0.286s
user	0m0.512s
sys	0m0.052s

But when we raise the ceiling to 1 trillion, it chokes.

Not So Fast, Eratosthenes

So, what's the problem here?

The problem is that we are trying to count to a trillion - possibly several times.

Why are we trying to count to a trillion? We're trying to assemble numbers with 8 divisors, and one possible form of these numbers is:

$ p_1 p_2 p_3 $

where the p's are arbitrary prime numbers. So, the worst case scenario is that we have two very small primes like 2 or 3, and the last prime is somewhere around a trillion divided by six.

To put a trillion into perspective, the typical computer can do a billion operations per second. Doing anything interesting takes around a hundred instructions, so that leaves us with a million operations per second. A simple back of the envelope tells you that you're going to be waiting

$ \dfrac{10^{12} \mbox{ ops}}{ 10^6 \mbox{ ops/s}} = 10^6 \mbox{ s} \sim 10 \mbox{ days} $

And that's optimistic. A factor of three could ruin your month.

So, our approach with a ceiling of 1,000 or 1,000,000 much simpler than our approach with a ceiling of 1,000,000,000,000, because suddenly we're doing whatever we were doing before, but a million times. Even the smallest inefficiencies will snowball.

Solving this problem requires a couple of tools, beginning with a prime number sieve that can start at an arbitrary location, instead of starting at 2. The term for this type of prime number sieve is a segmented Sieve of Eratosthenes; it is a prime number sieve taking lower and upper bound arguments.

Because this is a combinatorics search problem, we also need to spend additional time analyzing the problem space and finding some elegant shortcuts to save ourselves time.

Segmented Sieve

See Segmented Sieve

A Long Hiatus

(We let this problem sit and stew for a few years without coming back to it)

Redux

On revisiting Project Euler Problem 501, here's the crux of the problem as we left it:

  • We want to count the number of integers less than a trillion that have exactly 8 divisors.
  • We know that such numbers can take several forms, involving prime factors raised to various powers. Specifically:


Because these prime factors can be any prime numbers, the worst case scenario is that we have an integer very close to a trillion where the first two prime factors are 2 and 3, meaning we have to account for primes very close to a trillion.

And that means (using the approach above) we would have to GENERATE all primes below a trillion (or, a trillion divided by six, if you want to split hairs). That's totally out.

But here's the thing: we don't have to GENERATE the prime numbers, we just have to count them. So if we abandon the Sieve of Eratosthenes, which actually generates the prime numbers, we might be able to find a function that gives us less information (a count of prime numbers less than a trillion, instead of the prime numbers themselves) with the tradeoff that it goes much faster.

Is It Feasible?

Take a look at the MathWorld site on prime counting functions: http://mathworld.wolfram.com/PrimeCountingFunction.html

This states that Mathematica can count primes up to approximately

$ 8 \times 10^{13} $

Looking at the table given on that page, we can actually just read off the number of primes less than a trillion, which is

$ 37,607,912,018 \sim 3.7 \times 10^{10} $

So we're in safe territory - we won't have to outperform Mathematica to get the answer to this question.

Why you can't use a table value

The easiest solution would be to just read the value of $ \pi(x) $ for x = 1 trillion, then boom, voila, we just have to figure out how many ways to configure those numbers there are.

But that's not possible, because we need to evaluate the number of primes less than (some arbitrary numbers).

For example, the number of primes less than the seventh root of n.

Quick Sympy Test

$ python
Python 3.6.3 |Anaconda, Inc.| (default, Oct  6 2017, 12:04:38)
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sympy
>>> sympy.ntheory.generate.primepi(1e6)
78498
>>> sympy.ntheory.generate.primepi(1e7)
664579
>>> sympy.ntheory.generate.primepi(1e8)
5761455
>>> sympy.ntheory.generate.primepi(1e9)
50847534
>>> sympy.ntheory.generate.primepi(1e10)
455052511

These take mere seconds to evaluate. This one takes a little longer, but less than 1 minute, showing that this approach is absolutely feasible:

>>> sympy.ntheory.generate.primepi(1e11)
4118054813

Links

The wikipedia article on the prime counting function is very high quality and provides a good overview of the prime counting function: https://en.wikipedia.org/wiki/Prime-counting_function

This math stack exchange answer gives some nice references, including papers and implementations: https://math.stackexchange.com/a/893767

Prime counting codes:

Flags