From charlesreid1

Line 251: Line 251:
===Form 2===
===Form 2===


Using the fact that <math>\sqrt[3]{10^{12}} = 10^4</math>, we know that the cubed prime can only be a prime number below <math>10,000</math> and we know that the non-cubed prime can only be a prime number below <math>\frac{10^{12}}{10^4} = 10^6</math>. These are relatively small search spaces and we can generate these lists of primes very fast.
Numbers with eight divisors of Form 2 have the form:
 
<math>
p_A^3 p_B
</math>
 
We can put an upper bound on the cubed prime by noting that it MUST be less than the cube root of a trillion:
 
<math>
\sqrt[3]{10^{12}} = 10^4
</math>
 
Therefore, we know that the candidates for the cubed prime number must be below <math>10,000</math>. In fact, because the second prime is at least 2, we actually know the candidates for the cubed prime number must be below <math>5,000</math>. This greatly restricts our search space. But this tells us nothing about the second prime, so the search space for the second prime is going to be enormous.
 
Here is an example of a best case scenario, where the cubed prime is in the vicinity of 5,000 and the non-cubed prime is a small number:
 
<math>
1120281611913 = 4993^3 \times 9
</math>


===Form 1===
===Form 1===

Revision as of 19:39, 24 March 2019

Breakdown

This was an extremely interesting problem. When I first read through the problem, the solution came to me almost immediately - if we want to know how many divisors a number has, it relates to its prime factorization, and if we want a number with eight divisors, we want a number whose prime factors can be rearranged in exactly 4 distinct ways (leading to 8 pairings, or 8 factors).

This is confirmed by examining the examples the problem gives. Several integers that are below 100 and have exactly 8 factors were given. The prime factorizations of each number followed certain patterns. These patterns are:

$ p_1 p_2 p_3 $

and

$ p_A^3 p_B $

Why these patterns? It's because these are the two kinds of groupings of objects that lead to exactly four ways of partitioning the objects. The first is if we have 3 distinct objects, which can be grouped in 4 distinct ways:

A B C |
A B | C
A | B C
A C | B

and if we have 3 identical items and one unique item, these can also be arranged in exactly four ways:

s s s T | 
s s s | T
s s | s T
s | s s T

The problem statement gives two cases where this happens, and leaves you stumped when your code gives you 179 numbers with exactly 8 divisors, instead of the information given in the problem, which is 180. I struggled over this for a while. I thought it might be a bug in my code, but typically these kinds of bugs are symmetric in some way, so I would be leaving out two or more numbers with 8 divisors.

The next thing I checked was the form of the numbers I was generating, which was either the product of 3 distinct primes:

$ p_1 p_2 p_3 $

and

$ p_A^3 p_B $

I was generating every combination of these numbers for every prime below the maximum, in this case 1000, and things seemed to be working out okay, except that I was missing one. When I ran it on the case of a 1000000 maximum, I was off from the information given in the problem by four. The challenge, of course, is that when I spot checked my list of 179 numbers, all of them had exactly 8 divisors - it wasn't that I was generating too many numbers. It was that some number not included on my list had 8 divisors, and I had to find it.

This was a cryptic problem. I tried investigating whether the primes I was using could, in fact, be composite numbers. So, if I substitute 4 for pA and 6 for pB, does that result in a number with 8 factors? The answer, of course, is no. These definitely need to be primes. What if pA and pB, or p1 and p2 and p3, are not distinct primes? This led to more numbers with fewer or more than 8 factors.

I tried pA = 2 and pB = 2, which led to 2^4 or 16, which did not have 8 factors. I tried p1 = 4 and p2 = 2 and p3 = 4, totaling 2^5 or 32, which also did not have 8 factors. Then I tried pA = 4 and pB = 2, for a total of 2^7 or 128, and to my surprise, 128 had 8 factors. Here was the missing prime! 128 was not in the list of primes taking the two forms given above. However, the procedure for finding 2^7 and the rule for it are a bit different. The rule relates to how we can use the prime numbers to construct new numbers, as we saw in Problem 500, the prior problem.

(To give it away: I was missing a prime of the form $ p_r^7 $. More details below.)

Generating composites from primes

Let's go back to that problem for a moment. There, we were trying to construct numbers with a specified number of divisors. To do that, we constructed a table with the prime numbers, with columns arranged in a particular way: n, n^2, n^4, n^8, and so on. Then, to construct new numbers, we found new combinations of numbers from each column. The reason we have only powers of 2 for these primes is, any number can be expressed as the sum of powers of 2 - i.e., can be represented in binary. if we had an odd power of n, say n^5, we could write in terms of powers of 2 like: (n)(n^4). Likewise for n^13, we could write it as (n)(n^4)(n^8).

That is, any integer power of n can be written as a binary number, and that binary number yields the order of selection of the primes from the table.

Generating composites with 8 divisors

Back to our problem at hand. We are looking for a third combination of prime numbers that will give us numbers with 8 divisors. If, instead of using primes in the above expressions, we instead use even powers of primes, we get a whole new class of numbers with 8 divisors, which do not show up until at least 128.

For example, a number with 8 divisors can be formed by letting the primes be distinct powers of the prime 2:

$ \begin{align} p_1 &=& 2 \\ p_2 &=& 2^2 \\ p_3 &=& 2^4 \end{align} $

That forms the candidate number:

$ c = 2^{4+2+1} = 128 $

which, indeed, has 8 divisors:

WolframAlpha 128 8factors.png

We were only off by 1 (our program predicted 179 numbers with 8 factors below 1,000, instead of the problem statement's 180) below 1,000 because the next such candidate number is:

$ c = 3^{4+2+1} = 2187 $

which has 8 divisors:

WolframAlpha 2187 8factors.png

and here are the values of the seventh power of the first few primes:

2		128
3		2187
5		78125
7		823543
11		19487171
13		62748517
17		410338673
19		893871739

which all have 8 divisors:

WolframAlpha 893871739 8factors.png

All of which is to say, in the process of generating numbers with 8 divisors, we need to include numbers of the form $ p_r^7 $.

Procedure

Now, the procedure is as follows:

  • Generate various combinations of three primes $ p_1, p_2, p_3 $ whose product is less than the maximum; those numbers are numbers with 8 divisors of the form $ p_1 p_2 p_3 $. The range for these numbers should be all prime numbers from 2 to max/6.
  • Generate combinations of two primes $ p_A p_B $, and generate the numbers $ p_A^3 p_B $ and $ p_A p_B^3 $. Keep the products that are less than the maximum; those numbers each have 8 divisors. The range for these numbers should be all prime numbers from 0 to the square root of max.
  • Generate primes $ p_r $ and corresponding numbers $ p_r^7 $. Keep results that are less than the maximum; those numbers each have 8 divisors. The range for these numbers should be from 0 to the seventh root of max (square root of square root, if you're lazy).

Once we do that, we're all set. Indeed, this procedure gets me correct answers for all numbers up to 1000 that have 8 divisors, and also gives me the correct answer for all numbers up to 1,000,000 that have 8 factors. It's also blazing fast, finding the correct number of composites up to 1 million with 8 divisors and returning it in less than half a second:

$ javac Problem501.java && time java Problem501
Finished with first part. On to second part...
Finished with second part. On to third part...

Number of integers with exactly 8 divisors:
224427

real	0m0.286s
user	0m0.512s
sys	0m0.052s

But when we raise the ceiling to 1 trillion, it chokes.

Not So Fast, Eratosthenes

So, what's the problem here?

The problem is that we are trying to count to a trillion - possibly several times.

Why are we trying to count to a trillion? We're trying to assemble numbers with 8 divisors, and one possible form of these numbers is:

$ p_1 p_2 p_3 $

where the p's are arbitrary prime numbers.

The best case scenario is that these three primes are all around the same size, the cube root of our max (the cube root of a trillion is only 1,000).

But worst case scenario is that we have two very small primes like 2 or 3, and the last prime is an integer near 1000000000000/6. That means that to FIND those primes, we're going to be running a number sieve and actually counting to almost a trillion.

To put a trillion into perspective, the typical computer can do a billion operations per second. Doing anything interesting takes around a hundred instructions, so that leaves us with around ten million operations per second. A simple back of the envelope tells you that you're going to be waiting

$ \dfrac{10^{12} \mbox{ ops}}{ 10^7 \mbox{ ops/s}} = 10^5 \mbox{ s} \sim 1 \mbox{ day} 4 \mbox{ hours} \dots $

Keep in mind, this is optimistic. A factor of 10 could ruin your week.

This means our approach to the problem with a maximum of a million is much simpler than our approach with a maximum of a trillion, because suddenly we're doing everything we were doing before, a million times. Even the smallest inefficiencies will snowball.

Solving this problem requires a couple of tools, beginning with a prime number sieve that can start at an arbitrary location, instead of starting at 2. The term for this type of prime number sieve is a segmented Sieve of Eratosthenes; it is a prime number sieve taking lower and upper bound arguments.

Because this is a combinatorics search problem, we also need to spend additional time analyzing the problem space and finding some elegant shortcuts to save ourselves time.

Segmented Sieve

See Segmented Sieve

A Long Hiatus

(We let this problem sit and stew for a few years without coming back to it)

Redux

On revisiting Project Euler Problem 501, here's the crux of the problem as we left it:

  • We want to count the number of integers less than a trillion that have exactly 8 divisors.
  • We know that such numbers can take several forms, involving prime factors raised to various powers. Specifically:

Form 1:

$ p_A p_B p_C $

Form 2:

$ p_1^3 p_2 $

Form 3:

$ p_r^7 $

  • A naive approach will work with a maximum around a million, but gets swamped by calculations as we move max into the billions range.
  • The real problem has a maximum of a trillion, so we have to re-think our approach.

We summarize the solution procedure for each of the three forms below.

Form 3

Numbers with eight divisors of Form 3 are of the form:

$ p_r^7 $

Form 3 is definitely the easiest case.

We start by using the fact that $ \sqrt[7]{10^{12}} \sim 51 $, which tells us that any prime number below 52 will generate a number of Form 3 that is less than a trillion.

Since each prime below 52 will yield one Form 3 number less than a trillion, we just need to count the number of prime numbers less than 52, which happens to be:

$ \pi(52) = 15 $

where $ \pi $ is the prime-counting function (the wikipedia article gives an excellent introduction, wolfram mathworld provides additional references).

We can do this in Python using Sympy:

$ python
Python 3.6.3 |Anaconda, Inc.| (default, Oct  6 2017, 12:04:38)
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sympy
>>> sympy.ntheory.generate.primepi(52)
15

Therefore there are exactly 15 numbers with eight divisors that have Form 3.

We're nearly done! Just kidding.

Form 2

Numbers with eight divisors of Form 2 have the form:

$ p_A^3 p_B $

We can put an upper bound on the cubed prime by noting that it MUST be less than the cube root of a trillion:

$ \sqrt[3]{10^{12}} = 10^4 $

Therefore, we know that the candidates for the cubed prime number must be below $ 10,000 $. In fact, because the second prime is at least 2, we actually know the candidates for the cubed prime number must be below $ 5,000 $. This greatly restricts our search space. But this tells us nothing about the second prime, so the search space for the second prime is going to be enormous.

Here is an example of a best case scenario, where the cubed prime is in the vicinity of 5,000 and the non-cubed prime is a small number:

$ 1120281611913 = 4993^3 \times 9 $

Form 1

Form 1 presents the greatest difficulties. To find numbers with eight divisors of Form 1, we have a multivariate problem space (3 variables) to search, where the values only take on specific values (the primes), whose values cannot be predicted analytically, and where the product of those three variables must not exceed some maximum.

These prime number factors can be as large as (in the first case) a trillion divided by 6, which really doesn't cut the problem size down by much. Here's an example of an integer that has exactly eight factors:

$ 999,999,999,762 = 2 \times 3 \times 166,666,666,627 $

Whose eight divisors are:

$ 1 \quad 2 \quad 3 \quad 6 \quad 166666666627 \quad 333333333254 \quad 499999999881 \quad 999999999762 $



Quick Sympy Test

$ python
Python 3.6.3 |Anaconda, Inc.| (default, Oct  6 2017, 12:04:38)
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sympy
>>> sympy.ntheory.generate.primepi(1e6)
78498
>>> sympy.ntheory.generate.primepi(1e7)
664579
>>> sympy.ntheory.generate.primepi(1e8)
5761455
>>> sympy.ntheory.generate.primepi(1e9)
50847534
>>> sympy.ntheory.generate.primepi(1e10)
455052511

Links

The wikipedia article on the prime counting function is very high quality and provides a good overview of the prime counting function: https://en.wikipedia.org/wiki/Prime-counting_function

This math stack exchange answer gives some nice references, including papers and implementations: https://math.stackexchange.com/a/893767

Prime counting codes:

Flags