AOCP/Combinatorics
From charlesreid1
Contents
Notes
Knuth AOCP Volume 3: Sorting and Searching: Combinatorics
Permutations and Inversions
Knuth begins talking about sorting by talking about combinatorics and permutations of items.
Start with definition of an inversion:
Let Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle a_1 a_2 a_3 \dots a_n} be a permutation of the integers Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle {1 \dots n}} .
If Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle i < j} and Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle a_i > a_j} , then Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle (a_i, a_j)} is an inversion.
Inversions are out-of-sorts pairs.
Cramer (1750) introduced inversions - utilized to find determinant.
Can also construct an inversion table:
Let Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle b_1 b_2 \dots b_n} denote the inversion table of Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle a_1 a_2 \dots a_n} . Then Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle b_j} is the number of elements to the left of j that are greater than j.
Example of sequence and its inversion table:
5 9 1 8 2 6 4 7 3 2 3 6 4 0 2 2 1 0
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle 0 \leq b_1 \leq n-1, 0 \leq b_2 \leq n-2, \dots, 0 \leq b_{n-1} \leq 1 }
Hall (1956) showed that inversion tables uniquely determine permutations - these make inversion tables alternative representations for different permutations.
Transformation technique: turn counting problems into inversion table problems
Now, suppose we want to count number of elements larger than their successor. (This is the number of j such that Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle b_j = n-j} ).
Note that this idea is related to nested for loops:
for(int i = 0; i < N; i++ ) {
for(int j = i; j < N; j++) {
We have Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle b_j = n - j}
Since we know the probability that b1 equals n-1 is Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle P(b_1 = n-1) = \frac{1}{n}} ,
and independently the probability that b2 equals n-2 is Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle P(b_2 = n-2) = \frac{1}{n-1}} ,
and so on, then we can say:
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle \dfrac{1}{n} + \dfrac{1}{n-1} + \dots + 1 = H_{n} }
These are the harmonic numbers.
Counting Inversions with Generating Functions
Now, to analyze a sorting algorithm, we are interested in how many permutations of n elements have exactly k inversions. Number of inversions is denoted I, so this number is denoted Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle I_n(k)}
To pose this problem slightly differently: we can think of Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle I_n(k)} as a number that is produced by some kind of generating function into which we plug our n and our k, and out pops Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle I_n(k)} .
In fact, we can do this by defining an infinite series polynomial whose kth coefficient is precisely Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle I_n(k)} .
Also note that Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle I_n = \left( \binom{n}{2} - k \right) = I_n(k)} .
We define the generating function Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle G_n(z)} for a sequence containing n elements as:
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle G_n(z) = I_n(0) + I_n(1)z + I_n(2) z^2 + \dots = \sum_{k \geq 0} I_n(k) z^k }
Now, we know that when we choose a particular item b as the next element of our sequence Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle b_1 b_2 b_3 \dots b_n} , that choice is independent of all other b's. Another thing we can observe is, there are two possible ways (two possible cases) for a sequence with n elements and k inversions:
- Either we take a sequence of length n-1 and k inversions, and add an item to it that does not change k;
- Or, we take a sequence of length n and k-1 inversions, and we change an item such that we add an inversion k.
From these, we can construct the recursive relationship:
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle I_n(k) = I_n(k-1) + I_{n-1}(k) \qquad \mbox{for } k < n }
Knuth then performs some magic, which he says is "not difficult to see," stating:
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle G_n(z) = (1 + z + \dots + z^{n-1}) G_{n-1}(z) }
and therefore he is able to simplify the generating function to:
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle (1+z+\dots+z^{n-1}) ( \dots )(1+z+z^2+z^3)(1+z+z^2)(1+z)(1) = \dfrac{ (1-z^n)( \dots )(1-z^2)(1-z) }{ (1-z)^n } }
Knuth Goes To Outer Space
It is at this point that Knuth launches into a few pages of extremely difficult to follow material. Individual statements are sensible and logical, but how he gets them and where he's going with all of it is completely unclear...
Start by defining the generating function of n, divided by n factorial, as the generating function for the probability distribution of the number of inversions in a random permutation of n elements.
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle \dfrac{G_n(z)}{n!} = g_n(z) }
Further, let us define the function
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle h_k(z) = \dfrac{(1+z+z^2+\dots+z^{k-1})}{k} }
This is the generating function for the uniform distribution of a random non-negative integer that is less than k.
Now, we can write g in terms of h:
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle g_n(z) = h_1(z) h_2(z) \dots h_n(z) }
Next, Knuth uses the following property:
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle E(g_n) = E(h_1) + E(h_2) + \dots + E(h_n) }
Failed to parse (MathML with SVG or PNG fallback (recommended for modern browsers and accessibility tools): Invalid response ("Math extension cannot connect to Restbase.") from server "https://en.wikipedia.org/api/rest_v1/":): {\displaystyle Var(g_n) = Var(h_1) + Var(h_2) + \dots + Var(h_n) }
(again, completely unclear where he gets this...)
Multisets
See AOCP/Multisets
Related
See also:
Flags
| The Art of Computer Programming notes from reading Donald Knuth's Art of Computer Programming
Part of the 2017 CS Study Plan.
Mathematical Foundations: AOCP/Infinite Series · AOCP/Binomial Coefficients · AOCP/Multinomial Coefficients AOCP/Harmonic Numbers · AOCP/Fibonacci Numbers Puzzles/Exercises:
Volume 2: Seminumerical Algorithms
Volume 3: Sorting and Searching AOCP/Combinatorics · AOCP/Multisets · Rubiks Cube/Permutations
AOCP/Combinatorial Algorithms · AOCP/Boolean Functions AOCP/Five Letter Words · Rubiks Cube/Tuples AOCP/Generating Permutations and Tuples
|
| Combinatorics
Combinatorial Structures - Order Does Not Matter Ordinary generating functions
Labelled Structures - Order Matters Enumerating Permutations: String Permutations Generating Permutations: Cool · Algorithm M (add-one) · Algorithm G (Gray binary code)
Combinatorics Problems Longest Increasing Subsequence · Maximum Value Contiguous Subsequence · Racing Gems Cards (poker hands with a deck of 52 playing cards)
|