From charlesreid1

 
(11 intermediate revisions by the same user not shown)
Line 4: Line 4:


Write a function to find all permutations of the letters in a particular string (example: HELLO).
Write a function to find all permutations of the letters in a particular string (example: HELLO).
==Enumerating==


===Enumerating - Easy Case===
===Enumerating - Easy Case===
Line 64: Line 66:


<math>
<math>
\binom{n}{a, b, c, d} = \binom{n}{a} \binom{n-a}{b} \binom{n-a-b}{c} \binom{n-a-b-c}{d}
\binom{n}{a, b, c, d} = \binom{n}{a} \binom{n-a}{b} \binom{n-a-b}{c} \binom{n-a-b-c}{d} = \dfrac{n!}{a! b! c! d!}
</math>
</math>


Line 84: Line 86:
\binom{n}{r_1, r_2, \dots, r_k} = \dfrac{n!}{r_1! r_2! \dots r_k!}
\binom{n}{r_1, r_2, \dots, r_k} = \dfrac{n!}{r_1! r_2! \dots r_k!}
</math>
</math>


===Enumerating - Infinite Letters Case===
===Enumerating - Infinite Letters Case===
Line 90: Line 91:
Suppose we have an infinite pool of characters, and we wish to know the number of words of a given length that can be formed using these infinite pools of letters.
Suppose we have an infinite pool of characters, and we wish to know the number of words of a given length that can be formed using these infinite pools of letters.


We can still treat this as a multiset problem, but this time the number of characters is infinite. The multiset is denoted:
We can still treat this as a multiset problem, but this time the number of characters is infinite. This multiset is denoted:


<math>
<math>
\{ \infty \cdot A, \infty \cdot B, \infty \cdot C, \infty \cdot D \}
\{ \infty \cdot A, \infty \cdot B, \infty \cdot C, \infty \cdot D \}
</math>
</math>
In this case, the number of permutations of length n that can be formed from the 4 different characters is given by the simple expression,
<math>
n^4
</math>
More generally, if we have an infinite multiset with k distinct elements,
<math>
\{ \infty \cdot r_1, \infty \cdot r_2, \dots, \infty \cdot r_k \}
</math>
then the number of words of length n that can be formed is given by:
<math>n^k</math>
===Enumerating - Stars and Bars===
{{Main|Stars and Bars}}
The stars and bars theorem/approach is another way to look at assembling/enumerating permutations.
If we have n objects being placed into k partitions, we can think of this as placing k-1 bars among the n objects.
In this case, the total number of slots for objects + bars is n + k - 1, and we are placing k - 1 bars, so we have the total number of outcomes given by
<math>
\binom{n+k-1}{k-1}
</math>
==Generating==
The task of actually generating all of the permutations of words that contain a certain set of characters is an extremely important one. (Knuth covers this exclusively in Volume 4 Facsimile 2 of his [[AOCP|Art of Computer Programming]] volumes.)
We can generate permutations in several ways:
* lexicographic order, or sorted order
* de Bruijn cycles, where we remove an item from the front and add an item to the rear
* binary reflected Gray code, where we change only a single item at a time
* Cool-lexicographic order (see http://webhome.csc.uvic.ca/~haron/CoolMulti.pdf)
===Cool-Lexicographic Order===
{{Main|Cool}}
This is a nice simple algorithm that implements an iterative, non-recursive, loopless permutation generator.
Algorithm:
* Visits every permutation of integer multiset E
* Call to init(E) creates singly linked list that stores elements of E in non-increasing order
* head, min, and inc point to first, second to last, and last nodes, respectively
* All variables are pointers
* phi is null
* If E = {1,1,2,4}, then first three visit(E) calls will produce the configurations:
** 4 2 1 1
** 1 4 2 1
** 4 1 2 1
Notes on variables:
* head points to first node of current permutation
* i points to ith node of current permutation
* afteri points to the (i+1)st node of current permutation
to apply the prefix shift of length k, the two pointers k and beforek are used:
* k points to the kth node of the current permutation
* beforek points to the (k-1)st node of the current permutation
Note that the visit(head) call denotes a new permutation that has been generated
* visit(head) happens when new permutation pointed to by head
* Represents passing control back to consumer requesting permutations
* Work in init(E) would also be done by consumer
When does the loop stop?
* Condition on loop ensures algorithm continues until it generates tail(E)
* Rather than generating tail(E) as separate case after oop ends, it initializes linked list to tail(E)
* This is the very first permutation visited
Here is the algorithm:
<pre>
[head, i afteri] <- init(E)
visit(head)
while (afteri.next != null or afteri.value < head.value) do:
    if (afteri.next != null and i.value >= afteri.next.value):
        beforek <- afteri
    else
        beforek <- i
    end
    k <- beforek.next
    beforek.next <- k.next
    k.next <- head
    if (k.value < head.value):
        i <- k
    end
    afteri <- i.next
    head <- k
    visit(head)
end
</pre>
=Flags=
{{CombinatoricsFlag}}
{{AlgorithmsFlag}}
[[Category:Permutations]]
[[Category:Strings]]
[[Category:Combinatorics]]

Latest revision as of 08:56, 9 March 2019

Skiena Chapter 7

Question 14

Write a function to find all permutations of the letters in a particular string (example: HELLO).

Enumerating

Enumerating - Easy Case

Let's start by counting the total number of permutations of letters in a particular string.

Consider the simplest case, no repeated letters: ABCD

Each position can hold any of the four letters, but we can only choose a letter once. So the total number of permutations of a k-letter string with no repeated letters is $ k! $

Mathematically, we are picking 4 distinct objects, and we are picking all 4 objects at the same time. In other words, 4 pick 4.

This leads to a total number of possible strings:

$ P(4,4) = \dfrac{4!}{(4-4)!} = \dfrac{4!}{1} = 4! $

We are using pick, not choose, because order matters. Including a 4! on the bottom would indicate that order does not matter, and would lead to one possible outcome (obvious - any permutation with the letters ABCD will always be the same set of letters, but in different order, so there is only one outcome).

Enumerating - Multiset Case

Now consider the less trivial case where we have repeated letters: AAABCDD

To count the number of permutations of a multiset, we should use multinomial coefficients. (Note: see AOCP/Multisets and AOCP/Multinomial Coefficients).

Mathematically, this problem is what is known as a multiset problem, where we have a set of letters each occurring with some frequency:

$ \{ a \cdot A, b \cdot B, c \cdot C, d \cdot D \} $

In this case we can write the number of possible permutations by placing each character, one at a time.

Start by placing the A characters. The number of slots to place the A characters is (a) choose (number of slots):

$ \binom{a+b+c+d}{a} $

Once the As have been placed, we can enumerate the B characters The number of slots to place the B characters is (b) choose (number of slots, less A characters):

$ \binom{b+c+d}{b} $

Next, the Cs can be placed, where the number of slots to place the C characters is (c) choose (number of slots, less A and B characters):

$ \binom{c+d}{c} $

and finally, all of the other characters are placed, leaving the Ds with a single remaining configuration:

$ \binom{d}{d} $

Now, the total number of permutations is the product of each of these. The number of configurations of a multiset is given by the multinomial coefficient

$ \binom{n}{a, b, c, d} = \binom{n}{a} \binom{n-a}{b} \binom{n-a-b}{c} \binom{n-a-b-c}{d} = \dfrac{n!}{a! b! c! d!} $

More generally, for a multiset with k objects (denoted a) occurring a certain number of times (denoted r),

$ \{ r_1 \cdot a_1, r_2 \cdot a_2, \dots, r_k \cdot a_k \} $

then if there are n total slots $ r_1 + r_2 + \dots + r_k = n $, the total number of permutations is given by the multinomial coefficient,

$ \binom{n}{r_1, r_2, \dots, r_k} = \binom{n}{r_1} \binom{n-r_1}{r_2} \binom{n - r_1 - r_2}{r_3} \dots $

This can be expressed more concisely in terms of factorials:

$ \binom{n}{r_1, r_2, \dots, r_k} = \dfrac{n!}{r_1! r_2! \dots r_k!} $

Enumerating - Infinite Letters Case

Suppose we have an infinite pool of characters, and we wish to know the number of words of a given length that can be formed using these infinite pools of letters.

We can still treat this as a multiset problem, but this time the number of characters is infinite. This multiset is denoted:

$ \{ \infty \cdot A, \infty \cdot B, \infty \cdot C, \infty \cdot D \} $

In this case, the number of permutations of length n that can be formed from the 4 different characters is given by the simple expression,

$ n^4 $

More generally, if we have an infinite multiset with k distinct elements,

$ \{ \infty \cdot r_1, \infty \cdot r_2, \dots, \infty \cdot r_k \} $

then the number of words of length n that can be formed is given by:

$ n^k $

Enumerating - Stars and Bars

The stars and bars theorem/approach is another way to look at assembling/enumerating permutations.

If we have n objects being placed into k partitions, we can think of this as placing k-1 bars among the n objects.

In this case, the total number of slots for objects + bars is n + k - 1, and we are placing k - 1 bars, so we have the total number of outcomes given by

$ \binom{n+k-1}{k-1} $

Generating

The task of actually generating all of the permutations of words that contain a certain set of characters is an extremely important one. (Knuth covers this exclusively in Volume 4 Facsimile 2 of his Art of Computer Programming volumes.)

We can generate permutations in several ways:

  • lexicographic order, or sorted order
  • de Bruijn cycles, where we remove an item from the front and add an item to the rear
  • binary reflected Gray code, where we change only a single item at a time
  • Cool-lexicographic order (see http://webhome.csc.uvic.ca/~haron/CoolMulti.pdf)

Cool-Lexicographic Order

This is a nice simple algorithm that implements an iterative, non-recursive, loopless permutation generator.

Algorithm:

  • Visits every permutation of integer multiset E
  • Call to init(E) creates singly linked list that stores elements of E in non-increasing order
  • head, min, and inc point to first, second to last, and last nodes, respectively
  • All variables are pointers
  • phi is null
  • If E = {1,1,2,4}, then first three visit(E) calls will produce the configurations:
    • 4 2 1 1
    • 1 4 2 1
    • 4 1 2 1

Notes on variables:

  • head points to first node of current permutation
  • i points to ith node of current permutation
  • afteri points to the (i+1)st node of current permutation

to apply the prefix shift of length k, the two pointers k and beforek are used:

  • k points to the kth node of the current permutation
  • beforek points to the (k-1)st node of the current permutation

Note that the visit(head) call denotes a new permutation that has been generated

  • visit(head) happens when new permutation pointed to by head
  • Represents passing control back to consumer requesting permutations
  • Work in init(E) would also be done by consumer

When does the loop stop?

  • Condition on loop ensures algorithm continues until it generates tail(E)
  • Rather than generating tail(E) as separate case after oop ends, it initializes linked list to tail(E)
  • This is the very first permutation visited

Here is the algorithm:

[head, i afteri] <- init(E)
visit(head)
while (afteri.next != null or afteri.value < head.value) do:
    if (afteri.next != null and i.value >= afteri.next.value):
        beforek <- afteri
    else
        beforek <- i
    end
    k <- beforek.next
    beforek.next <- k.next
    k.next <- head
    if (k.value < head.value):
        i <- k
    end
    afteri <- i.next
    head <- k
    visit(head)
end



Flags