From charlesreid1

(Created page with "=Skiena Chapter 7= ==Question 14== Write a function to find all permutations of the letters in a particular string (example: HELLO). ===Enumerating=== Let's start by count...")
 
 
(18 intermediate revisions by the same user not shown)
Line 5: Line 5:
Write a function to find all permutations of the letters in a particular string (example: HELLO).
Write a function to find all permutations of the letters in a particular string (example: HELLO).


===Enumerating===
==Enumerating==
 
===Enumerating - Easy Case===


Let's start by counting the total number of permutations of letters in a particular string.
Let's start by counting the total number of permutations of letters in a particular string.
Line 23: Line 25:
We are using pick, not choose, because order matters. Including a 4! on the bottom would indicate that order does not matter, and would lead to one possible outcome (obvious - any permutation with the letters ABCD will always be the same set of letters, but in different order, so there is only one outcome).
We are using pick, not choose, because order matters. Including a 4! on the bottom would indicate that order does not matter, and would lead to one possible outcome (obvious - any permutation with the letters ABCD will always be the same set of letters, but in different order, so there is only one outcome).


===Enumerating - Multiset Case===
Now consider the less trivial case where we have repeated letters: AAABCDD
To count the number of permutations of a multiset, we should use multinomial coefficients. (Note: see [[AOCP/Multisets]] and [[AOCP/Multinomial Coefficients]]).
Mathematically, this problem is what is known as a multiset problem, where we have a set of letters each occurring with some frequency:
<math>
\{ a \cdot A, b \cdot B, c \cdot C, d \cdot D \}
</math>
In this case we can write the number of possible permutations by placing each character, one at a time.
Start by placing the A characters. The number of slots to place the A characters is (a) choose (number of slots):
<math>
\binom{a+b+c+d}{a}
</math>
Once the As have been placed, we can enumerate the B characters The number of slots to place the B characters is (b) choose (number of slots, less A characters):


<math>
\binom{b+c+d}{b}
</math>


Next, the Cs can be placed, where the number of slots to place the C characters is (c) choose (number of slots, less A and B characters):


<math>
\binom{c+d}{c}
</math>
and finally, all of the other characters are placed, leaving the Ds with a single remaining configuration:
<math>
\binom{d}{d}
</math>
Now, the total number of permutations is the product of each of these. The number of configurations of a multiset is given by the '''multinomial coefficient'''
<math>
\binom{n}{a, b, c, d} = \binom{n}{a} \binom{n-a}{b} \binom{n-a-b}{c} \binom{n-a-b-c}{d} = \dfrac{n!}{a! b! c! d!}
</math>
More generally, for a multiset with k objects (denoted a) occurring a certain number of times (denoted r),
<math>
\{ r_1 \cdot a_1, r_2 \cdot a_2, \dots, r_k \cdot a_k \}
</math>
then if there are n total slots <math>r_1 + r_2 + \dots + r_k = n</math>, the total number of permutations is given by the multinomial coefficient,
<math>
\binom{n}{r_1, r_2, \dots, r_k} = \binom{n}{r_1} \binom{n-r_1}{r_2} \binom{n - r_1 - r_2}{r_3} \dots
</math>
This can be expressed more concisely in terms of factorials:
<math>
\binom{n}{r_1, r_2, \dots, r_k} = \dfrac{n!}{r_1! r_2! \dots r_k!}
</math>
===Enumerating - Infinite Letters Case===
Suppose we have an infinite pool of characters, and we wish to know the number of words of a given length that can be formed using these infinite pools of letters.
We can still treat this as a multiset problem, but this time the number of characters is infinite. This multiset is denoted:
<math>
\{ \infty \cdot A, \infty \cdot B, \infty \cdot C, \infty \cdot D \}
</math>
In this case, the number of permutations of length n that can be formed from the 4 different characters is given by the simple expression,
<math>
n^4
</math>
More generally, if we have an infinite multiset with k distinct elements,
<math>
\{ \infty \cdot r_1, \infty \cdot r_2, \dots, \infty \cdot r_k \}
</math>
then the number of words of length n that can be formed is given by:
<math>n^k</math>
===Enumerating - Stars and Bars===
{{Main|Stars and Bars}}
The stars and bars theorem/approach is another way to look at assembling/enumerating permutations.
If we have n objects being placed into k partitions, we can think of this as placing k-1 bars among the n objects.
In this case, the total number of slots for objects + bars is n + k - 1, and we are placing k - 1 bars, so we have the total number of outcomes given by
<math>
\binom{n+k-1}{k-1}
</math>


==Generating==


The task of actually generating all of the permutations of words that contain a certain set of characters is an extremely important one. (Knuth covers this exclusively in Volume 4 Facsimile 2 of his [[AOCP|Art of Computer Programming]] volumes.)


Now consider the less trivial case where there are repeated letters: HELLO
We can generate permutations in several ways:
* lexicographic order, or sorted order
* de Bruijn cycles, where we remove an item from the front and add an item to the rear
* binary reflected Gray code, where we change only a single item at a time
* Cool-lexicographic order (see http://webhome.csc.uvic.ca/~haron/CoolMulti.pdf)


Each position can hold any of the four letters, and one of the letters can be chosen twice.
===Cool-Lexicographic Order===


{{Main|Cool}}


This is a nice simple algorithm that implements an iterative, non-recursive, loopless permutation generator.


Algorithm:
* Visits every permutation of integer multiset E
* Call to init(E) creates singly linked list that stores elements of E in non-increasing order
* head, min, and inc point to first, second to last, and last nodes, respectively
* All variables are pointers
* phi is null
* If E = {1,1,2,4}, then first three visit(E) calls will produce the configurations:
** 4 2 1 1
** 1 4 2 1
** 4 1 2 1


Notes on variables:
* head points to first node of current permutation
* i points to ith node of current permutation
* afteri points to the (i+1)st node of current permutation


to apply the prefix shift of length k, the two pointers k and beforek are used:
* k points to the kth node of the current permutation
* beforek points to the (k-1)st node of the current permutation


Note that the visit(head) call denotes a new permutation that has been generated
* visit(head) happens when new permutation pointed to by head
* Represents passing control back to consumer requesting permutations
* Work in init(E) would also be done by consumer


This is a stars-and-bars problem.
When does the loop stop?
* Condition on loop ensures algorithm continues until it generates tail(E)
* Rather than generating tail(E) as separate case after oop ends, it initializes linked list to tail(E)
* This is the very first permutation visited


First, consider the simple case: a single letter, occurring a single time. Then we have 5 - 1 = 4 letters remaining. By placing the E at different positions in the string, we split these letters into 1 + 1 = 2 groups.
Here is the algorithm:


Next, consider the complicated case: a double-letter like LL. Then we have 5 - 2 = 3 letters remaining. By placing the L at different positions in the string, we split these letters into 2 + 1 = 3 groups, some of which will potentially be empty.
<pre>
[head, i afteri] <- init(E)
visit(head)
while (afteri.next != null or afteri.value < head.value) do:
    if (afteri.next != null and i.value >= afteri.next.value):
        beforek <- afteri
    else
        beforek <- i
    end
    k <- beforek.next
    beforek.next <- k.next
    k.next <- head
    if (k.value < head.value):
        i <- k
    end
    afteri <- i.next
    head <- k
    visit(head)
end
</pre>


A generating function approach to this problem would help us determine, for large strings, how long this will take.






Now we have turned this problem into a more familiar one.
=Flags=


{{CombinatoricsFlag}}


{{AlgorithmsFlag}}


To generate the partitions: if we are partitioning k letters, we have k+1 positions.
[[Category:Permutations]]


Fencepost structure: k steps, and 1 step before or after
[[Category:Strings]]


Actual Java implementation:
[[Category:Combinatorics]]
* Would ideally design this as an iterator
*

Latest revision as of 08:56, 9 March 2019

Skiena Chapter 7

Question 14

Write a function to find all permutations of the letters in a particular string (example: HELLO).

Enumerating

Enumerating - Easy Case

Let's start by counting the total number of permutations of letters in a particular string.

Consider the simplest case, no repeated letters: ABCD

Each position can hold any of the four letters, but we can only choose a letter once. So the total number of permutations of a k-letter string with no repeated letters is $ k! $

Mathematically, we are picking 4 distinct objects, and we are picking all 4 objects at the same time. In other words, 4 pick 4.

This leads to a total number of possible strings:

$ P(4,4) = \dfrac{4!}{(4-4)!} = \dfrac{4!}{1} = 4! $

We are using pick, not choose, because order matters. Including a 4! on the bottom would indicate that order does not matter, and would lead to one possible outcome (obvious - any permutation with the letters ABCD will always be the same set of letters, but in different order, so there is only one outcome).

Enumerating - Multiset Case

Now consider the less trivial case where we have repeated letters: AAABCDD

To count the number of permutations of a multiset, we should use multinomial coefficients. (Note: see AOCP/Multisets and AOCP/Multinomial Coefficients).

Mathematically, this problem is what is known as a multiset problem, where we have a set of letters each occurring with some frequency:

$ \{ a \cdot A, b \cdot B, c \cdot C, d \cdot D \} $

In this case we can write the number of possible permutations by placing each character, one at a time.

Start by placing the A characters. The number of slots to place the A characters is (a) choose (number of slots):

$ \binom{a+b+c+d}{a} $

Once the As have been placed, we can enumerate the B characters The number of slots to place the B characters is (b) choose (number of slots, less A characters):

$ \binom{b+c+d}{b} $

Next, the Cs can be placed, where the number of slots to place the C characters is (c) choose (number of slots, less A and B characters):

$ \binom{c+d}{c} $

and finally, all of the other characters are placed, leaving the Ds with a single remaining configuration:

$ \binom{d}{d} $

Now, the total number of permutations is the product of each of these. The number of configurations of a multiset is given by the multinomial coefficient

$ \binom{n}{a, b, c, d} = \binom{n}{a} \binom{n-a}{b} \binom{n-a-b}{c} \binom{n-a-b-c}{d} = \dfrac{n!}{a! b! c! d!} $

More generally, for a multiset with k objects (denoted a) occurring a certain number of times (denoted r),

$ \{ r_1 \cdot a_1, r_2 \cdot a_2, \dots, r_k \cdot a_k \} $

then if there are n total slots $ r_1 + r_2 + \dots + r_k = n $, the total number of permutations is given by the multinomial coefficient,

$ \binom{n}{r_1, r_2, \dots, r_k} = \binom{n}{r_1} \binom{n-r_1}{r_2} \binom{n - r_1 - r_2}{r_3} \dots $

This can be expressed more concisely in terms of factorials:

$ \binom{n}{r_1, r_2, \dots, r_k} = \dfrac{n!}{r_1! r_2! \dots r_k!} $

Enumerating - Infinite Letters Case

Suppose we have an infinite pool of characters, and we wish to know the number of words of a given length that can be formed using these infinite pools of letters.

We can still treat this as a multiset problem, but this time the number of characters is infinite. This multiset is denoted:

$ \{ \infty \cdot A, \infty \cdot B, \infty \cdot C, \infty \cdot D \} $

In this case, the number of permutations of length n that can be formed from the 4 different characters is given by the simple expression,

$ n^4 $

More generally, if we have an infinite multiset with k distinct elements,

$ \{ \infty \cdot r_1, \infty \cdot r_2, \dots, \infty \cdot r_k \} $

then the number of words of length n that can be formed is given by:

$ n^k $

Enumerating - Stars and Bars

The stars and bars theorem/approach is another way to look at assembling/enumerating permutations.

If we have n objects being placed into k partitions, we can think of this as placing k-1 bars among the n objects.

In this case, the total number of slots for objects + bars is n + k - 1, and we are placing k - 1 bars, so we have the total number of outcomes given by

$ \binom{n+k-1}{k-1} $

Generating

The task of actually generating all of the permutations of words that contain a certain set of characters is an extremely important one. (Knuth covers this exclusively in Volume 4 Facsimile 2 of his Art of Computer Programming volumes.)

We can generate permutations in several ways:

  • lexicographic order, or sorted order
  • de Bruijn cycles, where we remove an item from the front and add an item to the rear
  • binary reflected Gray code, where we change only a single item at a time
  • Cool-lexicographic order (see http://webhome.csc.uvic.ca/~haron/CoolMulti.pdf)

Cool-Lexicographic Order

This is a nice simple algorithm that implements an iterative, non-recursive, loopless permutation generator.

Algorithm:

  • Visits every permutation of integer multiset E
  • Call to init(E) creates singly linked list that stores elements of E in non-increasing order
  • head, min, and inc point to first, second to last, and last nodes, respectively
  • All variables are pointers
  • phi is null
  • If E = {1,1,2,4}, then first three visit(E) calls will produce the configurations:
    • 4 2 1 1
    • 1 4 2 1
    • 4 1 2 1

Notes on variables:

  • head points to first node of current permutation
  • i points to ith node of current permutation
  • afteri points to the (i+1)st node of current permutation

to apply the prefix shift of length k, the two pointers k and beforek are used:

  • k points to the kth node of the current permutation
  • beforek points to the (k-1)st node of the current permutation

Note that the visit(head) call denotes a new permutation that has been generated

  • visit(head) happens when new permutation pointed to by head
  • Represents passing control back to consumer requesting permutations
  • Work in init(E) would also be done by consumer

When does the loop stop?

  • Condition on loop ensures algorithm continues until it generates tail(E)
  • Rather than generating tail(E) as separate case after oop ends, it initializes linked list to tail(E)
  • This is the very first permutation visited

Here is the algorithm:

[head, i afteri] <- init(E)
visit(head)
while (afteri.next != null or afteri.value < head.value) do:
    if (afteri.next != null and i.value >= afteri.next.value):
        beforek <- afteri
    else
        beforek <- i
    end
    k <- beforek.next
    beforek.next <- k.next
    k.next <- head
    if (k.value < head.value):
        i <- k
    end
    afteri <- i.next
    head <- k
    visit(head)
end



Flags