Revision as of 07:05, 9 August 2017

Notes

Skiena Chapter 8

Fibonacci dynamic programming example

Skiena begins the chapter with the Fibonacci example.

Worst way to do Fibonacci calculation: using recursion

More efficient way to do Fibonacci calculation: cache the intermediate results

Most efficient way: specify the order of evaluation in such a way that the results that are needed are always available. No caching necessary. It's a bit trivial once you reach this point, but it is still dynamic programming - you're just specifying the order in which to evaluate the sub-problems so that you always have the information you need.

Binomial dynamic programming example

Skiena uses the binomial coefficient as an example of a problem with a single expression that can be evaluated, but that single expression will cause integer overflow.

Instead: use identity,

$\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$

and the base cases:

$\binom{p}{0} = 1$

$\binom{p}{1} = p$

$\binom{p}{p} = 1$

To implement this, we can build a table/array, which will contain Pascal's triangle...

Left column: m, top column: n

Now the order of operations is to fill in the edges of Pascal's triangle (the base cases) first

Initialize column 0 = 1
Initialize diagonal = 1

Now loop over each row and each column of the row to compute the binomial coefficient:

for i = 1 to n:
    for j = 1 to i:
        calculate binomial coefficient

Fuzzy string matching example

The next example Skiena covers is an important one: fuzzy string matching.

We are trying to pattern-match the string P in a larger text T.

To do approximate string matching, we must begin with a cost function. The cost function assigns a cost to each possible modification of our original string.

3 classes of edits:

Substitution (turning one letter into another)
Insertion
Deletion

To find the minimum cost function to turn our pattern P into something in the original text T, we need to work backwards: from the ending point, work backwards 1 edit at a time.

Maintain a table:

D(i,j) is the distance (the number of differences) between P1, P2, ..., Pi and the segment of T ending at j
D(i,j) is the minimum of the three possible ways to extend the current string
In other words, it is the minimum of the following:

Substitution: if $$ P_i = T_j $$ then $$ D(i-1, j-1) $$ , otherwise $$ D(i-1, j-1) + 1 $$

Insert: $$ D(i-1, j) + 1 $$

Delete: $$ D(i, j-1) + 1 $$

Pseudocode for program:

# Non-Dynamic Programming Version of Fuzzy String Matching
def string_compare(s, t, i, j):

	# Base case
	if i==0 or j==0:
		return 

	# Option: Match
	match = string_compare(s, t, i-1, j-1) + match(s[i], t[j])

	# Option: Insert
	insert = string_compare(s, t, i, j-1) + insertdelete(t[j])

	# Option: Delete
	delete = string_compare(s, t, i-1, j) + insertdelete(s[i])

	# Pick minimum of these
	return min(match, insert, delete)

But this is horribly inefficient. Why? Because it does not memoize. This means that it is repeating the same tasks over and over again.

Fuzzy string matching dynamic programming example

Define a C struct to hold information for cells. Each cell is a particular D(i,j) location.

Each cell holds cost, and parent cell. Now, in place of a recursive call, we have a table/cell lookup.

There is a fixed number of entries: size of P times size of T $|T| \times |P|$ . The table has one column for each character in T, and one row for each character in P.

There are two ways to approach this problem:

We can get the edit distance, but not the actual sequence of edits. This is a bit easier.
We can get the actual sequence of edits. This is a little harder, but more useful.

To reconstruct the tour, we can use the cost matrix D(i,j) and work backwards to reconstruct the tour from D(i,j).

Here are the functions to define:

Initialization - this creates the "boundary conditions" or base cases for the dynamic program. This is the starting point that we build on
Penalty cost - this is a function that determines the actual cost of each edit (modify, insert, delete). The simplest case assigns a cost of 1, but this can tackle more general problems by making it a function.
Goal cell - this is a function that returns the indices of the endpoint cell. Again, the simplest case is the last cell in the table (which has a fixed size), but making this a function makes it more general.
Traceback - defining a "trace" action allows printing information about where in the dynamic program we are (for example, printing out the name of each operation, so we can see MMDMDIIMMD to indicate the sequence of edits being made)

Dynamic programming algorithms are easy to design, but getting boundary conditions and index manipulations correct requires great care.

Why is there so much extra fluff?

Fuzzy string matching is solved, but the extra fluff solves other problems too
Fuzzy string matching in large text body (slight modifications required to row initialization and goal cell functions)
Largest common subsequence - finding number of characters in common between two strings, also called the LCS (longest common subsequence) problem; requires modifying match cost function
Maximum monotone subsequence - finding the fewest numbr of elements to delete to make a sequence monotonic - for example, 243517698 -> 23568 (this is identical to the LCS problem)

Longest Increasing Sequence Example

Three dynamic programming steps:

1. Formulate answer as recurrence relation/recursive algorithm

2. Show that the number of parameters taken on by the recurrence is bounded by a polynomial

3. Specify the order of evaluation for the recurrence so that partial results that are needed are always available

Longest increasing (non-contiguous) subsequence:

Step 1 - formulate answer as recurrence relation

We want to know the longest subsequence in the preceding numbers - if we are considering $$ S_n $$ , we want to know the longest increasing subsequence in $S_1, S_2, \dots, S_{n-1}$
However, we also want to know the length of the longest sequence that $$ S_n $$ will extend

Step 2 - To formulate the recurrence relation:

Let L_i be length of the longest subsequence that ends with S_i
Base case: L_0 = 0
L_i given by:

$L_i = \max_{0 < j < i} L_j + 1 \qquad S_j < S_i$

Skiena does not explain this very clearly, in particular the case where i = 1, so... not sure what's going on there.

Code

Link to practice problems [1]

Racing Gems

Link: https://charlesreid1.com:3000/cs/java/src/master/dynamic-programming/racing-gems

This problem is an ICPC programming competition problem.

In this problem, we have a race track of width x = 0 to x = w, height y = 0 to y = h. We have a character that begins at the x-axis (y=0) and moves toward the finish line (y=h) with a steady velocity v.

You control the character's horizontal velocity, -v/r to +v/r, and your goal is to obtain as many gems as possible.

The location of gems are given by (x,y) integer pairs on the grid. Determine the maximum number of gems that can be obtained by the character.

Git repo has solution explanation: https://charlesreid1.com:3000/cs/java/src/master/dynamic-programming/racing-gems

Maximum Value Contiguous Subsequence

Link: https://charlesreid1.com:3000/cs/java/src/master/dynamic-programming/max-val-seq

Given a sequence of integers, $a_1, a_2, \dots, a_n$ , determine the subsequence of integers with the maximum sum.

(Note, this problem is only interesting if there are negative numbers.)

Resources

MIT Open Courseware 6.006

Links to Videos

Dynamic programming 1: Fibonacci, shortest paths: https://www.youtube.com/watch?v=OQ5jsbhAv_M

Dynamic programming 2: Blackjack: https://www.youtube.com/watch?v=ENyox7kNKeY

Dynamic programming 3: Parenthesization, edit distance, knapsack: https://www.youtube.com/watch?v=ocZMDMZwhCY

Dynamic programming 4: guitars, tetris, super mario: https://www.youtube.com/watch?v=tp4_UXaVyx8

MIT Open Courseware 6.046

Links to Videos

The topic of dynamic programming is covered in the MIT 6.046 open courseware playlist: https://www.youtube.com/watch?v=2P-yW7LQr08&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp

The course begins with dynamic programming, by covering the divide and conquer approach for finding a convex hull: https://www.youtube.com/watch?v=EzeYI7p9MjU&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp&index=2

It continues with a divide and conquer approach for matrix algebra and Fast Fourier Transforms: https://www.youtube.com/watch?v=iTMn0Kt18tg&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp&index=4

It then covers divide and conquer for van Emde Boas trees: https://www.youtube.com/watch?v=hmReJCupbNU&index=6&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp

Recitation with more dynamic programming examples: https://www.youtube.com/watch?v=krZI60lKPek&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp&index=12

Advanced dynamic programming: https://www.youtube.com/watch?v=Tw1k46ywN6E&index=14&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp

Dynamic programming for all pairs, shortest paths: https://www.youtube.com/watch?v=NzgFUwOaoIw&index=15&list=PLUl4u3cNGP6317WaSNfmCvGym2ucw3oGp

@@ Line 162: / Line 162: @@
 =Code=
+Link to practice problems [https://people.cs.clemson.edu/~bcdean/dp_practice/]
 ==Racing Gems==
@@ Line 184: / Line 186: @@
 (Note, this problem is only interesting if there are negative numbers.)
 =Resources=

Algorithms/Dynamic Programming: Difference between revisions

From charlesreid1

Revision as of 07:05, 9 August 2017

Contents

Notes

Skiena Chapter 8

Fibonacci dynamic programming example

Binomial dynamic programming example

Fuzzy string matching example

Fuzzy string matching dynamic programming example

Longest Increasing Sequence Example

Code

Racing Gems

Maximum Value Contiguous Subsequence

Resources

MIT Open Courseware 6.006

Links to Videos

MIT Open Courseware 6.046

Links to Videos