Algorithms/Combinatorics and Heuristics: Difference between revisions
From charlesreid1
No edit summary |
No edit summary |
||
| Line 4: | Line 4: | ||
This chapter covers the basic ideas of searching for solutions for combinatorial/permutation problems | This chapter covers the basic ideas of searching for solutions for combinatorial/permutation problems | ||
===Combinatorial problems and solutions=== | |||
Backtracking: | Backtracking: | ||
| Line 92: | Line 94: | ||
Using the worst of both to solve a sudoku puzzle can take up to 1M function calls, while the best of both can solve with 50 or so. | Using the worst of both to solve a sudoku puzzle can take up to 1M function calls, while the best of both can solve with 50 or so. | ||
===Heuristic search=== | |||
3 techniques covered here are: | |||
* Random sampling (Monte Carlo method) | |||
* Local search | |||
* Simulated annealing | |||
These three techniques have some common components that are required: | |||
* Solution space representation - deciding how to represent the problem (e.g., when solving [[Traveling Salesperson Problem]], deciding which nodes to visit, which have been visited) | |||
* Cost function - a way to rank the quality of each solution | |||
===Random sampling=== | |||
Random sampling/Monte Carlo: | |||
* Solution space must be uniformly sampled | |||
* This can be subtle/tricky | |||
* See selction 14.4-14.7 in Skiena on generating permutations and combinations, etc. | |||
* Basic traveling salesperson problem randomized solution: | |||
<pre> | |||
for i in Nsamples: | |||
generate random solution | |||
calculate cost of solution | |||
if this costs less than best cost: | |||
keep this solution | |||
</pre> | |||
When does random sampling work well? | |||
* High proportion of acceptable solutions (e.g., prime numbers) | |||
* No coherence to solution space (no way to gauge ''closeness'' to a solution) | |||
This solution approach works terribly on the traveling salesperson problem, which has a sparse solution space. | |||
===Implementing Random Sampling=== | |||
There are subtleties to how random sampling should be implemented - how the solution space should be sampled. | |||
For example: suppose we have a graph with vertices, and we are selecting random pairs for vertex swaps: | |||
For a set of vertices <math>{1, \dots, n}</math> we are selecting randomly from the <math>\binomial{n}{2}</math> possible unordered pairs | |||
Solution | |||
* We cannot say, | |||
<pre> | |||
i = rand_int(1, n-1) | |||
j = rand_int(i+1,n) | |||
</pre> | |||
This introduces bias in the random generator! | |||
Possibility of selecting a random pair that is arbitrary, say <math>(p,q)</math>, is given by: | |||
<math> | |||
P(p) = \dfrac{1}{n-1} \qquad P(q) = \dfrac{1}{n-1} \qquadl P(p,q) = \dfrac{1}{(n-1)^2} | |||
</math> | |||
The last term is the probability of getting the pair (p,q). | |||
But now consider the random pair <math>(n-1,n)</math>: in this case the probability of j being n is 100%. | |||
<math> | |||
P(n-1) = \dfrac{1}{n-1} \qquad P(n) = 1 \qquad P(n-1,n) = \dfrac{1}{n-1} | |||
</math> | |||
which is different! | |||
The real solution is to make both random selections unbiased, and throw out duplicate selections: | |||
<pre> | |||
do | |||
i = random(1,n) | |||
j = random(1,n) | |||
while(i==j) | |||
swap(i,j) | |||
</pre> | |||
=Flags= | |||
{{AlgorithmsFlag}} | {{AlgorithmsFlag}} | ||
Revision as of 07:42, 17 July 2017
Notes
Skiena Chapter 7
This chapter covers the basic ideas of searching for solutions for combinatorial/permutation problems
Combinatorial problems and solutions
Backtracking:
- Backtracking as example of combinatorial search (e.g., with N Queens problem you are searching among a space of possible configurations for a solution that satisfies the given constraints)
- Basic skeleton given for a backtracking method:
backtrack_dfs(A,k): if A = (a1, a2, ..., ak) is solution: process solution else: k = k + 1 compute Sk while Sk != null set, do: ak = element of Sk Sk = Sk - ak backtrack_dfs(A,k)
Or, to express in a more terse way:
- Base case: found a solution
- Recursive case:
- For each choice in a set of choices:
- Make a choice
- Explore the consequences
- Unmake the choice
- Move on to the next choice
- For each choice in a set of choices:
A few notes on his program:
- new array each time, inside each method, so set of possible solutions is not passed around - copy is created so free to modify/not restore
- Stubs used fror method calls like processing solution, making move, unmaking move - this enables application of the same function to different problems
Example: Construct all possible subsets
- Each element doubles the number of subsets that can be created
- Number of subsets with 1 element in the set: 2 subsets (empty/full)
- Number of subsets with 2 elements in the set: 4 subsets
- Number of subsets with n elements in the set: 2^n subsets
When solving, the main question is, how do you represent a set?
Example:
- Checking membership of an element in a set
- Can use bit vector - an array of booleans - indicating whether an element is a member of a particular set or not
- The order in which sets are constructed depends on the set S construction procedure, called above
Constructing permutations:
- Permutations do not allow for duplicates
- If we fix a_1, we have n-1 candidates for the second position, n-2 for the third, etc. Total is:
$ n! = \prod_{j=1}^{n} j $
- utilize this to construct solution representations:
- element i of an array can be any eleent that has not already appearedd
- complicated bit vector solution - basically, it says, who's in this permutation - then it adds items that aren't already in the solution to the new permutations
Shortest path:
- Again, a bit complicated with the logic details, but basically:
- Start at node S
- Move to next node ni if it is on the graph and not part of hte solution already
- Repeat
Sudoku puzzle solution:
- backtracking solution
- elements of the solution are the open squares, each of which has coordinates (i,j) and a set of possible digits that can go in that square 1-9
- We know the set of feasible digits are those that do not appear in row i or column j or the local 3x3 sector
- Soltion vector: one integer per position
- To store coordinate (i,j), use a separate array of point (x,y) structs
- Also have a board struct - stores a matrix of board contents as int[][], and a free count (open nuber of squares), and a move array of points
- to pick the next candidate:
- Pick the next open square to fill
- identify candidate solutions for that square
- IMPORTANT: These two routines have a HUGE impact on the runtime!!!
- Then, given a coordinate x and y and a board, if possible, increase the number of andidates
- Make a move funtion
- Unmake a move function (both of these modify the move array)
- Solution check method - are there any open squares on the board?
- process solution: print board, set flag FINISHED to true
Two BIG improvements can be made here:
- Improving the selection of the next square - by carefully picking the next square to solve, we can reduce the search space and tackle the easiest problems first
- Improving the possible values that we are considering
Improving the next square selection procedure: two options
- Local count - disallows candidates that appear in row i, row j (this is the "plain vanilla" default approach)
- Look ahead - check if there are any other squares where making our choice would lead to ZERO candidates; if so, eliminate choice as a possible solution
Using the worst of both to solve a sudoku puzzle can take up to 1M function calls, while the best of both can solve with 50 or so.
Heuristic search
3 techniques covered here are:
- Random sampling (Monte Carlo method)
- Local search
- Simulated annealing
These three techniques have some common components that are required:
- Solution space representation - deciding how to represent the problem (e.g., when solving Traveling Salesperson Problem, deciding which nodes to visit, which have been visited)
- Cost function - a way to rank the quality of each solution
Random sampling
Random sampling/Monte Carlo:
- Solution space must be uniformly sampled
- This can be subtle/tricky
- See selction 14.4-14.7 in Skiena on generating permutations and combinations, etc.
- Basic traveling salesperson problem randomized solution:
for i in Nsamples:
generate random solution
calculate cost of solution
if this costs less than best cost:
keep this solution
When does random sampling work well?
- High proportion of acceptable solutions (e.g., prime numbers)
- No coherence to solution space (no way to gauge closeness to a solution)
This solution approach works terribly on the traveling salesperson problem, which has a sparse solution space.
Implementing Random Sampling
There are subtleties to how random sampling should be implemented - how the solution space should be sampled.
For example: suppose we have a graph with vertices, and we are selecting random pairs for vertex swaps:
For a set of vertices $ {1, \dots, n} $ we are selecting randomly from the $ \binomial{n}{2} $ possible unordered pairs
Solution
- We cannot say,
i = rand_int(1, n-1) j = rand_int(i+1,n)
This introduces bias in the random generator!
Possibility of selecting a random pair that is arbitrary, say $ (p,q) $, is given by:
$ P(p) = \dfrac{1}{n-1} \qquad P(q) = \dfrac{1}{n-1} \qquadl P(p,q) = \dfrac{1}{(n-1)^2} $
The last term is the probability of getting the pair (p,q).
But now consider the random pair $ (n-1,n) $: in this case the probability of j being n is 100%.
$ P(n-1) = \dfrac{1}{n-1} \qquad P(n) = 1 \qquad P(n-1,n) = \dfrac{1}{n-1} $
which is different!
The real solution is to make both random selections unbiased, and throw out duplicate selections:
do
i = random(1,n)
j = random(1,n)
while(i==j)
swap(i,j)
Flags
| Algorithms Part of Computer Science Notes
Series on Algorithms
Algorithms/Sort · Algorithmic Analysis of Sort Functions · Divide and Conquer · Divide and Conquer/Master Theorem Three solid O(n log n) search algorithms: Merge Sort · Heap Sort · Quick Sort Algorithm Analysis/Merge Sort · Algorithm Analysis/Randomized Quick Sort
Algorithms/Search · Binary Search · Binary Search Modifications
Algorithms/Combinatorics · Algorithms/Combinatorics and Heuristics · Algorithms/Optimization · Divide and Conquer
Algorithms/Strings · Algorithm Analysis/Substring Pattern Matching
Algorithm complexity · Theta vs Big O Amortization · Amortization/Aggregate Method · Amortization/Accounting Method Algorithm Analysis/Matrix Multiplication
Estimation Estimation · Estimation/BitsAndBytes
Algorithm Practice and Writeups Project Euler · Five Letter Words · Letter Coverage
|