Priority Queues/Timing and Performance/Old: Difference between revisions
From charlesreid1
Revision as of 18:26, 19 June 2017
Priority Queues
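Priority queues are queues that hand items back in priority order, so that the minimum (highest priority) item always comes out first. The backing list can be kept sorted or left unsorted; the choice determines which operation pays for finding the minimum.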
Priority queue timing hypothesis
The hypothesis is that we will see the following behavior for sorted and unsorted implementations of priority queues:
- Unsorted list - add is O(1), min/remove min is O(N)
- Sorted list - add is O(N), min/remove min is O(1)
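To make the hypothesis concrete, here is a minimal sketch of the two list-based approaches. The class names UnsortedListPQ and SortedListPQ are hypothetical; they are not the SortedPriorityQueue class timed below, and they store bare integer keys rather than key/value pairs.

```java
import java.util.LinkedList;
import java.util.ListIterator;

/** Hypothetical sketch: priority queue backed by an unsorted linked list. */
class UnsortedListPQ {
    private final LinkedList<Integer> items = new LinkedList<>();

    /** O(1): append at the tail without looking at any other item. */
    public void add(int key) {
        items.addLast(key);
    }

    /** O(N): scan every item to find the minimum, then unlink it. */
    public int removeMin() {
        Integer min = items.getFirst();   // throws NoSuchElementException if empty
        for (Integer x : items) {
            if (x < min) {
                min = x;
            }
        }
        items.removeFirstOccurrence(min); // second O(N) pass to unlink it
        return min;
    }
}

/** Hypothetical sketch: priority queue backed by a sorted linked list. */
class SortedListPQ {
    private final LinkedList<Integer> items = new LinkedList<>();

    /** O(N) worst case: walk from the head to the first larger item. */
    public void add(int key) {
        ListIterator<Integer> it = items.listIterator();
        while (it.hasNext()) {
            if (it.next() > key) {
                it.previous();            // step back so the insert lands before it
                break;
            }
        }
        it.add(key);
    }

    /** O(1): the minimum is always at the head. */
    public int removeMin() {
        return items.removeFirst();       // throws NoSuchElementException if empty
    }
}
```

Both classes return keys in the same order; only the place where the searching happens differs.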
Priority queue timing class
Below is a basic class for measuring timing and performance of a sorted priority queue. The basic rundown of what the class does is as follows:
- Build up the CSV output using a StringBuffer
- Loop over different values of N, the size of the array. This is the number of add/remove operations being performed in total on one given array.
- Loop over a "large" number of statistical trials. The average access time over all the trials is reported.
Link on git.charlesreid1.com: https://charlesreid1.com:3000/cs/java/src/master/priority-queues/Timing.java
Class contents:
import java.util.LinkedList;
import java.util.Random;

/** Timing class: measure big-O complexity and runtime of data structures.
 *
 * Compare algorithms, test structures, and verify expected big-O behavior.
 *
 */
public class Timing {

    // Tests
    public static void main(String[] args) {
        sorted_timing();
    }

    /** Time sorted priority queue. */
    public static void sorted_timing() {
        // This generates CSV output (printed to stdout) to verify the following information:
        // - add method is O(N)
        // - remove min method is O(1)
        StringBuffer sb = new StringBuffer();
        sb.append("N, Walltime Add (ms), Walltime Rm Min (ms)\n");

        int ntrials = 200;
        Random r = new Random();

        // Loop over values of N
        for(int N = (int)(5E3); N <= (int)(5E5); N+=2500) {

            // Tim is a separate timer helper class (not shown on this page)
            Tim add_tim = new Tim();
            Tim rm_tim = new Tim();

            // Trials counter is always k for Kafka
            for(int k = 0; k<ntrials; k++) {

                // Each trial draws a single random key/value pair.
                // NOTE: that same pair is then added N times in the loop below.
                SortedPriorityQueue<Integer> q = new SortedPriorityQueue<Integer>();
                Integer key = Integer.valueOf( r.nextInt() );
                Integer val = Integer.valueOf( r.nextInt() );

                // Time N consecutive add operations
                add_tim.tic();
                for(int i=0; i<N; i++) {
                    q.add(key,val);
                }
                add_tim.toc();

                // Time N consecutive remove min operations
                rm_tim.tic();
                for(int i=0; i<N; i++) {
                    q.removeMin();
                }
                rm_tim.toc();
            }

            // One CSV row per value of N, averaged over the trials
            sb.append( String.format("%d, ",N) );
            sb.append( String.format("%.3f, ", add_tim.elapsedms()/ntrials) );
            sb.append( String.format("%.3f ", rm_tim.elapsedms()/ntrials) );
            sb.append("\n");
        }
        System.out.println(sb.toString());
    }
}
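The Tim helper used by the class above is not shown on this page. A minimal stand-in might look like the following, assuming that tic() starts a stopwatch, toc() stops it and adds the interval to a running total, and elapsedms() returns that total in milliseconds. The class name StopwatchSketch and its internals are assumptions, not the actual Tim class.

```java
/** Hypothetical stand-in for the Tim helper: an accumulating stopwatch. */
class StopwatchSketch {
    private long startNanos = 0L;    // time of the most recent tic()
    private long totalNanos = 0L;    // sum of all completed tic()/toc() intervals
    private boolean running = false;

    /** Start (or restart) the stopwatch. */
    public void tic() {
        startNanos = System.nanoTime();
        running = true;
    }

    /** Stop the stopwatch and add the interval to the running total. */
    public void toc() {
        if (!running) {
            throw new IllegalStateException("toc() called without tic()");
        }
        totalNanos += System.nanoTime() - startNanos;
        running = false;
    }

    /** Total time accumulated over all intervals, in milliseconds. */
    public double elapsedms() {
        return totalNanos / 1.0e6;
    }
}
```

If Tim accumulates time this way, then elapsedms()/ntrials is the mean wall time for all N operations in one trial, not the cost of a single add or removeMin. Dividing once more by N would give a per-operation figure.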
Priority queue timing results
Sorted priority queue
The results below are for a sorted priority queue. In a sorted priority queue the minimum is always at the front, so removal is an O(1) operation. Each add has to find the right position in the sorted list, so add is an O(N) operation. If you squint and look sideways, you can see a barely perceptible linear increase in the cost of add, versus the flatter curve for removal.
When I gave it another try, bumping up the maximum size to a million, I got still further ambiguous results...
Results discussion
To be honest, I'm at a loss to explain exactly what's happening here. The code appears to be spending a constant (or veeeeery slowly increasing) amount of time on the add operation, even though each add performs one step of an insertion sort. On the other hand, the remove operations are definitely confirmed to be O(1).
Two possibilities:
- Built-in doubly-linked list access operations are just so damn fast that most of the time spent on each operation goes to function overhead, which is O(1). What we're actually seeing would then be not a pure O(N) increase in cost but a LARGE O(1) term plus a SMALL O(N) term.
- Bug in the timing code that's causing measured times to be sub-linear.
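One way to probe the second possibility without relying on wall-clock time is to count how many comparisons a sorted insert makes. The sketch below is hypothetical: CountingSortedList is not the SortedPriorityQueue used in the timing class, and it assumes the insert scans backward from the tail, which may not match the real implementation. It compares inserting one repeated key N times (the timing loop draws a single key per trial) against inserting N random keys.

```java
import java.util.LinkedList;
import java.util.ListIterator;
import java.util.Random;

/** Hypothetical sketch: sorted list that counts comparisons made by add(). */
class CountingSortedList {
    private final LinkedList<Integer> items = new LinkedList<>();
    private long comparisons = 0L;

    /** Insert key in sorted order, scanning backward from the tail (assumed strategy). */
    public void add(int key) {
        ListIterator<Integer> it = items.listIterator(items.size());
        while (it.hasPrevious()) {
            comparisons++;
            if (it.previous() <= key) {
                it.next();   // step forward so the insert lands after this item
                break;
            }
        }
        it.add(key);
    }

    public long comparisons() {
        return comparisons;
    }

    /** Total comparisons needed to insert the same key n times. */
    static long repeatedKey(int n, int key) {
        CountingSortedList list = new CountingSortedList();
        for (int i = 0; i < n; i++) {
            list.add(key);
        }
        return list.comparisons();
    }

    /** Total comparisons needed to insert n keys from a seeded Random. */
    static long randomKeys(int n, long seed) {
        CountingSortedList list = new CountingSortedList();
        Random r = new Random(seed);
        for (int i = 0; i < n; i++) {
            list.add(r.nextInt());
        }
        return list.comparisons();
    }

    public static void main(String[] args) {
        int n = 2000;
        System.out.println("repeated key: " + repeatedKey(n, 42));
        System.out.println("random keys:  " + randomKeys(n, 1L));
    }
}
```

Under this tail-scanning assumption, the repeated-key case makes one comparison per insert after the first, so the total work grows linearly in N, while random keys need on the order of N comparisons per insert. If the real queue scans the same way, reusing one key per trial would make add look like O(1).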
Further Results
The big-O behavior for add and remove operations is likely being skewed by the overhead cost of the operations - the large O(1) term mentioned above. At the lower end of the curves given above, the amortized cost of remove operations looks approximately linear (contrary to the expected behavior), while the cost of add operations appears to be O(1), also contrary to expectations. Both trends emerge once the jitter at small values of N settles down. Here is another plot of those quantities, up to array sizes of 100,000.
(Note that the number of trials was also increased from 200 to 1000.)
File:PriorityQueueTiming Sorted3.png
If we look exclusively at arrays of size 500,000 to 1,000,000 (intervals of 50,000), and include a corresponding increase in the number of trials from 200 to 1000, we see the behavior below:
File:PriorityQueueTiming Sorted4.png
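The "LARGE O(1) + SMALL O(N)" picture can be checked against the CSV output by fitting a straight line, time = a + b*N, to the per-operation times. The sketch below is a plain least-squares fit; the class name LinearFit and the sample numbers are made up for illustration and are not measurements from this page.

```java
/** Hypothetical sketch: least-squares fit of y = a + b*x. */
class LinearFit {

    /** Returns {a, b} minimizing the sum of (a + b*x[i] - y[i])^2. */
    static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two or more matching points");
        }
        int n = x.length;

        // Means of x and y
        double sx = 0.0, sy = 0.0;
        for (int i = 0; i < n; i++) {
            sx += x[i];
            sy += y[i];
        }
        double mx = sx / n;
        double my = sy / n;

        // Centered sums: variance of x and covariance of x and y
        double sxx = 0.0, sxy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - mx;
            sxx += dx * dx;
            sxy += dx * (y[i] - my);
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("all x values are equal");
        }

        double b = sxy / sxx;     // slope: the O(N) coefficient
        double a = my - b * mx;   // intercept: the constant overhead
        return new double[] { a, b };
    }

    public static void main(String[] args) {
        // Made-up data that follows time = 2.0 + 0.001*N exactly
        double[] sizes = { 5000, 10000, 20000, 40000 };
        double[] times = { 7.0, 12.0, 22.0, 42.0 };
        double[] ab = fit(sizes, times);
        System.out.printf("a = %.4f, b = %.6f%n", ab[0], ab[1]);
    }
}
```

If b*N stays small compared with a over the whole measured range, the constant term dominates, which would be consistent with the overhead explanation.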