Algorithmic Analysis of Substring Pattern Matching
From charlesreid1
Consider the following substring matching program:
int findmatch(char *p, char *t) {
int i,j;
int m,n;
m = strlen(p);
n = strlen(t);
for(i=0; i<=(n-m); i=i+1) {
j=0;
while((j<m)&&(t[i+j]==p[j]))
j = j+1;
if(j==m)
return(i);
}
}
The worst case running time of the two nested for loops can be analyzed as follows.
The two strlen() calls are O(n) and O(m), respectively.
The for loop runs a maximum of (n-m) times.
The inner for loop runs a maximum of m times.
This the overall algorithm $ O(n) + O(m) + O((n-m)(m)) $
The leading terms go away, and the last term becomes:
$ O((n-m)(m)) \sim O(nm) - O(m^2) $
Because we know that n is much greater than m, and because of the leading minus term, the O(m^2) term goes away as well. The final analysis is that the algorithm cost is $ O(nm) $.