Algorithmic Analysis of Substring Pattern Matching

Consider the following substring matching program:

int findmatch(char *p, char *t) { 
	int i,j;
	int m,n;

	m = strlen(p);
	n = strlen(t);

	for(i=0; i<=(n-m); i=i+1) {
		j=0;
		while((j<m)&&(t[i+j]==p[j]))
			j = j+1;
		if(j==m) 
			return(i);
	}
}

The worst case running time of the two nested for loops can be analyzed as follows.

The two strlen() calls are O(n) and O(m), respectively.

The for loop runs a maximum of (n-m) times.

The inner for loop runs a maximum of m times.

This the overall algorithm $$ O(n) + O(m) + O((n-m)(m)) $$

The leading terms go away, and the last term becomes:

$O((n-m)(m)) \sim O(nm) - O(m^2)$

Because we know that n is much greater than m, and because of the leading minus term, the O(m^2) term goes away as well. The final analysis is that the algorithm cost is $$ O(nm) $$ .

Algorithmic Analysis of Substring Pattern Matching

From charlesreid1