Z algorithm
Introduction
There are some algorithms of exact substring searching (e.g. Knuth-Morris-Pratt, Boyer-Moore etc.) I want to explain one of them which is called Z algorithm in some sources.
Z-boxes and Z-values
Let’s consider the concept of Z-box. Take the string S = “abcxxxabyyy”
. We have an internal part “ab”
in the string which repeats its prefix. The internal “ab”
is a Z-box. It has the beginning at the position with index 6 and the end in 7 (0-based). Z-boxes are substrings which match string prefixes with the same length. For the Z-box, let’s call the corresponding prefix the prototype (it’s my term but I think it’s OK to use it.)