Regular Expression Matching Can Be Simple and Fast
Russ Cox writes:
This is a tale of two approaches to regular expression matching. One of them is in widespread use in the standard interpreters for many languages, including Perl. The other is used only in a few places, notably most implementations of awk and grep. The two approaches have wildly different performance characteristics … The trends shown in the graph continue: the Thompson NFA handles a 100-character string in under 200 microseconds, while Perl would require over 10^15 years.
Pretty much all modern regular expression implementations have the same problem. Thompson NFAs were introduced by Ken Thompson in the late 1960s.
Whoops.
Some days, when it comes to programming, it seems like we’ve forgotten more than we’ve ever learned. Another example: take the pool of programmers out there today, and find me one who understands binary arithmetic and bitwise operations. Sure, plenty still exist, but you’ll have to look for them…
(Hat Tip: Lambda the Ultimate)