Stringzilla - Stupid Heuristics to Search & Sort Strings 5-10x Faster
A few years ago, I combined a trivial heuristic with SIMD intrinsics to showcase the untapped potential of modern CPUs. The heuristic assumes that if the first 4 characters of a string match, the rest is also likely to match. I benchmarked the qsort of LibC and the std::search of the C++ Standard Template Library, resulting in ~1.5 GB/s throughput for substring search on a single core. Not bad, but the memory bandwidth would be closer to 10-15 GB/s per core.
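To make the 4-character heuristic concrete, here is a minimal scalar sketch in C++. It is not StringZilla's actual code and uses no SIMD: the function name find_prefix4 is hypothetical, and the fallback to std::string_view::find for needles shorter than 4 bytes is an assumption made only to keep the example complete. The idea is to compare the first 4 bytes as a single 32-bit integer and run the full comparison only when that cheap check passes.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string_view>

// Hypothetical helper illustrating the "first 4 bytes" heuristic.
// Returns the offset of the first match, or std::string_view::npos.
inline std::size_t find_prefix4(std::string_view haystack, std::string_view needle) {
    constexpr std::size_t npos = std::string_view::npos;
    if (needle.size() > haystack.size()) return npos;
    // Assumption: needles shorter than 4 bytes take the standard path.
    if (needle.size() < 4) return haystack.find(needle);

    // Load the first 4 bytes of the needle into one 32-bit integer.
    // memcpy avoids unaligned-access and strict-aliasing problems.
    std::uint32_t needle_prefix;
    std::memcpy(&needle_prefix, needle.data(), 4);

    std::size_t const last = haystack.size() - needle.size();
    for (std::size_t i = 0; i <= last; ++i) {
        // Cheap check: one integer comparison per haystack offset.
        std::uint32_t window;
        std::memcpy(&window, haystack.data() + i, 4);
        if (window != needle_prefix) continue;

        // The prefix matched, so the rest is likely to match too.
        // Verify the remaining bytes with a full comparison.
        if (std::memcmp(haystack.data() + i + 4, needle.data() + 4,
                        needle.size() - 4) == 0)
            return i;
    }
    return npos;
}
```

Calling find_prefix4("hello world", "world") returns 6. A SIMD variant would apply the same prefix check to many haystack offsets per instruction, which is where most of the speedup would come from.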
https://ashvardanian.com/posts/stringzilla/