
Probably a pretty bad test of the _actual_ speed of grep. A more realistic test would have printable characters and frequent newlines. I wouldn't be surprised if all those bytes were just taking a fast-path shortcut somewhere.


Depending on many factors (like details of the patterns used and the input), some regex engines (like Hyperscan) can match tens of gigabytes per second per core. Shockingly fast!


Grep is fast. Obviously in this case you’re ‘just’ measuring how fast you can read from a pipe, but there are plenty of ways grep could have been implemented that would have been slower. Generally, I think grep will convert queries into a form that can be searched for reasonably efficiently (e.g. KMP for longer strings, though that's a bit of a guess and I'm not sure how good it is on modern hardware), and obviously there's no backtracking for regular expressions.


I don't think KMP has been used in any practical substring implementation in ages. At least I'm not aware of one. I believe GNU grep uses Boyer-Moore, but that's not really the key here. The key is using memchr in BM's skip loop.
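
To make the memchr point concrete, here is a rough sketch of the idea, not GNU grep's actual code. It is deliberately simpler than Boyer-Moore: there is no shift table, only a memchr scan for the needle's last byte followed by a memcmp check of each candidate. The function name find_with_memchr and its signature are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from GNU grep): substring search where the
   "skip loop" is a memchr scan for the needle's last byte. Each hit is
   a candidate that still has to be verified with memcmp. */
const char *find_with_memchr(const char *hay, size_t hay_len,
                             const char *needle, size_t needle_len)
{
    if (needle_len == 0)
        return hay;            /* empty needle matches at the start */
    if (needle_len > hay_len)
        return NULL;

    const char last = needle[needle_len - 1];
    const char *end = hay + hay_len;
    /* The last byte of a match cannot sit before index needle_len - 1. */
    const char *p = hay + (needle_len - 1);

    while (p < end) {
        /* Skip loop: jump straight to the next occurrence of `last`.
           libc memchr is usually heavily optimized, which is the point. */
        p = memchr(p, (unsigned char)last, (size_t)(end - p));
        if (p == NULL)
            return NULL;

        /* Candidate found: check the bytes that precede the last one. */
        const char *start = p - (needle_len - 1);
        if (memcmp(start, needle, needle_len - 1) == 0)
            return start;

        p++;                   /* false candidate, resume after it */
    }
    return NULL;
}
```

Most of the time is spent inside memchr rather than in a byte-at-a-time loop, so throughput depends heavily on how rare the chosen byte is in the input. A real implementation would presumably combine this with the Boyer-Moore shift table and might pick the byte to scan for more carefully; both are left out here.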



