Good for text classification
The method combines gzip compression with a k-nearest-neighbor classifier and uses no trained parameters. It achieves results competitive with non-pretrained deep learning methods on six in-distribution datasets and outperforms BERT on all five out-of-distribution datasets, including four low-resource languages.
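A minimal sketch of the idea, assuming the distance between two texts is the normalized compression distance (NCD) computed from gzip-compressed lengths. The function names (`clen`, `ncd`, `classify`), the space used to join texts, and the default `k` are illustrative choices for this note, not the paper's actual script.

```python
import gzip
from collections import Counter


def clen(text: str) -> int:
    """Length in bytes of the gzip-compressed UTF-8 encoding of text."""
    return len(gzip.compress(text.encode("utf-8")))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two texts.

    If x and y share structure, compressing them together costs little
    more than compressing the larger one alone, so the value is small.
    """
    cx, cy = clen(x), clen(y)
    cxy = clen(x + " " + y)  # assumption: texts are joined with a space
    return (cxy - min(cx, cy)) / max(cx, cy)


def classify(query: str, train: list[tuple[str, str]], k: int = 3) -> str:
    """Majority label among the k training texts closest to query under NCD."""
    nearest = sorted(train, key=lambda pair: ncd(query, pair[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

Calling `classify(text, train)` with `train` as a list of `(text, label)` pairs returns the predicted label. Nothing is fitted ahead of time: every prediction compresses the query against each training text, so the cost grows linearly with the size of the training set.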
David Sauerwein on LinkedIn: What a fantastic result! A 0-parameter 14-line Python script using gzip compression and K-nearest neighbors outperforms a 345 Million-parameter transformer…
https://www.linkedin.com/feed/update/urn:li:activity:7085349430819700737/
aclanthology.org
https://aclanthology.org/2023.findings-acl.426.pdf
Luke Gessler | @lgessler@lingo.lol on Twitter
this paper's nuts. for sentence classification on out-of-domain datasets, all neural (Transformer or not) approaches lose to good old kNN on representations generated by.... gzip (@LukeGessler, July 12, 2023)
https://twitter.com/LukeGessler/status/1679211291292889100

Seonglae Cho