1.58bit (-1, 0, 1)
Ternary weights give dramatic memory, latency, and energy savings over FP16 models.
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
BitNet b1.58 constrains every weight to the ternary set {-1, 0, 1}; from 3B parameters onward it matches full-precision (FP16/BF16) Transformers in perplexity and end-task performance while being far cheaper in latency, memory, throughput, and energy.
https://arxiv.org/html/2402.17764v1
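The name comes from information content: a ternary weight carries log2(3) ≈ 1.585 bits. The paper's quantization is an absmean function: scale the weight matrix by its mean absolute value, then round each entry to the nearest value in {-1, 0, 1}. A minimal NumPy sketch of that function (helper name and example values are mine, not from the paper):

```python
import numpy as np

def absmean_ternary_quantize(w: np.ndarray, eps: float = 1e-5):
    """Quantize a weight matrix to {-1, 0, 1} as in BitNet b1.58.

    Scale by the mean absolute value (gamma), then round and clip to
    the ternary grid. Returns the ternary matrix plus the scale needed
    to approximately reconstruct the original weights.
    """
    scale = np.abs(w).mean() + eps                   # gamma: absmean of the matrix
    w_ternary = np.clip(np.round(w / scale), -1, 1)
    return w_ternary.astype(np.int8), scale

w = np.random.randn(4, 4) * 0.02
wq, s = absmean_ternary_quantize(w)
print(wq)       # entries are only -1, 0, or 1
print(wq * s)   # coarse reconstruction of w
```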
OLMo model
NousResearch/OLMo-Bitnet-1B · Hugging Face
A 1B-parameter OLMo model trained by NousResearch with the 1.58-bit BitNet method on the first 60B tokens of Dolma, as a research proof of concept.
https://huggingface.co/NousResearch/OLMo-Bitnet-1B
OLMo-Bitnet
Training-run report for OLMo-Bitnet-1B, made by Jeffrey Quesnelle using W&B.
https://wandb.ai/emozilla/olmo/reports/OLMo-Bitnet--Vmlldzo3MzQ4NjQw?accessToken=t17hxb3r7iedofrmfpog1d3dpbmspp1gz6702uz8gyvokmeovznz5g0ndkhbb6r6

Fine-tune (74 MB file, 198 tokens/s on a single CPU core)
nisten on Twitter / X
nisten (@nisten), July 31, 2024: "hacked bitnet for finetuning, ended up with a 74mb file. It talks fine at 198 tokens per second on just 1 cpu core. Basically witchcraft. Open-sourcing later via @skunkworks_ai; base here: https://t.co/n4iddFDMSl"
https://x.com/nisten/status/1818529201231688139?t=a2_oszg66OrDGlwweQS1iQ&s=19
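Such small files follow from how densely ternary values pack: 3^5 = 243 ≤ 256, so five weights fit in one byte, about 1.6 bits per weight. A toy base-3 packing sketch (nisten's actual file format is not specified in the thread, so this is only an illustration):

```python
import numpy as np

POWERS = np.array([1, 3, 9, 27, 81], dtype=np.uint8)  # base-3 place values

def pack_ternary(trits: np.ndarray) -> np.ndarray:
    """Pack values in {-1, 0, 1} five-per-byte (3^5 = 243 <= 256)."""
    t = (trits + 1).astype(np.uint8)                  # map {-1,0,1} -> {0,1,2}
    t = np.concatenate([t, np.zeros((-len(t)) % 5, np.uint8)])  # pad to /5
    return (t.reshape(-1, 5) * POWERS).sum(axis=1).astype(np.uint8)

def unpack_ternary(packed: np.ndarray, n: int) -> np.ndarray:
    """Invert pack_ternary, recovering n values in {-1, 0, 1}."""
    vals = packed.astype(np.int32)
    digits = []
    for _ in range(5):                                # peel off base-3 digits
        digits.append(vals % 3)
        vals //= 3
    return np.stack(digits, axis=1).reshape(-1)[:n].astype(np.int8) - 1

w = np.random.choice(np.array([-1, 0, 1], np.int8), size=1000)
packed = pack_ternary(w)
assert np.array_equal(unpack_ternary(packed, len(w)), w)
print(f"{len(w)} weights in {len(packed)} bytes")     # ~1.6 bits per weight
```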
Fine-tuning LLMs to 1.58bit: extreme quantization made easy
A Hugging Face walkthrough of fine-tuning Llama 3 8B down to ternary BitNet weights, covering the quantization-aware training tricks needed to make it converge.
https://huggingface.co/blog/1_58_llm_extreme_quantization
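The core mechanic of such fine-tuning is quantization-aware training: full-precision latent weights are kept and re-quantized on every forward pass, with a straight-through estimator (STE) letting gradients flow through the rounding. A minimal PyTorch sketch of a BitNet-style BitLinear layer, simplified from what the blog describes (it omits, e.g., the normalization the BitNet papers apply before activation quantization):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BitLinear(nn.Linear):
    """Linear layer with ternary weights and 8-bit activations (BitNet-style).

    Training-time sketch: full-precision "shadow" weights are quantized on
    the fly; the straight-through estimator passes gradients through the
    non-differentiable rounding.
    """

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Absmean weight quantization to {-1, 0, 1}, then rescale
        w = self.weight
        w_scale = w.abs().mean().clamp(min=1e-5)
        w_q = (w / w_scale).round().clamp(-1, 1) * w_scale
        w_q = w + (w_q - w).detach()  # straight-through estimator

        # Absmax per-token activation quantization to the int8 range
        x_scale = 127.0 / x.abs().max(dim=-1, keepdim=True).values.clamp(min=1e-5)
        x_q = (x * x_scale).round().clamp(-128, 127) / x_scale
        x_q = x + (x_q - x).detach()  # straight-through estimator

        return F.linear(x_q, w_q, self.bias)
```

Swapping nn.Linear for such a layer in the attention and MLP blocks and training as usual is the basic recipe; at export time only the ternary weights and their scales need to be stored.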
DeepSeek R1
Run DeepSeek-R1 Dynamic 1.58-bit
DeepSeek-R1 is an open-source reasoning model that performs on par with OpenAI's o1. Run the 1.58-bit dynamic GGUF version by Unsloth.
https://unsloth.ai/blog/deepseekr1-dynamic
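"Dynamic" here means the bit width is not uniform across the model: quantization-sensitive tensors (e.g., attention and down-projection matrices, embeddings) are kept at higher precision while the bulk of the MoE expert weights go to 1.58 bits. A toy per-tensor policy in that spirit (layer names and bit choices are illustrative, not Unsloth's exact recipe):

```python
def choose_bits(layer_name: str) -> float:
    """Illustrative per-tensor bit assignment for a 'dynamic' quant:
    keep sensitive tensors at higher precision, push the bulk to 1.58 bit."""
    if "embed" in layer_name or "lm_head" in layer_name:
        return 8.0
    if "attn" in layer_name or "down_proj" in layer_name:
        return 4.0
    return 1.58  # MoE expert weights, i.e. most of the parameters

for name in ["model.embed_tokens", "layers.0.attn.q_proj",
             "layers.0.mlp.experts.7.gate_proj"]:
    print(name, "->", choose_bits(name), "bits")
```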


Seonglae Cho