- GPTQ for int4
- bitsandbytes for int8, fp4, nf4
not available for
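As a rough illustration of what the int8 entry in the list above means, here is a toy absmax round-to-nearest quantizer in plain Python. It is only a sketch of the general idea. It is not how bitsandbytes or GPTQ are implemented, and the function names `quantize_absmax_int8` and `dequantize` are made up for this note.

```python
def quantize_absmax_int8(values):
    """Map floats to integer codes in [-127, 127] using one absmax scale.

    Toy illustration only; real libraries use more elaborate schemes.
    """
    absmax = max(abs(v) for v in values)
    # Avoid dividing by zero when every value is 0.
    scale = absmax / 127 if absmax > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


weights = [0.5, -1.0, 0.25, 0.0]
codes, scale = quantize_absmax_int8(weights)
restored = dequantize(codes, scale)

# Each restored value is within half a quantization step of the original.
for original, approx in zip(weights, restored):
    print(f"{original:+.4f} -> {approx:+.4f}")
```

The stored integers take one byte each instead of two or four, at the cost of a rounding error of at most half a step (`scale / 2`) per weight.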
git clone https://github.com/Lightning-AI/lit-gpt
cd lit-gpt
# pip install pytorch
pip install --index-url https://download.pytorch.org/whl/nightly/cu118 --pre 'torch>=2.1.0dev'
pip install -r requirements.txt
pip uninstall -y lightning; pip install -r requirements.txt
pip install sentencepiece
Download the model (annoying):
python scripts/download.py --repo_id openlm-research/open_llama_3b
python scripts/convert_hf_checkpoint.py --checkpoint_dir checkpoints/openlm-research/open_llama_3b
python quantize/gptq.py --checkpoint_dir checkpoints/openlm-research/open_llama_3b
python generate/base.py --prompt "Hello, my name is" --checkpoint_dir checkpoints/openlm-research/open_llama_3b --quantize gptq.int4
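To avoid retyping the checkpoint path, the four steps above can be driven from a small script. This is a sketch that assumes it is run from the lit-gpt repo root. The helper names `build_commands` and `run_all` are hypothetical, and the script paths and flags are copied from the commands above.

```python
import shlex
import subprocess

REPO_ID = "openlm-research/open_llama_3b"


def build_commands(repo_id=REPO_ID, prompt="Hello, my name is"):
    """Return the download, convert, quantize, and generate commands in order."""
    checkpoint_dir = f"checkpoints/{repo_id}"
    return [
        ["python", "scripts/download.py", "--repo_id", repo_id],
        ["python", "scripts/convert_hf_checkpoint.py",
         "--checkpoint_dir", checkpoint_dir],
        ["python", "quantize/gptq.py", "--checkpoint_dir", checkpoint_dir],
        ["python", "generate/base.py", "--prompt", prompt,
         "--checkpoint_dir", checkpoint_dir, "--quantize", "gptq.int4"],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True)


# Dry run by default: only prints the commands.
run_all(build_commands(), dry_run=True)
```

Call `run_all(build_commands(), dry_run=False)` to actually execute the steps.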
mac
export PYTORCH_ENABLE_MPS_FALLBACK=1
--precision 16-true