The main goal of llama.cpp is to run the LLaMA model using 4-bit integer quantization on a MacBook
- chat.py
- test_inference.py
- convert.py
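To make the 4-bit idea concrete, here is a minimal Python sketch of symmetric block-wise 4-bit integer quantization. It illustrates the general scale-and-round scheme behind llama.cpp's 4-bit formats; the function names and the block contents are illustrative, and this is not llama.cpp's exact Q4_0 bit layout.

```python
def quantize_q4(block, qmax=7):
    """Symmetric 4-bit quantization of one weight block.

    Sketch of the general idea only, not llama.cpp's exact Q4_0 format.
    """
    scale = max(abs(x) for x in block) / qmax
    if scale == 0.0:
        return [0] * len(block), 0.0
    # each weight becomes a signed 4-bit integer in [-8, 7]
    q = [max(-8, min(7, round(x / scale))) for x in block]
    return q, scale

def dequantize_q4(q, scale):
    # recover approximate float weights from the 4-bit codes
    return [v * scale for v in q]

weights = [0.12, -0.58, 0.33, 0.91, -0.27, 0.05, -0.74, 0.46]
q, s = quantize_q4(weights)
recon = dequantize_q4(q, s)
max_err = max(abs(a - b) for a, b in zip(weights, recon))
# rounding error is bounded by half a quantization step
assert max_err <= s / 2 + 1e-9
```

Each float weight is replaced by a 4-bit integer plus one shared scale per block, which is what cuts memory enough to fit a LLaMA model on a MacBook.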
Seonglae Cho