mirror of
https://github.com/ggerganov/llama.cpp.git
synced 2024-12-24 13:28:50 +01:00
Update README.md (#3289)
* Update README.md * Update README.md Co-authored-by: slaren <slarengh@gmail.com> --------- Co-authored-by: slaren <slarengh@gmail.com>
This commit is contained in:
parent
36b904e200
commit
bc9d3e3971
@ -557,6 +557,10 @@ python3 convert.py models/7B/
|
||||
# quantize the model to 4-bits (using q4_0 method)
|
||||
./quantize ./models/7B/ggml-model-f16.gguf ./models/7B/ggml-model-q4_0.gguf q4_0
|
||||
|
||||
# update the gguf filetype to current if older version is unsupported by another application
|
||||
./quantize ./models/7B/ggml-model-q4_0.gguf ./models/7B/ggml-model-q4_0-v2.gguf COPY
|
||||
|
||||
|
||||
# run the inference
|
||||
./main -m ./models/7B/ggml-model-q4_0.gguf -n 128
|
||||
```
|
||||
|
Loading…
Reference in New Issue
Block a user