Mirror of https://github.com/ggerganov/llama.cpp.git (synced 2025-01-31 06:03:11 +01:00)
readme : update Metal instructions
commit ac4038aab1
parent 23360b15b6
README.md: 26 changed lines (+4 −22)
@@ -280,29 +280,11 @@ In order to build llama.cpp you have three different options.
 
 ### Metal Build
 
-Using Metal allows the computation to be executed on the GPU for Apple devices:
+On MacOS, Metal is enabled by default. Using Metal makes the computation run on the GPU.
 
-- Using `make`:
+To disable the Metal build at compile time use the `LLAMA_NO_METAL=1` flag or the `LLAMA_METAL=OFF` cmake option.
 
-  ```bash
-  LLAMA_METAL=1 make
-  ```
-
-- Using `CMake`:
-
-  ```bash
-  mkdir build-metal
-  cd build-metal
-  cmake -DLLAMA_METAL=ON ..
-  cmake --build . --config Release
-  ```
-
-When built with Metal support, you can enable GPU inference with the `--gpu-layers|-ngl` command-line argument.
-Any value larger than 0 will offload the computation to the GPU. For example:
-
-```bash
-./main -m ./models/7B/ggml-model-q4_0.gguf -n 128 -ngl 1
-```
+When built with Metal support, you can explicitly disable GPU inference with the `--gpu-layers|-ngl 0` command-line
+argument.
 
 ### MPI Build
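The added lines name two ways to turn Metal off at compile time. A minimal sketch of both, using only the `LLAMA_NO_METAL=1` flag and `LLAMA_METAL=OFF` option from the diff above; the build-directory layout mirrors the removed CMake steps:

```bash
# Build without Metal via make, using the flag from the updated README
LLAMA_NO_METAL=1 make

# Or configure a CMake build with Metal turned off
mkdir build
cd build
cmake -DLLAMA_METAL=OFF ..
cmake --build . --config Release
```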
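Conversely, a Metal-enabled build can still be forced to run entirely on the CPU at runtime. A sketch reusing the example invocation from the removed lines, with the `-ngl 0` value described in the added text (the model path is illustrative):

```bash
# Offload 0 layers to the GPU, so inference stays on the CPU
./main -m ./models/7B/ggml-model-q4_0.gguf -n 128 -ngl 0
```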