llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-01-10 04:20:24 +01:00

Kerfuffle 1b78ed2081

Only show -ngl option when relevant + other doc/arg handling updates (#1625)

1. Add a `LLAMA_SUPPORTS_GPU_OFFLOAD` define to `llama.h` (defined when compiled with CLBlast or cuBLAS)
2. Update the argument handling in the common example code to only show the `-ngl`, `--n-gpu-layers` option when GPU offload is possible.
3. Add an entry for the `-ngl`, `--n-gpu-layers` option to the `main` and `server` examples documentation
4. Update `main` and `server` examples documentation to use the new style dash separator argument format
5. Update the `server` example to use dash separators for its arguments and add `-ngl` to its `--help` output (only shown when compiled with appropriate support). The `server` example still accepts `--memory_f32` and `--ctx_size` for compatibility.
6. Add a warning discouraging use of `--memory-f32` to the `--help` text and documentation of the `main` and `server` examples. Rationale: https://github.com/ggerganov/llama.cpp/discussions/1593#discussioncomment-6004356
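
The pattern described in items 1, 2 and 5 can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `usage_text` and `normalize_flag` are hypothetical helper names, the help strings are paraphrased, and the macro is left undefined here to simulate a build without CLBlast or cuBLAS.

```cpp
#include <string>

// Assumption: in the real header this macro is defined by the build when
// CLBlast or cuBLAS support is compiled in. Uncomment to simulate that case.
// #define LLAMA_SUPPORTS_GPU_OFFLOAD

// Hypothetical helper: builds the option list printed by --help.
// The -ngl entry is compiled in only when GPU offload is available.
std::string usage_text() {
    std::string out;
    out += "  -c N, --ctx-size N        size of the prompt context\n";
    out += "  --memory-f32              use f32 instead of f16 for memory key+value (not recommended)\n";
#ifdef LLAMA_SUPPORTS_GPU_OFFLOAD
    out += "  -ngl N, --n-gpu-layers N  number of layers to store in VRAM\n";
#endif
    return out;
}

// Hypothetical helper: maps legacy underscore spellings such as
// --memory_f32 to the dash-separated form, leaving short options untouched.
std::string normalize_flag(std::string arg) {
    if (arg.rfind("--", 0) == 0) {       // only rewrite long options
        for (char & c : arg) {
            if (c == '_') {
                c = '-';
            }
        }
    }
    return arg;
}
```

Normalizing the flag name before matching is one way to keep old spellings working without listing every option twice; the actual commit may handle compatibility differently.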

2023-05-28 11:48:57 -06:00

baby-llama

ggml : implement backward pass for llama + small training-llama-from-scratch example (#1360)

2023-05-13 15:56:40 +03:00

benchmark

llama : add llama_init_backend() API (close #1527)

2023-05-20 11:06:37 +03:00

embedding

llama : add llama_init_backend() API (close #1527)

2023-05-20 11:06:37 +03:00

jeopardy

examples : add Jeopardy example (#1168)