From f50f534b0f5f9a721fe2e2049b111d35a6a7c2ae Mon Sep 17 00:00:00 2001
From: oobabooga <112222186+oobabooga@users.noreply.github.com>
Date: Fri, 18 Aug 2023 09:37:20 -0700
Subject: [PATCH] Add note about AMD/Metal to README

---
 README.md | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 5b1e95c3..e7954718 100644
--- a/README.md
+++ b/README.md
@@ -88,7 +88,19 @@
 cd text-generation-webui
 pip install -r requirements.txt
 ```
 
-#### Note about older NVIDIA GPUs
+#### llama.cpp on AMD, Metal, and some specific CPUs
+
+Precompiled wheels are included for CPU-only and NVIDIA GPUs (cuBLAS). For AMD, Metal, and some specific CPUs, you need to uninstall those wheels and compile llama-cpp-python yourself.
+
+To uninstall:
+
+```
+pip uninstall -y llama-cpp-python llama-cpp-python-cuda
+```
+
+To compile: https://github.com/abetlen/llama-cpp-python#installation-with-openblas--cublas--clblast--metal
+
+#### bitsandbytes on older NVIDIA GPUs
 
 bitsandbytes >= 0.39 may not work. In that case, to use `--load-in-8bit`, you may have to downgrade like this:
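
For reference, the compile step the patch links to typically looks like the following sketch. The `CMAKE_ARGS` flag names (`LLAMA_METAL`, `LLAMA_CLBLAST`) follow the llama-cpp-python README linked in the patch and may change between versions of that package; check the linked instructions for the current flags before running.

```shell
# Sketch of compiling llama-cpp-python from source after uninstalling the
# precompiled wheels, per the linked llama-cpp-python installation docs.
# Flag names are taken from those docs and may differ in newer releases.

# On macOS with Apple Silicon (Metal):
CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir

# On AMD GPUs via CLBlast (requires CLBlast and OpenCL installed):
# CMAKE_ARGS="-DLLAMA_CLBLAST=on" FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir
```

`FORCE_CMAKE=1` forces a source build instead of reusing a cached wheel, and `--no-cache-dir` keeps pip from restoring a previously downloaded precompiled wheel.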