diff --git a/Feature-matrix.md b/Feature-matrix.md
new file mode 100644
index 0000000..7e6eb66
--- /dev/null
+++ b/Feature-matrix.md
@@ -0,0 +1,11 @@
+# llama.cpp feature matrix
+
+| | **CPU (AVX2)** | **CPU (ARM NEON)** | **Metal** | **cuBLAS** | **rocBLAS** | **SYCL** | **CLBlast** | **Vulkan** | **Kompute** |
+|:--------------------:|:--------------:|:------------------:|:---------:|:----------:|:----------------:|:--------:|:-----------:|:----------:|:-----------:|
+| **K-quants** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚫 |
+| **I-quants** | ✅ (SLOW) | ✅ (SLOW) | ✅ (SLOW) | ✅ | ✅ | Partial¹ | 🚫 | 🚫 | 🚫 |
+| **Multi-GPU** | N/A | N/A | N/A | ✅ | ❓ | 🚫 | ❓ | ✅ | ❓ |
+| **K cache quants** | ✅ | ❓ | ❓ | ✅ | Only q8_0 (SLOW) | ❓ | ✅ | 🚫 | 🚫 |
+| **MoE architecture** | ✅ | ❓ | ✅ | ✅ | ✅ | ❓ | Only -ngl 0 | 🚫 | 🚫 |
+
+* ¹: IQ3_S and IQ1_S, see #5886
\ No newline at end of file