Updated Feature matrix (markdown)

2024-11-24 09:06:52 +01:00 · 2024-03-21 17:22:48 +00:00 · 3532deb5dc
commit 3532deb5dc
parent e4f77876f9
1 changed files with 4 additions and 3 deletions
--- a/Feature-matrix.md
+++ b/Feature-matrix.md
@@ -3,7 +3,7 @@
 | **K-quants**         | ✅              | ✅                  | ✅         | ✅          | ✅                | ✅        | ✅ 🐢⁵          | ✅ 🐢⁵         | 🚫           |
 | **I-quants**         | ✅ 🐢⁴       | ✅ 🐢⁴           | ✅ 🐢⁴ | ✅          | ✅                | Partial¹        | 🚫           | 🚫          | 🚫           |
 | **Multi-GPU**        | N/A            | N/A                | N/A       | ✅          | ❓                | 🚫        | ❓           | ✅          | ❓           |
-|  **K cache quants**  | ✅              | ❓                  | ❓         | ✅          | Partial³ 🐢 | ❓        | ✅           | 🚫          | 🚫           |
+|  **K cache quants**  | ✅              | ❓                  | ❓         | ✅ 🐢³          | Partial⁶ 🐢³ | ❓        | ✅           | 🚫          | 🚫           |
 | **MoE architecture** | ✅              | ❓                  | ✅         | ✅          | ✅                | ❓        | Partial² | 🚫          | 🚫           |
 * ✅: feature works
@@ -12,6 +12,7 @@
 * 🐢: feature is slow
 * ¹: IQ3_S and IQ1_S, see #5886
 * ²: Only with `-ngl 0`
-* ³: Only `-ctk q8_0`, inference is 50% slower
+* ³: Inference is 50% slower
 * ⁴: Slower than K-quants of comparable size
 * ⁵: Slower than cuBLAS/rocBLAS on similar cards
+* ⁶: Only q8_0 and iq4_nl