llama.cpp/ggml
Francis Couture-Harpin fb43d5e8b5 ggml-cuda : cleanup TQ2_0
This also removes the custom TQ2_0 mmq dp4a,
because reusing the one from Q8_0 avoids
repeatedly unpacking the 2-bit values to 8-bit;
the unpacking is instead done only once per tile.
2025-01-09 12:16:02 -05:00
include tts : add OuteTTS support (#10784) 2024-12-18 19:27:21 +02:00
src ggml-cuda : cleanup TQ2_0 2025-01-09 12:16:02 -05:00
.gitignore vulkan : cmake integration (#8119) 2024-07-13 18:12:39 +02:00
CMakeLists.txt ggml : fix arm build (#10890) 2024-12-18 23:21:42 +01:00