llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-01-27 12:33:06 +01:00

Francis Couture-Harpin fb43d5e8b5 ggml-cuda : cleanup TQ2_0. This also removes the custom TQ2_0 mmq dp4a, because reusing the one from Q8_0 avoids repeatedly unpacking the 2-bit values to 8-bit; they are instead unpacked only once per tile.	2025-01-09 12:16:02 -05:00
include	tts : add OuteTTS support (#10784)	2024-12-18 19:27:21 +02:00
src	ggml-cuda : cleanup TQ2_0	2025-01-09 12:16:02 -05:00
.gitignore	vulkan : cmake integration (#8119)	2024-07-13 18:12:39 +02:00
CMakeLists.txt	ggml : fix arm build (#10890)	2024-12-18 23:21:42 +01:00