Default Branch

c05e8c9934 · gguf-py: fixed local detection of gguf package (#11180) · Updated 2025-01-11 10:42:31 +01:00

Branches

fbddb26250 · ggml-cuda : use i and j instead of i0 and i in vec_dot_tq2_0_q8_1 · Updated 2025-01-12 03:06:49 +01:00

7 behind · 7 ahead

e7564c5023 · cmake : enable -Wshadow for C++ code [no ci] · Updated 2025-01-11 16:52:45 +01:00

0 behind · 1 ahead

6540935bca · vocab : llama_vocab_add_[be]os -> llama_vocab_get_add_[be]os (#11174) · Updated 2025-01-11 16:52:08 +01:00

4 behind · 8 ahead

b6f9640157 · contrib : add TODO for preprocessor directives [no ci] · Updated 2025-01-11 16:07:45 +01:00

2 behind · 9 ahead

c9d1eb3a06 · Added the ability to use guide tokens for OuteTTS, greatly improving TTS recitation accuracy over long input sequences. · Updated 2025-01-11 09:55:31 +01:00

1 behind · 1 ahead

15fbcb5df7 · wip: add cencellable request · Updated 2025-01-10 15:23:13 +01:00

2 behind · 1 ahead

9605c5fb28 · cmake : remove explicit _XOPEN_SOURCE · Updated 2025-01-06 12:02:48 +01:00

52 behind · 2 ahead

747c85d460 · llama : remove notion of CLS token · Updated 2025-01-06 09:58:01 +01:00

38 behind · 1 ahead

aa014d7e89 · Use mutex instead of atomics for vk_instance counters · Updated 2024-12-30 06:14:58 +01:00

65 behind · 2 ahead

a362c74aa2 · profiler: initial support for profiling graph ops · Updated 2024-12-27 00:59:37 +01:00

69 behind · 1 ahead

fe9235d795 · Force max subgroup size for coopmat shaders · Updated 2024-12-18 08:26:27 +01:00

111 behind · 1 ahead

4fbb801a9d · ggml : update ggml_backend_cpu_device_supports_op · Updated 2024-12-17 17:09:02 +01:00

121 behind · 3 ahead

3e92f4ecbe · cont [no ci] · Updated 2024-12-15 11:36:03 +01:00

133 behind · 2 ahead

7e9208e408 · scripts : change build path to "build-bench" for compare-commits.sh · Updated 2024-12-15 10:47:30 +01:00

133 behind · 1 ahead

fb18934a97 · gguf-py : bump version to 0.11.0 · Updated 2024-12-11 22:13:31 +01:00

154 behind · 0 ahead · Included

4f3a7e279b · Force max subgroup size for coopmat shaders · Updated 2024-12-10 21:27:04 +01:00

162 behind · 2 ahead

b8d1b1a5e1 · server : fix infill prompt format · Updated 2024-12-08 21:12:11 +01:00

172 behind · 1 ahead

a6648b9df7 · server : chunked prefill support · Updated 2024-12-08 08:48:18 +01:00

176 behind · 1 ahead

a8046c888a · use calloc instead of malloc · Updated 2024-12-04 17:24:35 +01:00

207 behind · 3 ahead

81611bef72 · server : add tests · Updated 2024-12-04 12:11:26 +01:00

207 behind · 3 ahead