Default Branch

32d6ee6385 · ggml : fix const usage in SSE path (#10962) · Updated 2024-12-23 20:25:52 +01:00

Branches

bb0b2c4f56 · llama : context · Updated 2024-12-23 20:05:54 +01:00

3 behind · 18 ahead

2c22d1f63f · ggml : fix arm enabled features check · Updated 2024-12-23 18:35:53 +01:00

1 behind · 1 ahead

08cdb66490 · ggml : use wstring for backend search paths · Updated 2024-12-23 18:12:55 +01:00

1 behind · 1 ahead

9d5c711587 · llama : the WPM vocabs use the CLS token as BOS · Updated 2024-12-21 09:22:04 +01:00

11 behind · 1 ahead

fe9235d795 · Force max subgroup size for coopmat shaders · Updated 2024-12-18 08:26:27 +01:00

34 behind · 1 ahead

4fbb801a9d · ggml : update ggml_backend_cpu_device_supports_op · Updated 2024-12-17 17:09:02 +01:00

44 behind · 3 ahead

3e92f4ecbe · cont [no ci] · Updated 2024-12-15 11:36:03 +01:00

56 behind · 2 ahead

7e9208e408 · scripts : change build path to "build-bench" for compare-commits.sh · Updated 2024-12-15 10:47:30 +01:00

56 behind · 1 ahead

fb18934a97 · gguf-py : bump version to 0.11.0 · Updated 2024-12-11 22:13:31 +01:00

77 behind · 0 ahead · Included

4f3a7e279b · Force max subgroup size for coopmat shaders · Updated 2024-12-10 21:27:04 +01:00

85 behind · 2 ahead

1bf38cffdf · server/bench: · Updated 2024-12-10 17:18:16 +01:00

90 behind · 1 ahead

b8d1b1a5e1 · server : fix infill prompt format · Updated 2024-12-08 21:12:11 +01:00

95 behind · 1 ahead

a6648b9df7 · server : chunked prefill support · Updated 2024-12-08 08:48:18 +01:00

99 behind · 1 ahead

a8046c888a · use calloc instead of malloc · Updated 2024-12-04 17:24:35 +01:00

130 behind · 3 ahead

81611bef72 · server : add tests · Updated 2024-12-04 12:11:26 +01:00

130 behind · 3 ahead

33d7b70c88 · server : do not speculate during prompt processing · Updated 2024-12-03 09:58:43 +01:00

143 behind · 1 ahead

335f48ae16 · Make sure Vulkan instance is destroyed properly on program exit · Updated 2024-11-29 08:42:00 +01:00

168 behind · 1 ahead

3c8a2a83fe · shmem experiments · Updated 2024-11-26 14:17:38 +01:00

206 behind · 3 ahead

dafedd33d2 · 4x4 -> 4x · Updated 2024-11-26 13:54:02 +01:00

206 behind · 2 ahead

bf3494345e · metal : some mul_mv experiments · Updated 2024-11-26 13:48:50 +01:00

206 behind · 1 ahead