Default Branch

d79d8f39b4 · vulkan: multi-row k quants (#10846) · Updated 2024-12-26 16:54:44 +01:00

Branches

2fcdf869cd · batched-bench : add mmq CLI arg · Updated 2023-10-11 18:42:33 +02:00

3028
7

ee7456926e · ggml-alloc : fix assert in debug builds · Updated 2023-10-09 14:33:12 +02:00

3037
1

ee268b5446 · llama : no longer perform uninitialized access to the KV cache · Updated 2023-10-08 10:49:38 +02:00

3044
5

acead654d2 · Merge branch 'master' into fix-refact · Updated 2023-10-08 10:25:16 +02:00

3044
4

6b9554a740 · metal : print more GPU info + disable mul_mm for MTLGPUFamiliy < Apple7 · Updated 2023-10-08 08:55:13 +02:00

3051
5

ba44776dc2 · bump version · Updated 2023-10-07 20:47:48 +02:00

3050
6

5ab6c2132a · server-parallel : add "--reverse-prompt" + compiler warning fixes · Updated 2023-10-06 13:32:19 +02:00

3063
4

5418932b71 · llama : fix comments for llama_kv_cache API · Updated 2023-10-03 20:01:52 +02:00

3088
5

c5650ed470 · server : avoid context swaps by shifting the KV cache · Updated 2023-09-28 18:03:36 +02:00

3112
57

72e7ef4e53 · simple : fixes · Updated 2023-09-26 23:19:36 +02:00

3138
48

784d14ed31 · llama : store non-RoPEd K cache (WIP) · Updated 2023-09-17 22:43:07 +02:00

3150
5

92a4f86879 · llama : make starcoder graph build more consistent with others · Updated 2023-09-15 16:57:10 +02:00

3160
20

e7e7b11455 · llama : remove experimental stuff · Updated 2023-09-14 21:52:01 +02:00

3172
3

2f689dee06 · metal : minor · Updated 2023-09-07 14:33:21 +02:00

3205
5

30ac7a4117 · gitignore : metal · Updated 2023-09-04 21:23:16 +02:00

3217
12

f3a84b2e0d · llama : better express the KV cache dependencies in the graph · Updated 2023-09-04 20:44:48 +02:00

3217
5

c79d130f74 · make : fix speculative build · Updated 2023-09-04 14:50:04 +02:00

3218
9

847896aba7 · speculative : add --draft CLI arg · Updated 2023-09-03 12:51:07 +02:00

3224
3

8c2b881281 · cuda : poc for norm quants (only -b 1 works) · Updated 2023-08-30 20:42:28 +02:00

3265
3

b4e70822f6 · metal : add poc for normalized Q4_0 and Q4_1 · Updated 2023-08-30 17:47:16 +02:00

3265
7