Default Branch

d79d8f39b4 · vulkan: multi-row k quants (#10846) · Updated 2024-12-26 16:54:44 +01:00

Branches

e1241d9b46 · metal : switch to execution barriers + fix one of the barriers · Updated 2023-12-13 12:56:45 +01:00

2773
47

fc5f334689 · readme : add API change notice · Updated 2023-12-07 11:35:02 +01:00

2775
15

af99c6fbfc · llama : remove memory_f16 and kv_f16 flags · Updated 2023-12-05 17:18:16 +01:00

2787
26

3cb1c348b3 · metal : try to improve batched decoding · Updated 2023-12-01 21:01:58 +01:00

2792
2

eb594c0f7d · alloc : fix build with debug · Updated 2023-12-01 09:46:05 +01:00

2816
14

5b74310e6e · build : enable libstdc++ assertions for debug builds · Updated 2023-12-01 00:18:24 +01:00

2801
1

bb39b87964 · ggml : restore abort() in GGML_ASSERT · Updated 2023-11-28 01:27:09 +01:00

2820
1

87f4102a70 · llama : revert n_threads_batch logic · Updated 2023-11-27 20:47:35 +01:00

2821
3

6272b6764a · use stride=128 if built for tensor cores · Updated 2023-11-27 19:09:14 +01:00

2824
3

8d8b76d469 · lookahead : add comments · Updated 2023-11-26 10:26:55 +01:00

2836
9

21b70babf7 · straightforward /v1/models endpoint · Updated 2023-11-24 17:22:39 +01:00

2837
12

f8e9f11428 · common : add -dkvc arg for enabling kv cache dumps · Updated 2023-11-23 17:47:56 +01:00

2843
4

f824902623 · YaRN : correction to GPT-NeoX implementation · Updated 2023-11-15 23:10:52 +01:00

2875
1

d0445a2eff · better documentation · Updated 2023-11-10 01:38:20 +01:00

2892
3

47d604fa2d · fix issues · Updated 2023-11-05 13:20:22 +01:00

2906
3

3ef358fffd · Revert "cuda : use CUDA memory pool with async memory allocation/deallocation when available (#3903)" · Updated 2023-11-04 21:26:51 +01:00

2910
2

46868a499e · metal : multi-simd softmax · Updated 2023-11-01 20:16:34 +01:00

2935
1

a8796f9609 · llm : cleanup + comments · Updated 2023-11-01 19:08:02 +01:00

2944
4

7420bef83e · wip wip wip · Updated 2023-11-01 07:51:43 +01:00

2944
1

afb3929279 · Merge branch 'master' into llama-refactor · Updated 2023-10-31 19:35:31 +01:00

2946
21