Default Branch

8f275a7c45 · ggml: Add POOL2D OP for GPU acceleration to the Vulkan backend in the MobileVLM model. (#9763) · Updated 2024-10-29 09:52:56 +01:00

Branches

bb39b87964 · ggml : restore abort() in GGML_ASSERT · Updated 2023-11-28 01:27:09 +01:00 · 2416 behind · 1 ahead

87f4102a70 · llama : revert n_threads_batch logic · Updated 2023-11-27 20:47:35 +01:00 · 2417 behind · 3 ahead

6272b6764a · use stride=128 if built for tensor cores · Updated 2023-11-27 19:09:14 +01:00 · 2420 behind · 3 ahead

8d8b76d469 · lookahead : add comments · Updated 2023-11-26 10:26:55 +01:00 · 2432 behind · 9 ahead

21b70babf7 · straightforward /v1/models endpoint · Updated 2023-11-24 17:22:39 +01:00 · 2433 behind · 12 ahead

f8e9f11428 · common : add -dkvc arg for enabling kv cache dumps · Updated 2023-11-23 17:47:56 +01:00 · 2439 behind · 4 ahead

f824902623 · YaRN : correction to GPT-NeoX implementation · Updated 2023-11-15 23:10:52 +01:00 · 2471 behind · 1 ahead

d0445a2eff · better documentation · Updated 2023-11-10 01:38:20 +01:00 · 2488 behind · 3 ahead

47d604fa2d · fix issues · Updated 2023-11-05 13:20:22 +01:00 · 2502 behind · 3 ahead

3ef358fffd · Revert "cuda : use CUDA memory pool with async memory allocation/deallocation when available (#3903)" · Updated 2023-11-04 21:26:51 +01:00 · 2506 behind · 2 ahead

46868a499e · metal : multi-simd softmax · Updated 2023-11-01 20:16:34 +01:00 · 2531 behind · 1 ahead

a8796f9609 · llm : cleanup + comments · Updated 2023-11-01 19:08:02 +01:00 · 2540 behind · 4 ahead

7420bef83e · wip wip wip · Updated 2023-11-01 07:51:43 +01:00 · 2540 behind · 1 ahead

afb3929279 · Merge branch 'master' into llama-refactor · Updated 2023-10-31 19:35:31 +01:00 · 2542 behind · 21 ahead

29fe516913 · wip · Updated 2023-10-31 17:36:37 +01:00 · 2543 behind · 1 ahead

dab42893c9 · scripts : working curl pipe · Updated 2023-10-31 16:03:56 +01:00 · 2543 behind · 3 ahead

7923b70cb8 · llama : add llm_build_inp_embd helper · Updated 2023-10-31 15:43:08 +01:00 · 2548 behind · 37 ahead

4b3cb98d46 · ggml-impl : move extern "C" to start of file · Updated 2023-10-30 18:05:58 +01:00 · 2544 behind · 7 ahead
lto · bc28aaa8c2 · make : use -lfto=auto to avoid warnings and maintain perf · Updated 2023-10-30 15:00:53 +01:00 · 2544 behind · 5 ahead

15267192c0 · llama : refactor tensor offloading as callback · Updated 2023-10-29 12:04:36 +01:00 · 2548 behind · 15 ahead
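The two counts shown for each branch are its divergence from the default branch (commits behind, commits ahead), as branch-listing pages typically compute them. A minimal sketch of how such numbers are produced, using a throwaway repository rather than this one (branch name `feature` and the commit counts are illustrative, not taken from the listing above):

```shell
# Build a disposable repo where the branch point is known: after branching,
# master gains 3 commits and feature gains 1, so feature is 3 behind / 1 ahead.
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q -b master
git -c user.name=t -c user.email=t@example.com commit -q --allow-empty -m base
git branch feature
for i in 1 2 3; do
  git -c user.name=t -c user.email=t@example.com commit -q --allow-empty -m "master $i"
done
git -c user.name=t -c user.email=t@example.com commit -q --allow-empty -m "feature 1" \
  --no-verify >/dev/null 2>&1 || true
git checkout -q feature
git -c user.name=t -c user.email=t@example.com commit -q --allow-empty -m "feature work"
# --left-right --count prints "<left>\t<right>": commits only on master
# (behind) and commits only on feature (ahead).
git rev-list --left-right --count master...feature
```

The symmetric-difference syntax `master...feature` restricts `git rev-list` to commits reachable from exactly one side, and `--left-right --count` tallies each side separately, which is the behind/ahead pair a hosting UI displays.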