Default Branch

924518e2e5 · Reset color before we exit (#11205) · Updated 2025-01-12 19:23:10 +01:00

Branches

eb0bf32caf · server: tests: schedule slow dispatch only on release or on demand · Updated 2024-03-02 23:18:31 +01:00 · 2145 behind, 1 ahead

0b673ca187 · s/_MODEL_CLASSES/_model_classes/ · Updated 2024-03-02 18:14:37 +01:00 · 2158 behind, 3 ahead

d4dfc250cc · Fix ARM_NEON · Updated 2024-03-02 09:12:02 +01:00 · 2163 behind, 7 ahead

f8ab539190 · convert : update help string · Updated 2024-03-01 18:29:34 +01:00 · 2161 behind, 3 ahead

9862d59c05 · llama : change starcoder2 rope type · Updated 2024-03-01 14:10:31 +01:00 · 2170 behind, 8 ahead

f0cbb6ddf6 · iq1_s: turn off SIMD implementation for QK_K = 64 (it does not work) · Updated 2024-02-28 07:28:10 +01:00 · 2185 behind, 6 ahead

14d757066b · llama : add llama_kv_cache_compress (EXPERIMENTAL) · Updated 2024-02-27 15:24:40 +01:00 · 2186 behind, 1 ahead

608f449880 · swift : fix build · Updated 2024-02-23 18:02:09 +01:00 · 2217 behind, 4 ahead

56c047156a · py : minor fixes · Updated 2024-02-22 18:22:56 +01:00 · 2226 behind, 1 ahead

5271c75666 · llama : fix K-shift with quantized K (wip) · Updated 2024-02-22 00:28:42 +01:00 · 2234 behind, 1 ahead

f249c997a8 · llama : adapt to F16 KQ_pos · Updated 2024-02-19 12:31:02 +01:00 · 2272 behind, 62 ahead

412735ec70 · Merge branch 'master' into gg/metal-batched · Updated 2024-02-19 10:25:24 +01:00 · 2272 behind, 6 ahead

47c662b0de · fix some spaces added by IDE in math op · Updated 2024-02-18 21:40:35 +01:00 · 2282 behind, 4 ahead

974e3cadff · ggml : try another fix · Updated 2024-02-17 17:14:35 +01:00 · 2301 behind, 2 ahead

e856bfed3b · hf : add support for --repo and --file · Updated 2024-02-15 14:05:15 +01:00 · 2315 behind, 3 ahead

ccd757a174 · convert : fix mistakes from refactoring · Updated 2024-02-13 18:01:30 +01:00 · 2323 behind, 4 ahead

5c977221d2 · iq1_s: slightly faster dot product · Updated 2024-02-13 14:18:27 +01:00 · 2329 behind, 15 ahead

4246b71ad7 · Fix compiler warnings (shadow variable) · Updated 2024-02-13 07:44:56 +01:00 · 2332 behind, 1 ahead

7286b83d3f · BERT WIP · Updated 2024-02-06 23:10:11 +01:00 · 2385 behind, 1 ahead

adcf16fd68 · py : fix empty bytes arg · Updated 2024-02-05 18:53:07 +01:00 · 2395 behind, 2 ahead