Default Branch

9ba399dfa7 · server : add support for "encoding_format": "base64" to the */embeddings endpoints (#10967) · Updated 2024-12-24 21:33:04 +01:00

Branches

Each entry lists: commit · message · last updated · commits behind, commits ahead of the default branch.

1ad42b1f1e · ggml : ggml_soft_max uses F16 mask · Updated 2024-01-31 19:33:59 +01:00 · 2355 behind, 36 ahead
719a087138 · iq3_xxs: forgotten update of the grid points · Updated 2024-01-30 17:39:07 +01:00 · 2369 behind, 1 ahead
2bf91c5306 · metal : clean up · Updated 2024-01-25 12:29:45 +01:00 · 2465 behind, 23 ahead
6ccbd1777a · wip · Updated 2024-01-24 14:45:04 +01:00 · 2465 behind, 18 ahead
da23b56f25 · wip : no ic 8 step · Updated 2024-01-24 12:25:34 +01:00 · 2465 behind, 18 ahead
06c2d0d117 · wip · Updated 2024-01-23 21:42:43 +01:00 · 2465 behind, 14 ahead
a9681febd6 · ggml : online attention (CPU) · Updated 2024-01-20 15:45:41 +01:00 · 2465 behind, 4 ahead
32a392fe68 · try a differerent fix · Updated 2024-01-19 23:10:23 +01:00 · 2466 behind, 2 ahead
4a3bc1522e · py : linting with mypy and isort · Updated 2024-01-19 21:18:58 +01:00 · 2467 behind, 3 ahead
1453215165 · kompute : fix ggml_add kernel · Updated 2024-01-18 23:09:16 +01:00 · 2583 behind, 105 ahead
ccc78a200e · hellaswag: speed up even more by parallelizing log-prob evaluation · Updated 2024-01-18 17:25:29 +01:00 · 2483 behind, 1 ahead
2917e6b528 · Merge branch 'master' into gg/imatrix-gpu-4931 · Updated 2024-01-17 17:43:45 +01:00 · 2490 behind, 10 ahead
23742deb5b · py : fix padded dummy tokens (I hope) · Updated 2024-01-17 14:44:22 +01:00 · 2509 behind, 4 ahead
9fd1e83f6d · Use Q4_K for attn_v for Q2_K_S when n_gqa >= 4 · Updated 2024-01-17 11:16:08 +01:00 · 2495 behind, 1 ahead
49bafe0986 · tests : avoid creating RNGs for each tensor · Updated 2024-01-17 09:40:55 +01:00 · 2498 behind, 6 ahead
bb9abb5cd8 · imatrix: guard Q4_0/Q5_0 against ffn_down craziness · Updated 2024-01-16 08:56:05 +01:00 · 2512 behind, 2 ahead
9998ecd191 · llama : add phixtral support (wip) · Updated 2024-01-13 13:24:07 +01:00 · 2542 behind, 1 ahead
1fb563ebdc · py : try to fix flake stuff · Updated 2024-01-13 12:42:35 +01:00 · 2543 behind, 2 ahead
9bfcb16fd3 · Add llama enum for IQ2_XS · Updated 2024-01-11 17:24:12 +01:00 · 2592 behind, 11 ahead
24096933b0 · server : try to fix infill when prompt is empty · Updated 2024-01-09 10:27:29 +01:00 · 2594 behind, 1 ahead