llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-01-11 21:10:24 +01:00

master

c05e8c9934 · gguf-py: fixed local detection of gguf package (#11180) · Updated 2025-01-11 10:42:31 +01:00

compilade/bitnet-ternary 75b3a09602 · test-backend-ops : add TQ1_0 and TQ2_0 comments for later · Updated 2024-09-04 21:00:21 +02:00	795 33
gg/llama-refactor-sampling f648ca2cee · llama : add llama_sampling API + move grammar in libllama · Updated 2024-09-03 09:31:54 +02:00	802 1
gg/llama-disambiguate 40fa68cb46 · readme : add API change notice · Updated 2024-09-02 17:32:24 +02:00	811 3
compilade/refactor-kv-cache 375de5b1f8 · llama : use unused n_embd_k_gqa in k_shift · Updated 2024-09-02 03:59:24 +02:00	811 41
gg/metal-fix-fa-2 a95225cdfd · metal : another fix for the fa kernel · Updated 2024-08-26 14:08:38 +02:00	835 1
gg/metal-fix-fa aa931d0375 · metal : fix fa kernel · Updated 2024-08-26 12:09:50 +02:00	835 1
sycl-onednn-convolution 6494509801 · backup · Updated 2024-08-26 10:58:54 +02:00	845 2
gg/remove-k-quants-per-iter ccb45186d0 · docs : remove references · Updated 2024-08-26 08:52:17 +02:00	839 2
compilade/batch-splits 8062650343 · llama : fix simple splits when the batch contains embeddings · Updated 2024-08-21 21:09:03 +02:00	850 19
sl/prepare-next-graph 9127800d83 · wip · Updated 2024-08-17 01:51:06 +02:00	883 2
gg/hf-test 62d7b6c87f · cuda : re-add q4_0 · Updated 2024-08-14 12:37:03 +02:00	879 3
compilade/fix-server-long-system-prompt 93ec58b932 · server : fix typo in comment · Updated 2024-08-14 04:12:26 +02:00	881 4
compilade/nul-str-token faaac59d16 · llama : support NUL bytes in tokens · Updated 2024-08-12 03:00:03 +02:00	892 1
compilade/gguf-py-dequant 73bc9350cd · gguf-py : Numpy dequantization for grid-based i-quants · Updated 2024-08-10 05:47:31 +02:00	912 2
compilade/faster-session-sizes 9329953a61 · llama : avoid double tensor copy when saving session to buffer · Updated 2024-08-07 22:03:34 +02:00	920 2
update_sycl_doc 7764ab911d · update guide · Updated 2024-08-07 16:01:02 +02:00	921 1
sl/dump-allocs cad8abb49b · add tool to allow plotting tensor allocation maps within buffers · Updated 2024-08-06 22:09:51 +02:00	929 1
prepare-PR-of-minicpm-v2.5-gg 6e299132e7 · clip : style changes · Updated 2024-08-06 10:44:29 +02:00	1253 56
fix_cmd_name 16dab13bde · correct cmd name · Updated 2024-08-05 18:15:33 +02:00	938 1
gg/replace-all bddcc5f985 · llama : better replace_all · Updated 2024-08-04 12:42:08 +02:00	954 1

Default Branch

Branches