llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-12-25 05:48:47 +01:00

master

9ba399dfa7 · server : add support for "encoding_format": "base64" to the */embeddings endpoints (#10967) · Updated 2024-12-24 21:33:04 +01:00

sl/detect-imatrix-nan d857e5192e · quantize : check imatrix for nan/inf values · Updated 2024-06-06 22:44:24 +02:00	1290 2
gg/http-threads 731e7528be · server : fix --threads-http arg · Updated 2024-06-06 15:37:12 +02:00	1291 1
sl/fix-docker-main-server-build f7d4b7c343 · build only main and server in their docker images · Updated 2024-06-06 00:13:01 +02:00	1298 2
sl/fix-docker-omp 3d2e79da7f · add openmp lib to dockerfiles · Updated 2024-06-06 00:05:25 +02:00	1298 1
gg/server-v1-completion 0085f94936 · server : add /v1/completion endpoint · Updated 2024-06-04 14:58:14 +02:00	1308 1
sl/rpc-backend-cpy 5f8720fb7b · add rpc-server to Makefile · Updated 2024-05-31 17:22:05 +02:00	1343 3
gg/server-update-js 956af1552a · server : update js · Updated 2024-05-31 14:47:19 +02:00	1334 1
gg/ci-loongson 77c16ee0d4 · tests : disable json test due to lack of python on the CI node · Updated 2024-05-31 13:16:54 +02:00	1347 3
sycl-global-variables d32a8f6142 · backup · Updated 2024-05-31 10:51:56 +02:00	1344 2
gg/cache-token-to-piece 8a8f8b953f · llama : print a log of the total cache size · Updated 2024-05-29 20:45:43 +02:00	1353 4
sl/cuda-fattn-par-test 1ca802a3e0 · parallelize fattn compilation test · Updated 2024-05-28 01:19:36 +02:00	1379 6
compilade/refactor-kv-cache-gg ddc59e8e0a · wipwipwiwpip · Updated 2024-05-27 11:04:09 +02:00	1401 17
fix_q_xxs_mul_mat 4b1770109c · Fix q_xxs using mul_mat_q · Updated 2024-05-27 10:46:37 +02:00	1384 1
gg/metal-disable-fa-256 1c6cde92bb · metal : disable FA kernel for HS=256 · Updated 2024-05-27 08:57:20 +02:00	1386 1
compilade/lazier-moe-convert-hf 11f78c6a2d · convert-hf : adapt ArcticModel to use yield too · Updated 2024-05-25 18:52:53 +02:00	1393 4
7507-main-intel-dockerfile dd14d818e0 · Update main-intel.Dockerfile base image to 2024.1.0 · Updated 2024-05-24 04:47:58 +02:00	1404 1
compilade/gguf-py-fix-q-shape c5fe1d6cdc · gguf-py : remove unused import · Updated 2024-05-23 06:09:49 +02:00	1419 2
sl/cuda-uma 518b75260b · cuda uma test · Updated 2024-05-23 03:13:48 +02:00	1419 1
sl/dio-test e9095e6098 · async direct io per tensor test · Updated 2024-05-22 01:08:52 +02:00	1438 3
gg/kv-determinism a041ced0fd · wip · Updated 2024-05-20 17:20:49 +02:00	1444 1
