llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-02-06 08:30:33 +01:00

Rémy Oudompheng 66ee4f297c vulkan: implement initial support for IQ2 and IQ3 quantizations (#11360)		2025-01-29 18:29:39 +01:00

* vulkan: initial support for IQ3_S
* vulkan: initial support for IQ3_XXS
* vulkan: initial support for IQ2_XXS
* vulkan: initial support for IQ2_XS
* vulkan: optimize Q3_K by removing branches
* vulkan: implement dequantize variants for coopmat2
* vulkan: initial support for IQ2_S
* vulkan: vertically realign code
* port failing dequant callbacks from mul_mm
* Fix array length mismatches
* vulkan: avoid using workgroup size before it is referenced
* tests: increase timeout for Vulkan llvmpipe backend

---------

Co-authored-by: Jeff Bolz <jbolz@nvidia.com>

bench.yml.disabled	ggml-backend : add device and backend reg interfaces (#9707)	2024-10-03 01:49:47 +02:00
build.yml	vulkan: implement initial support for IQ2 and IQ3 quantizations (#11360)	2025-01-29 18:29:39 +01:00
close-issue.yml	ci : fine-grant permission (#9710)	2024-10-04 11:47:19 +02:00
docker.yml	ci : fix build CPU arm64 (#11472)	2025-01-29 00:02:56 +01:00
editorconfig.yml	ci : pin dependency to specific version (#11137)	2025-01-08 12:07:20 +01:00
gguf-publish.yml	ci : update checkout, setup-python and upload-artifact to latest (#6456)	2024-04-03 21:01:13 +03:00
labeler.yml	labeler.yml: Use settings from ggerganov/llama.cpp [no ci] (#7363)	2024-05-19 20:51:03 +10:00
python-check-requirements.yml	py : fix requirements check '==' -> '~=' (#8982)	2024-08-12 11:02:01 +03:00
python-lint.yml	ci : add ubuntu cuda build, build with one arch on windows (#10456)	2024-11-26 13:05:07 +01:00
python-type-check.yml	ci : reduce severity of unused Pyright ignore comments (#9697)	2024-09-30 14:13:16 -04:00
server.yml	tests : increase timeout when sanitizers are enabled (#11300)	2025-01-19 20:22:30 +02:00