llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-01-26 12:21:40 +01:00

Georgi Gerganov 77a73403ca ggml : add new Q4_2 quantization (ARM only) (#1046)		2023-04-18 23:54:57 +03:00
* ggml : Q4_2 ARM
* ggml : add ggml_is_quantized()
* llama : update llama_type_name() with Q4_2 entry
* ggml : speed-up q4_2
  - 4 threads: ~100ms -> ~90ms
  - 8 threads: ~55ms -> ~50ms
* ggml : optimize q4_2 using vmlaq_n_f32 + vmulq_n_f32
benchmark	benchmark : fix result validation in benchmark-q4_0-matmult (#987)	2023-04-15 08:51:54 +03:00
embedding	examples: add missing <ctime> include for time() (#1011)	2023-04-16 10:13:00 +00:00
main	Add LoRA support (#820)	2023-04-17 17:28:55 +02:00
perplexity	Add LoRA support (#820)	2023-04-17 17:28:55 +02:00
quantize	ggml : add new Q4_2 quantization (ARM only) (#1046)	2023-04-18 23:54:57 +03:00
quantize-stats	quantize-stats : fix bug in --type argument	2023-04-17 17:31:06 +03:00
alpaca.sh	examples : add -n to alpaca and gpt4all scripts (#706)	2023-04-13 16:03:39 +03:00
chat-13B.bat	Create chat-13B.bat (#592)	2023-03-29 20:21:09 +03:00
chat-13B.sh	Move chat scripts into "./examples"	2023-03-25 20:37:09 +02:00
chat.sh	If n_predict == -1, generate forever	2023-03-25 21:51:41 +02:00
CMakeLists.txt	Add quantize-stats command for testing quantization (#728)	2023-04-08 00:09:18 +02:00
common.cpp	Add LoRA support (#820)	2023-04-17 17:28:55 +02:00
common.h	Add LoRA support (#820)	2023-04-17 17:28:55 +02:00
gpt4all.sh	examples : add -n to alpaca and gpt4all scripts (#706)	2023-04-13 16:03:39 +03:00
Miku.sh	Fix whitespace, add .editorconfig, add GitHub workflow (#883)	2023-04-11 19:45:44 +00:00
reason-act.sh	add example of re-act pattern (#583)	2023-03-29 10:10:24 -05:00