llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2025-01-06 02:48:57 +01:00


Georgi Gerganov 758ff1bbb5 llama : refactor model loading code (#2620)	2023-08-16 14:34:03 +03:00

* llama : style formatting + remove helper methods
* llama : fix quantization using gguf tool
* llama : simplify gguf_file_saver
* llama : fix method names
* llama : simplify write_header()
* llama : no need to pass full file loader to the file saver just gguf_ctx
* llama : gguf_file_saver write I32
* llama : refactor tensor names (#2622)
* gguf: update tensor names searched in quantization
* gguf : define tensor names as constants
* gguf : initial write API (not tested yet)
* gguf : write to file API (not tested)
* gguf : initial write API ready + example
* gguf : fix header write
* gguf : fixes + simplify example + add ggml_nbytes_pad()
* gguf : minor
* llama : replace gguf_file_saver with new gguf write API
* gguf : streaming support when writing files
* gguf : remove obsolete write methods
* gguf : remove obsolete gguf_get_arr_xxx API
* llama : simplify gguf_file_loader
* llama : move hparams and vocab from gguf_file_loader to llama_model_loader
* llama : merge gguf-util.h in llama.cpp
* llama : reorder definitions in .cpp to match .h
* llama : minor simplifications
* llama : refactor llama_model_loader (WIP)
  wip : remove ggml_ctx from llama_model_loader
  wip : merge gguf_file_loader in llama_model_loader
* llama : fix shape prints
* llama : fix Windows build + fix norm_rms_eps key
* llama : throw error on missing KV pairs in model meta data
* llama : improve printing + log meta data
* llama : switch print order of meta data

---------

Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com>
baby-llama	Add LLAMA_DEFAULT_RMS_EPS so we can change the default (#2384)	2023-07-25 18:35:53 +03:00
benchmark	cmake : install targets (#2256)	2023-07-19 10:01:11 +03:00
convert-llama2c-to-ggml	Adding support for llama2.c models (#2559)	2023-08-12 01:17:25 +02:00
embd-input	build : fix several cast and printf warnings (#2499)	2023-08-04 13:07:21 +03:00
embedding	llama : tokenizer fixes (#2549)	2023-08-14 19:30:28 +03:00
gguf	llama : refactor model loading code (#2620)	2023-08-16 14:34:03 +03:00
jeopardy	hooks : setting up flake8 and pre-commit hooks (#1681)	2023-06-17 13:32:48 +03:00
main	convert : update convert-new.py with tokenizer fixes (#2614)	2023-08-14 20:20:04 +03:00
metal	cmake : install targets (#2256)	2023-07-19 10:01:11 +03:00
perplexity	build : fix several cast and printf warnings (#2499)	2023-08-04 13:07:21 +03:00
quantize	cmake : install targets (#2256)	2023-07-19 10:01:11 +03:00
quantize-stats	llama : tokenizer fixes (#2549)	2023-08-14 19:30:28 +03:00
save-load-state	llama : tokenizer fixes (#2549)	2023-08-14 19:30:28 +03:00
server	server : add --numa support (#2524)	2023-08-14 16:36:42 +03:00
simple	llama : tokenizer fixes (#2549)	2023-08-14 19:30:28 +03:00
train-text-from-scratch	llama : tokenizer fixes (#2549)	2023-08-14 19:30:28 +03:00
alpaca.sh	alpaca.sh : update model file name (#2074)	2023-07-06 19:17:50 +03:00
chat-13B.bat	Create chat-13B.bat (#592)	2023-03-29 20:21:09 +03:00
chat-13B.sh	examples : read chat prompts from a template file (#1196)	2023-05-03 20:58:11 +03:00
chat-persistent.sh	chat-persistent.sh : use bracket expressions in grep (#1564)	2023-05-24 09:16:22 +03:00
chat-vicuna.sh	examples : add chat-vicuna.sh (#1854)	2023-06-15 21:05:53 +03:00
chat.sh	If n_predict == -1, generate forever	2023-03-25 21:51:41 +02:00
CMakeLists.txt	Adding support for llama2.c models (#2559)	2023-08-12 01:17:25 +02:00
common.cpp	llama : tokenizer fixes (#2549)	2023-08-14 19:30:28 +03:00
common.h	llama : tokenizer fixes (#2549)	2023-08-14 19:30:28 +03:00
console.cpp	Handle `ENABLE_VIRTUAL_TERMINAL_PROCESSING` more gracefully on earlier versions of Windows.	2023-08-10 13:11:36 -07:00
console.h	Add --simple-io option for subprocesses and break out console.h and cpp (#1558)	2023-08-04 08:20:12 -07:00
gpt4all.sh	examples : add -n to alpaca and gpt4all scripts (#706)	2023-04-13 16:03:39 +03:00
grammar-parser.cpp	build : fix several cast and printf warnings (#2499)	2023-08-04 13:07:21 +03:00
grammar-parser.h	llama : add grammar-based sampling (#1773)	2023-07-23 23:58:10 -04:00
json-schema-to-grammar.py	examples : generate JSON according to schema (#1887)	2023-08-02 22:05:44 -04:00
llama2-13b.sh	gitignore : changes for Poetry users + chat examples (#2284)	2023-07-21 13:53:27 +03:00
llama2.sh	gitignore : changes for Poetry users + chat examples (#2284)	2023-07-21 13:53:27 +03:00
llama.vim	vim : streaming and more (#2495)	2023-08-08 14:44:48 +03:00
llm.vim	llm.vim : multiline autocompletion, get rid of "^@" (#2543)	2023-08-08 15:07:02 +03:00
make-ggml.py	examples : add easy python script to create quantized (k-bit support) GGML models from local HF Transformer models (#2311)	2023-07-21 22:01:10 +03:00
Miku.sh	MIKU MAYHEM: Upgrading the Default Model for Maximum Fun 🎉 (#2287)	2023-07-21 11:13:18 +03:00
reason-act.sh	add example of re-act pattern (#583)	2023-03-29 10:10:24 -05:00
server-llama2-13B.sh	examples : fix whitespace	2023-07-28 21:05:08 +03:00