llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git synced 2024-12-31 00:10:52 +01:00

Author	SHA1	Message	Date
Thiago Padilha	3a0dcb3920	Implement server mode. This new mode works by first loading the model then listening for TCP connections on a port. When a connection is received, arguments will be parsed using a simple protocol: - First the number of arguments will be read followed by a newline character. - Then each argument will be read, separated by the 0 byte. - With this we build an argument vector, similar to what is passed to the program entry point. We pass this to gpt_params_parse. Finally `run` will be executed with the input/output streams connected to the socket. Signed-off-by: Thiago Padilha <thiago@padilha.cc>	2023-03-22 14:34:19 -03:00
Thiago Padilha	d7d53b84db	Add main.cpp back and invoke "run" from it Signed-off-by: Thiago Padilha <thiago@padilha.cc>	2023-03-22 14:31:41 -03:00
Thiago Padilha	90175ee13f	Move main.cpp to run.cpp Signed-off-by: Thiago Padilha <thiago@padilha.cc>	2023-03-22 14:31:35 -03:00
Erik Scholz	4122dffff9	cmake: make llama an actual library (#392)	2023-03-22 18:37:10 +02:00
Georgi Gerganov	f5a77a629b	Introduce C-style API (#370) * Major refactoring - introduce C-style API * Clean up * Add <cassert> * Add <iterator> * Add <algorithm> .... * Fix timing reporting and accumulation * Measure eval time only for single-token calls * Change llama_tokenize return meaning	2023-03-22 07:32:36 +02:00
Georgi Gerganov	eb34620aec	Add tokenizer test + revert to C++11 (#355) * Add test-tokenizer-0 to do a few tokenizations - feel free to expand * Added option to convert-pth-to-ggml.py script to dump just the vocabulary * Added ./models/ggml-vocab.bin containing just LLaMA vocab data (used for tests) * Added utility to load vocabulary file from previous point (temporary implementation) * Avoid using std::string_view and drop back to C++11 (hope I didn't break something) * Rename gpt_vocab -> llama_vocab * All CMake binaries go into ./bin/ now	2023-03-21 17:29:41 +02:00
nusu-github	8cf9f34edd	Adding missing features of CMakeLists.txt & Refactoring (#131) * Functionality addition CMakeLists.txt Refactoring: 1. Simplify more options that are negation of negation. LLAMA_NO_ACCELERATE -> LLAMA_ACCELERATE 2. Changed to an optional expression instead of forcing to enable AVX2 in MSVC. 3. Make CMAKE_CXX_STANDARD, which is different from Makefile, the same. 4. Use add_compile_options instead of adding options to CMAKE_C_FLAGS. 5. Make utils use target_link_libraries instead of directly referencing code. Added features: 1. Added some options. LLAMA_STATIC_LINK,LLAMA_NATIVE,LLAMA_LTO,LLAMA_GPROF,LLAMA_OPENBLAS * Fix Accelerate link in CMake * Windows build Fix * C++11 to C++17 * Reflects C/C++ standard individually * Change the version to 3.12 --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-03-21 01:37:16 +01:00
mmyjona	6b0df5ccf3	add ptread link to fix cmake build under linux (#114) * add ptread link to fix cmake build under linux * add cmake to linux and macos platform * separate make and cmake workflow --------- Co-authored-by: Sebastián A <sebastian.aedo29@gmail.com>	2023-03-17 13:38:24 -03:00
Georgi Gerganov	c09a9cfb06	CMake build in Release by default (#75)	2023-03-13 21:22:15 +02:00
Sebastián A	ed6849cc07	Initial support for CMake (#75)	2023-03-13 19:12:33 +02:00

10 Commits
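
Note on commit 3a0dcb3920 (server mode): the message describes a simple framing for passing arguments over the TCP connection: the argument count followed by a newline, then the arguments delimited by the 0 byte. The sketch below is one possible reading of that description, not code from the commit. The function names `encode_args` and `decode_args` are hypothetical. It also assumes that the count is sent as ASCII decimal and that every argument, including the last, is followed by a 0 byte; the message does not confirm either detail.

```python
def encode_args(args):
    """Frame an argument list as: ASCII count, newline, then each argument + 0 byte.

    Assumption: each argument is terminated (not merely separated) by a 0 byte.
    """
    encoded = [arg.encode("utf-8") for arg in args]
    for raw in encoded:
        if b"\0" in raw:
            # A 0 byte inside an argument would be indistinguishable from a delimiter.
            raise ValueError("arguments must not contain a 0 byte")
    header = str(len(encoded)).encode("ascii") + b"\n"
    return header + b"".join(raw + b"\0" for raw in encoded)


def decode_args(data):
    """Rebuild the argument vector from bytes produced by encode_args."""
    header, newline, body = data.partition(b"\n")
    if not newline:
        raise ValueError("missing newline after argument count")
    count = int(header)
    parts = body.split(b"\0")
    # With terminated arguments, splitting yields count items plus a trailing remainder.
    if len(parts) < count + 1:
        raise ValueError("fewer arguments than announced")
    return [raw.decode("utf-8") for raw in parts[:count]]


if __name__ == "__main__":
    frame = encode_args(["-m", "model.bin", "-p", "Hello"])
    print(frame)
    print(decode_args(frame))
```

A client would send such a frame right after connecting. Per the commit message, the server then passes the rebuilt argument vector to gpt_params_parse and executes `run` with its input/output streams connected to the socket.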