Commit Graph

10 Commits

Author SHA1 Message Date
Thiago Padilha
3a0dcb3920
Implement server mode.
This new mode works by first loading the model then listening for TCP
connections on a port. When a connection is received, arguments will be
parsed using a simple protocol:

- First the number of arguments will be read followed by a newline
  character.
- Then each argument will be read, separated by the 0 byte.
- With this we build an argument vector, similar to what is passed to
  the program entry point. We pass this to gpt_params_parse.

Finally `run` will be executed with the input/output streams connected
to the socket.

Signed-off-by: Thiago Padilha <thiago@padilha.cc>
2023-03-22 14:34:19 -03:00
Thiago Padilha
d7d53b84db
Add main.cpp back and invoke "run" from it
Signed-off-by: Thiago Padilha <thiago@padilha.cc>
2023-03-22 14:31:41 -03:00
Thiago Padilha
90175ee13f
Move main.cpp to run.cpp
Signed-off-by: Thiago Padilha <thiago@padilha.cc>
2023-03-22 14:31:35 -03:00
Erik Scholz
4122dffff9
cmake: make llama an actual library (#392) 2023-03-22 18:37:10 +02:00
Georgi Gerganov
f5a77a629b
Introduce C-style API (#370)
* Major refactoring - introduce C-style API

* Clean up

* Add <cassert>

* Add <iterator>

* Add <algorithm> ....

* Fix timing reporting and accumulation

* Measure eval time only for single-token calls

* Change llama_tokenize return meaning
2023-03-22 07:32:36 +02:00
Georgi Gerganov
eb34620aec
Add tokenizer test + revert to C++11 (#355)
* Add test-tokenizer-0 to do a few tokenizations - feel free to expand
* Added option to convert-pth-to-ggml.py script to dump just the vocabulary
* Added ./models/ggml-vocab.bin containing just LLaMA vocab data (used for tests)
* Added utility to load vocabulary file from previous point (temporary implementation)
* Avoid using std::string_view and drop back to C++11 (hope I didn't break something)
* Rename gpt_vocab -> llama_vocab
* All CMake binaries go into ./bin/ now
2023-03-21 17:29:41 +02:00
nusu-github
8cf9f34edd
Adding missing features of CMakeLists.txt & Refactoring (#131)
* Functionality addition CMakeLists.txt

Refactoring:
1. Simplify more options that are negation of negation.
LLAMA_NO_ACCELERATE -> LLAMA_ACCELERATE
2. Changed to an optional expression instead of forcing to enable AVX2 in MSVC.
3. Make CMAKE_CXX_STANDARD, which is different from Makefile, the same.
4. Use add_compile_options instead of adding options to CMAKE_C_FLAGS.
5. Make utils use target_link_libraries instead of directly referencing code.

Added features:
1. Added some options.
LLAMA_STATIC_LINK,LLAMA_NATIVE,LLAMA_LTO,LLAMA_GPROF,LLAMA_OPENBLAS

* Fix Accelerate link in CMake

* Windows build Fix

* C++11 to C++17

* Reflects C/C++ standard individually

* Change the version to 3.12

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-21 01:37:16 +01:00
mmyjona
6b0df5ccf3
add ptread link to fix cmake build under linux (#114)
* add ptread link to fix cmake build under linux

* add cmake to linux and macos platform

* separate make and cmake workflow

---------

Co-authored-by: Sebastián A <sebastian.aedo29@gmail.com>
2023-03-17 13:38:24 -03:00
Georgi Gerganov
c09a9cfb06
CMake build in Release by default (#75) 2023-03-13 21:22:15 +02:00
Sebastián A
ed6849cc07
Initial support for CMake (#75) 2023-03-13 19:12:33 +02:00