Commit Graph

  • 335cd8d947 Rename postfix to suffix to match upstream Mug 2023-05-06 13:18:25 +0200
  • 32cf0133c9 Update low level examples Mug 2023-05-04 18:33:08 +0200
  • 9e79465b21 Prefer explicit imports Andrei Betlen 2023-05-05 14:05:31 -0400
  • d15578e63e Update llama.cpp (session version) Andrei Betlen 2023-05-03 09:33:30 -0400
  • c26e9bf1c1 Update sampling api Andrei Betlen 2023-05-01 14:47:55 -0400
  • 78531e5d05 Fix return types and import comments Andrei Betlen 2023-05-01 14:02:06 -0400
  • d0031edbd2 Update llama.cpp Andrei Betlen 2023-05-01 10:44:28 -0400
  • 441d30811a Detect multi-byte responses and wait Mug 2023-04-28 12:50:30 +0200
  • 36b3494332 Also ignore errors on input prompts Mug 2023-04-26 14:45:51 +0200
  • c8e6ac366a Update llama.cpp (llama_load_session_file) Andrei Betlen 2023-04-28 15:32:43 -0400
  • 66ad132575 Update llama.cpp Andrei Betlen 2023-04-26 20:00:54 -0400
  • 656190750d Update llama.cpp Andrei Betlen 2023-04-25 19:03:41 -0400
  • 80c18cb665 Update llama.cpp (remove llama_get_kv_cache) Andrei Betlen 2023-04-24 09:30:10 -0400
  • bf9f02d8ee Update llama.cpp Andrei Betlen 2023-04-22 19:50:28 -0400
  • 5bbf40aa47 Update llama.cpp Andrei Betlen 2023-04-21 17:40:27 -0400
  • fd64310276 Fix decode errors permanently Mug 2023-04-26 14:37:06 +0200
  • bdbaf5dc76 Fixed end of text wrong type, and fix n_predict behaviour Mug 2023-04-17 14:45:28 +0200
  • 81c4c10389 Update type signature to allow for null pointer to be passed. Andrei Betlen 2023-04-18 23:44:46 -0400
  • 8229410a4e More reasonable defaults Mug 2023-04-10 16:38:45 +0200
  • b6ce5133d9 Add bindings for LoRA adapters. Closes #88 Andrei Betlen 2023-04-18 01:30:04 -0400
  • 3693449c07 Update llama.cpp Andrei Betlen 2023-04-12 14:29:00 -0400
  • d595f330e2 Update llama.cpp Andrei Betlen 2023-04-11 11:59:03 -0400
  • ce0ca60b56 Update llama.cpp (llama_mmap_supported) Andrei Betlen 2023-04-09 22:01:33 -0400
  • d0a7ce9abf Make windows users happy (hopefully) Mug 2023-04-10 17:12:25 +0200
  • 848b4021a3 Better custom library debugging Mug 2023-04-10 17:06:58 +0200
  • c8b5d0b963 Use environment variable for library override Mug 2023-04-10 17:00:35 +0200
  • d1b3517477 Allow local llama library usage Mug 2023-04-05 14:23:01 +0200
  • b36c04c99e Added iterative search to prevent instructions from being echoed, add ignore eos, add no-mmap, fixed 1 character echo too much bug Mug 2023-04-10 16:35:38 +0200
  • f25a81309e Update model paths to be more clear they should point to file Andrei Betlen 2023-04-09 22:45:55 -0400
  • e19909249d More interoperability to the original llama.cpp, and arguments now work Mug 2023-04-07 13:32:19 +0200
  • d5680144c5 Bugfix: Wrong size of embeddings. Closes #47 Andrei Betlen 2023-04-08 15:05:33 -0400
  • 29e9fb66a3 Better llama.cpp interoperability Mug 2023-04-06 15:30:57 +0200
  • ce66405da1 Add quantize example Andrei Betlen 2023-04-05 04:17:26 -0400
  • 739e8d4c9b Fix bug in init_break not being set when exited via antiprompt and others. Mug 2023-04-05 14:47:24 +0200
  • ae1f37f505 Fix repeating instructions and an antiprompt bug Mug 2023-04-04 17:54:47 +0200
  • 3c1020b866 Fix stripping instruction prompt Mug 2023-04-04 16:20:27 +0200
  • 0bfad75406 Added instruction mode, fixed infinite generation, and various other fixes Mug 2023-04-04 16:18:26 +0200
  • 9e872410da Add instruction mode Mug 2023-04-04 11:48:48 +0200
  • 15bea0946b Chat llama.cpp example implementation Mug 2023-04-03 22:54:46 +0200
  • 2b8147e7a8 Update llama_cpp.py MillionthOdin16 2023-04-02 21:50:13 -0400
  • 62ce167b22 Update low level api example Andrei Betlen 2023-04-01 13:02:10 -0400
  • a71cda6546 Update llama.cpp Andrei Betlen 2023-03-28 21:10:23 -0400
  • a279acd680 Update llama.cpp (llama_n_embd) Andrei Betlen 2023-03-25 16:26:03 -0400
  • ef3c152257 Update llama.cpp (llama_progress_callback) Andrei Betlen 2023-03-25 12:12:09 -0400
  • def46dd9a6 Add example based on stripped down version of main.cpp from llama.cpp Andrei Betlen 2023-03-24 18:57:25 -0400
  • 5bb1bc74d1 Fix type signature of token_to_str Andrei Betlen 2023-03-31 03:25:12 -0400
  • a7a6d88793 Fix ctypes typing issue for Arrays Andrei Betlen 2023-03-31 03:20:15 -0400
  • 019650f416 Fix array type signatures Andrei Betlen 2023-03-31 02:08:20 -0400
  • a3da39af79 Bugfix: cross-platform method to find shared lib Andrei Betlen 2023-03-24 18:43:29 -0400
  • bd1c657f80 Bugfix: wrong signature for quantize function Andrei Betlen 2023-04-04 22:36:59 -0400
  • ef5a9a6160 Update llama.cpp and re-organize low-level api Andrei Betlen 2023-03-24 14:58:42 -0400
  • d9dfdec2bd Initial commit (llama_cpp.py, llama-cpp-python) Andrei Betlen 2023-03-23 05:33:06 -0400
  • bed308c69c Apply suggestions from code review Henri Vasserman 2023-06-01 01:15:48 +0300
  • 8478e59b08 Merge pull request #8 from SlyEcho/server_refactor Randall Fitzgerald 2023-05-31 18:03:40 -0400
  • 9104fe5a7c Change how the token buffers work. Henri Vasserman 2023-05-31 11:47:55 +0300
  • f2e1130901 Merge pull request #7 from anon998/logging-reuse Randall Fitzgerald 2023-05-31 17:08:12 -0400
  • 497160a60d remove old log function anon 2023-05-31 18:01:07 -0300
  • 1bd7cc60a8 reuse format_generation_settings for logging anon 2023-05-31 17:58:43 -0300
  • 43d295fddc filter empty stopping strings anon 2023-05-31 16:54:12 -0300
  • 276fa99873 Misunderstood the instructions, I think. Back to the raw JSON output only. digiwombat 2023-05-31 16:45:57 -0400
  • 1b96df2b5f Spacing fix. Nothing to see here. digiwombat 2023-05-31 16:42:43 -0400
  • 86337e3a9b Server console logs now come in one flavor: Verbose. digiwombat 2023-05-31 16:41:34 -0400
  • dda4c10d64 Switch to the CPPHTTPLIB logger. Verbose adds body dump as well as request info. digiwombat 2023-05-31 16:23:39 -0400
  • 7ca81e9e65 mtl : add reshape and transpose handling Georgi Gerganov 2023-05-31 22:38:40 +0300
  • 7332b41f9f Simple single-line server log for requests digiwombat 2023-05-31 15:56:27 -0400
  • 1213af76ce mtl : add rope kernel Georgi Gerganov 2023-05-31 22:28:59 +0300
  • 6af6a05663 ggml : fix handling of "view" ops in ggml_graph_import() Georgi Gerganov 2023-05-31 22:28:15 +0300
  • 96fa480147 Merge pull request #6 from anon998/fix-multibyte Randall Fitzgerald 2023-05-31 12:14:43 -0400
  • 234270bd83 back to 32 block size, not better Concedo 2023-06-01 00:14:22 +0800
  • 3edaf6bd8b print timings by default anon 2023-05-31 12:55:19 -0300
  • d58e48663d default penalize_nl to false + format anon 2023-05-31 11:56:12 -0300
  • 40e13805d9 print timings + build info anon 2023-05-31 10:41:47 -0300
  • dd30219332 buffer incomplete multi-byte characters anon 2023-05-31 10:40:42 -0300
  • 27911d6d68 fix default model alias anon 2023-05-31 10:37:52 -0300
  • aa2bbb2d35 fix parameter type anon 2023-05-31 10:36:51 -0300
  • f1710b90dc add infinite generation when n_predict is -1 anon 2023-05-31 10:35:25 -0300
  • 284bc293b1 reserve memory for generated_text anon 2023-05-31 10:46:06 -0300
  • 446e42a8c6 change dmmv block size Concedo 2023-05-31 21:40:12 +0800
  • 83a34444af remove trailing whitespace xaedes 2023-05-31 15:02:38 +0200
  • 01fc3faf71 add explicit cast to fix compile error xaedes 2023-05-31 15:00:54 +0200
  • 2c08f29691 make api server use only a single thread anon 2023-05-31 09:02:32 -0300
  • c1cbde82a1 print error when server can't bind to the interface anon 2023-05-31 00:00:56 -0300
  • f88fb2bdc5 add #include <climits> xaedes 2023-05-31 12:38:26 +0200
  • 077ee4e989 Revert "Revert "opencl : no need to allocate cl_mem on heap (#1612)"" Concedo 2023-05-31 18:00:52 +0800
  • 50c85bea4c Merge remote-tracking branch 'occam/opencl-dev' into concedo_experimental Concedo 2023-05-31 17:53:14 +0800
  • 32dada5e5f updated lite Concedo 2023-05-31 17:52:09 +0800
  • 5e1eecfe12 Adapt to #1612 cl_mem malloc changes 0cc4m 2023-05-31 07:07:47 +0200
  • 49aaf08387 Merge remote-tracking branch 'origin/master' into opencl-dev 0cc4m 2023-05-31 06:58:51 +0200
  • a5a85d68c6 Merge branch 'master' into concedo_experimental Concedo 2023-05-31 10:51:54 +0800
  • 85c9f7df41 Merge remote-tracking branch 'occam/opencl-dev' into concedo_experimental Concedo 2023-05-31 10:20:32 +0800
  • 4afa38e744 Revert "opencl : no need to allocate cl_mem on heap (#1612)" Concedo 2023-05-31 10:20:23 +0800
  • 9f2424ac47 Merge pull request #5 from anon998/stop-stream Randall Fitzgerald 2023-05-30 22:16:32 -0400
  • 3a079d5cc8 stop generating when the stream is closed anon 2023-05-30 23:12:00 -0300
  • 7a8104fbd2 add missing quote when printing stopping strings anon 2023-05-30 23:11:32 -0300
  • b6f536dfb3 Cull to end of generated_text when encountering a stopping string in case it's a partial token. digiwombat 2023-05-30 21:14:24 -0400
  • 9197674a6b Merge pull request #4 from anon998/logging Randall Fitzgerald 2023-05-30 20:58:18 -0400
  • aa0788b650 add --verbose flag and request logging anon 2023-05-30 21:41:55 -0300
  • 7a853dc56d prevent the server from swallowing exceptions in debug mode anon 2023-05-30 21:39:30 -0300
  • e6de69abfb Merge pull request #3 from anon998/sse Randall Fitzgerald 2023-05-30 20:36:52 -0400
  • 2533878b79 Merge branch 'master' into sse Randall Fitzgerald 2023-05-30 20:34:48 -0400