Commit Graph

  • 4e4cfdfb67 tokenize and inject reverse prompt as needed rabidcopy 2023-03-22 17:46:23 -0500
  • 69071d3b6b Squeeze out about 5% more performance in Q4_1 inference Matvey Soloviev 2023-03-21 22:55:35 +0100
  • ce339001c4 Fix instruct mode broken by PR #354 Johnman 2023-03-22 22:23:14 +0100
  • ae1519f681 Update tools.sh RSereno 2023-03-22 20:29:20 +0000
  • 9ea43d4d91 Add support to batch size for perplexity Gary Linscott 2023-03-22 12:09:42 -0700
  • ee8a788786 Update issue template so people will use it (#404) Gary Mulder 2023-03-22 19:06:18 +0000
  • a6bd606cd0 typo Stephan Walter 2023-03-22 19:02:39 +0000
  • 49197bbd6b Update custom.md Gary Mulder 2023-03-22 18:06:15 +0000
  • 84ba1fd25b add capability to convert from ggml back to torch or hf format for further consumption/training/finetuning Tai Duc Nguyen 2023-03-22 13:38:39 -0400
  • 3a0dcb3920 Implement server mode. tcp_server Thiago Padilha 2023-03-22 10:41:26 -0300
  • bf44faa0ee Remove direct access to std streams from "run" Thiago Padilha 2023-03-22 09:55:45 -0300
  • b7f1fa6d8c Move llama_context setup + perplexity back to main.cpp Thiago Padilha 2023-03-22 09:39:25 -0300
  • d7d53b84db Add main.cpp back and invoke "run" from it Thiago Padilha 2023-03-22 09:16:33 -0300
  • 90175ee13f Move main.cpp to run.cpp Thiago Padilha 2023-03-22 09:05:50 -0300
  • 69c92298a9 Deduplicate q4 quantization functions (#383) master-69c9229 Stephan Walter 2023-03-22 17:29:06 +0000
  • 97940520e8 fix: add POSIX functionality for Linux compilation (#51) master-9794052 master-305ba6f Valentyn Bezshapkin 2023-03-22 18:20:25 +0100
  • 305ba6f0e6 Don't force immediate interactive without -i (#354) tjohnman 2023-03-22 18:16:35 +0100
  • b29e6f318a Disable AVX2 flags in CI Georgi Gerganov 2023-03-22 19:08:14 +0200
  • 992ebff68b Re-enable quantization test Georgi Gerganov 2023-03-22 18:58:16 +0200
  • e590787ab3 Update main.cpp rabidcopy 2023-03-22 11:43:57 -0500
  • 879da33ab4 Update main.cpp rabidcopy 2023-03-22 11:41:19 -0500
  • 4122dffff9 cmake: make llama an actual library (#392) master-4122dff Erik Scholz 2023-03-22 17:37:10 +0100
  • c4efdb22af tokenize nothing for antiprompt if no reverse rabidcopy 2023-03-22 11:22:56 -0500
  • 56e659a0b2 fix perplexity after c-api refactor (#390) master-56e659a Erik Scholz 2023-03-22 17:09:38 +0100
  • da0837f55f tokenize/inject reverse prompt for refactor rabidcopy 2023-03-22 11:01:47 -0500
  • 40ea807a97 Add details on perplexity to README.md (#395) Gary Linscott 2023-03-22 08:53:54 -0700
  • c65eff0d14 Add details on dataset/context length Gary Linscott 2023-03-22 08:48:36 -0700
  • 9d9e152b6d Add details on perplexity to README.md Gary Linscott 2023-03-22 08:19:17 -0700
  • c5c1c8d5ce Update README.md LostRuins 2023-03-22 22:54:27 +0800
  • 4ff58f73e5 Merge branch 'master' into concedo Concedo 2023-03-22 22:32:11 +0800
  • 86c7457e24 Merge branch 'master' into concedo Concedo 2023-03-22 22:31:45 +0800
  • 7b77319054 cmake: make llama an actual library Green Sky 2023-03-22 14:46:29 +0100
  • 3501b9df50 Use const; add basic test Stephan Walter 2023-03-22 13:42:03 +0100
  • 57fee166d2 don't create a new std::string (especially here, where it's usually large) Green Sky 2023-03-22 12:58:20 +0100
  • 7b1b575fe8 preallocate a buffer of fitting size for tokenization (utils.cpp) Green Sky 2023-03-22 12:56:42 +0100
  • 827bcb1375 fix perplexity after c-api refactor by providing a large enough token buffer Green Sky 2023-03-22 12:44:26 +0100
  • d5850c53ca Add missing header for memcpy (#386) master-d5850c5 Yusuf Kağan Hanoğlu 2023-03-22 11:55:45 +0300
  • 99acb7a352 Added missing include for memcpy to llama.cpp niansa/tuxifan 2023-03-22 09:50:58 +0100
  • 23e75fbae9 Build Error Fixed Yusuf Kağan Hanoğlu 2023-03-22 11:43:55 +0300
  • 5c475503ce resize image Concedo 2023-03-22 16:21:40 +0800
  • 4e95e7f87f Updated readme Concedo 2023-03-22 16:20:37 +0800
  • 5f142df76e dynamic max context size defaulting to 1024, also implemented the basic API as a fallback Concedo 2023-03-22 15:56:47 +0800
  • b4dfdf7a77 Deduplicate q4 quantization functions Stephan Walter 2023-03-22 08:30:11 +0100
  • 23bb78fbdc add newline token rabidcopy 2023-03-22 01:55:57 -0500
  • 1752bc92eb add newline token rabidcopy 2023-03-22 01:55:17 -0500
  • 6fb0db31d7 Merge branch 'master' into interactive-eos-fix rabidcopy 2023-03-22 01:54:07 -0500
  • 84130caf5e merge with main, move logic for embeddings into llama.cpp strikingLoo 2023-03-21 23:44:04 -0700
  • 78cff58427 make params argument instead of hardcoded boolean. remove useless time check strikingLoo 2023-03-21 23:21:07 -0700
  • ae44e23ee3 When seed <= 0 - use the clock to generate one master-ae44e23 master-928480e Georgi Gerganov 2023-03-22 07:47:15 +0200
  • 928480ef5b Init llama_context_params properly from CLI (#370) Georgi Gerganov 2023-03-22 07:45:00 +0200
  • 56817b1f88 Remove temporary notice and update hot topics master-f5a77a6 Georgi Gerganov 2023-03-22 07:34:02 +0200
  • f5a77a629b Introduce C-style API (#370) Georgi Gerganov 2023-03-22 07:32:36 +0200
  • c3d13eaa4d Change llama_tokenize return meaning Georgi Gerganov 2023-03-22 07:27:26 +0200
  • a9f900b645 Measure eval time only for single-token calls Georgi Gerganov 2023-03-22 07:22:51 +0200
  • 71ed3d224d Fix timing reporting and accumulation Georgi Gerganov 2023-03-22 07:17:42 +0200
  • 31c1646441 CI fix Windows, make sure build passes before running tests anzz1 2023-03-22 04:40:14 +0200
  • e4524da1a1 merge + working embeddings strikingLoo 2023-03-21 19:08:21 -0700
  • 76dde26844 Working! Thanks to @nullhook strikingLoo 2023-03-21 18:32:51 -0700
  • c5ae5d08a5 Added support for 30B weight. (#108) Trevor White 2023-03-21 18:34:01 -0400
  • da0e9fe90c Add SHA256SUMS file and instructions to README how to obtain and verify the downloads Gary Mulder 2023-03-20 20:14:06 +0000
  • 3d9c40459c Add SHA256SUMS file and instructions to README how to obtain and verify the downloads Gary Mulder 2023-03-20 20:14:06 +0000
  • e6c9e0986c Fix bin dir for win ci master-e6c9e09 anzz1 2023-03-21 23:49:24 +0200
  • 46952e8629 Fix bin dir for win ci anzz1 2023-03-21 23:49:24 +0200
  • 81bd894c51 Update chat.cpp Kevin Kwok 2023-03-21 14:37:49 -0700
  • 01a297b099 specify build type for ctest on windows (#371) master-01a297b Erik Scholz 2023-03-21 22:34:25 +0100
  • 5dc847615f specify build type for ctest on windows Green Sky 2023-03-21 22:22:22 +0100
  • 285ca17ecb Update README.md Kevin Kwok 2023-03-21 14:15:19 -0700
  • 3366853e41 Add notice about pending change Georgi Gerganov 2023-03-21 22:57:35 +0200
  • 9116ae9b53 Change argument processing to allow prompt or file args. (#103) Tindell Lockett 2023-03-21 16:55:56 -0400
  • 428aa7025a Add support for 30B model and 65B, if it is made in the future (#104) Pi 2023-03-21 13:55:24 -0700
  • 3f9c6135e4 fix typo in chatLLaMa (#368) Mathieu Nayrolles 2023-03-21 16:52:27 -0400
  • 19178fa28e 2048 context all core Henk 2023-03-21 21:49:47 +0100
  • 4d2e035347 Add <algorithm> .... Georgi Gerganov 2023-03-21 22:46:44 +0200
  • cae6e8a002 Add <iterator> Georgi Gerganov 2023-03-21 22:45:22 +0200
  • 90d07b52b0 Add <cassert> Georgi Gerganov 2023-03-21 22:44:03 +0200
  • f9d4a0edcb Clean up Georgi Gerganov 2023-03-21 22:42:53 +0200
  • 9af8f79756 Major refactoring - introduce C-style API Georgi Gerganov 2023-03-21 21:42:08 +0200
  • 7e36d4df61 fix typo in chatLLaMa Mathieu Nayrolles 2023-03-21 16:18:35 -0400
  • 6bcbe50792 Merge branch 'master' into interactive-eos-fix rabidcopy 2023-03-21 14:23:16 -0500
  • 52f46ef78a tokenize first reverse prompt once rabidcopy 2023-03-21 14:10:20 -0500
  • e33df8e1a0 tokenize and inject only first reverse prompt rabidcopy 2023-03-21 13:37:36 -0500
  • 7412c4871c Merge branch 'master' into stop-keywords Joshua Williams 2023-03-21 13:08:53 -0500
  • 3c211c64bd tokenize reverse prompt when needed rabidcopy 2023-03-21 12:53:32 -0500
  • 0f61352708 Update issue templates Georgi Gerganov 2023-03-21 19:47:27 +0200
  • ea367074f8 Help text for stop keywords Joshua Williams 2023-03-21 12:39:15 -0500
  • 3eed8c0914 Initial implementation of stop keywords Joshua Williams 2023-03-21 12:26:36 -0500
  • 98570dd4f1 Update help output. Johnman 2023-03-21 18:24:59 +0100
  • 353ec251a4 We could use std::unordered_map over std::map (#305) Fabio R. Sluzala 2023-03-21 14:21:50 -0300
  • fe854daf6d Don't force immediate interactive without -i Johnman 2023-03-21 18:21:50 +0100
  • cfdf363a0c Resolved recent conflicts with master Fabio Rossini Sluzala 2023-03-21 14:12:43 -0300
  • 6c4a22ad1e typo (missing dot) Michael Christen 2023-03-21 18:12:31 +0100
  • 89d5d90f3b Fix color codes emitting mid-UTF8 code. (#312) Matvey Soloviev 2023-03-21 18:11:01 +0100
  • f2451d1564 Merge branch 'master' into fix-color-utf8 Matvey Soloviev 2023-03-21 17:45:01 +0100
  • 16ffc013c6 Importer for GPTQ quantized LLaMA models (#301) comex 2023-03-21 09:42:25 -0700
  • 793fc301c9 Merge branch 'master' into gptq Georgi Gerganov 2023-03-21 18:41:12 +0200
  • 1f4abb8dae Merge pull request #2 from slaren/interactive-eos-fix rabidcopy 2023-03-21 11:34:30 -0500
  • 486ae645fd Compute perplexity over prompt (#270) Gary Linscott 2023-03-21 09:27:42 -0700
  • 3ab3e6582f Add chatLLaMa script (#198) Jean-Christophe Hoelt 2023-03-21 18:23:15 +0200
  • f157088cb7 makefile: Fix CPU feature detection on Haiku (#218) Alex von Gluck IV 2023-03-21 11:21:06 -0500
  • 9d1cdb8938 Merge remote-tracking branch 'origin/master' into perplexity Gary Linscott 2023-03-21 09:18:25 -0700