Commit Graph

  • 4daaa5e792
    Unmap the file in llama_free Slaren 2023-03-29 08:31:26 +0200
  • 4ae12d0824
    Make mmap_file static Slaren 2023-03-29 06:18:18 +0200
  • a1e0f17a05
    Fix ggml_init_params in quantize Slaren 2023-03-29 05:38:57 +0200
  • 2a6cef62b3
    Add mmap support for model files Slaren 2023-03-29 02:03:43 +0200
  • dc5adf173a
    Windows: convert prompt in system locale to UTF-8. Allows to use others languages without tambourine dancing... Dmitriy Prikhodko 2023-03-30 04:22:45 +0500
  • 81b8748c98
    blacked flake.lock Geeks-sid 2023-03-29 19:17:45 -0400
  • 172febff3f
    blacked quantize Geeks-sid 2023-03-29 19:16:48 -0400
  • 382c0c6100
    blacked unversioned-ggml-to-ggml Geeks-sid 2023-03-29 19:16:32 -0400
  • efab7f8bad
    blacked gptq-to-ggml Geeks-sid 2023-03-29 19:16:04 -0400
  • eec105a86a
    blacked gpt4all-to-ggml Geeks-sid 2023-03-29 19:15:37 -0400
  • dfa2d707e9
    apply black to ggml-to-pth Geeks-sid 2023-03-29 19:15:06 -0400
  • 842abc7da9
    Remove unused variable Casey Primozic 2023-03-29 14:11:29 -0700
  • 9cbc404ba6
    ci : re-enable AVX512 testing (Windows-MSVC) (#584) master-9cbc404 anzz1 2023-03-29 23:44:39 +0300
  • f789d8d50a
    Initial windows support (untested) Slaren 2023-03-29 22:22:36 +0200
  • b51c717d5c
    ggml : init time on first ggml_init() call master-b51c717 Georgi Gerganov 2023-03-29 22:15:34 +0300
  • 0ba76c1e73
    llama : fix compile warnings when reading the vocab master-0ba76c1 Georgi Gerganov 2023-03-29 22:13:12 +0300
  • cea1c85948
    ggml : add ARM_NEON dequantize_row_q4_1() master-cea1c85 Georgi Gerganov 2023-03-29 22:10:01 +0300
  • f202ada131
    ggml : add ARM_NEON quantize_row_q4_1() master-f202ada Georgi Gerganov 2023-03-29 22:03:02 +0300
  • 3b44d30d9b
    ggml : add ARM_NEON ggml_vec_dot_q4_1() Georgi Gerganov 2023-03-29 21:47:33 +0300
  • 61cbfff5c9
    rename convert_ggml_to_pth.py -> convert-ggml-to-pth.py (#600) Pavol Rusnak 2023-03-29 20:09:25 +0200
  • d9ad104440
    Create chat-13B.bat (#592) Thérence 2023-03-29 19:21:09 +0200
  • 536971bade
    Apply suggestions from code review anzz1 2023-03-29 20:20:02 +0300
  • c447de6937
    rename convert_ggml_to_pth.py -> convert-ggml-to-pth.py Pavol Rusnak 2023-03-29 19:06:19 +0200
  • d8febc8653
    renamed main python script Concedo 2023-03-30 00:48:44 +0800
  • 664b277c27
    integrated libopenblas for greatly accelerated prompt processing. Windows binaries are included - feel free to build your own or to build for other platforms, but that is beyond the scope of this repo. Will fall back to non-blas if libopenblas is removed. Concedo 2023-03-30 00:43:52 +0800
  • b467702b87
    readme : fix typos Georgi Gerganov 2023-03-29 19:38:31 +0300
  • 516d88e75c
    readme : add GPT4All instructions (close #588) Georgi Gerganov 2023-03-29 19:37:20 +0300
  • 53635c081c
    py : add GPT4All conversion script Georgi Gerganov 2023-03-29 19:29:26 +0300
  • 41318d708e
    llama : use the same threshold for OpenBLAS and ggml thread limiting (#577) Maël Kerbiriou 2023-03-29 18:10:07 +0200
  • a6956b25a1
    add example of re-act pattern (#583) Tobias Lütke 2023-03-29 17:10:24 +0200
  • 83df5639eb
    Fix GCC warning about binary literal (#595) master-83df563 anzz1 2023-03-29 16:20:07 +0300
  • a5c42c4b13
    Fix typo in llama.h (#593) master-a5c42c4 anzz1 2023-03-29 16:19:29 +0300
  • 49c4c225b5
    Merge branch 'master' into concedo Concedo 2023-03-29 21:08:03 +0800
  • 271307232c
    Merged PR with a few changes: Concedo 2023-03-29 20:38:57 +0800
  • b7a3365f4a
    Fix GCC warning about binary literal anzz1 2023-03-29 15:28:26 +0300
  • 73071045c9
    Fix typo in llama.h anzz1 2023-03-29 14:22:50 +0300
  • b3a360d80c
    Create chat-13B.bat Thérence 2023-03-29 10:06:56 +0200
  • ff9b824a2a
    fixed whitespace in reverse prompt issue Tobias Lütke 2023-03-29 09:56:11 +0200
  • 3f5f4286dd
    Always initialize mm_addr and mm_length in llama_model Slaren 2023-03-29 08:53:14 +0200
  • baa529e9c0
    Unmap the file in llama_free Slaren 2023-03-29 08:31:26 +0200
  • e6f1c19937
    Make mmap_file static Slaren 2023-03-29 06:18:18 +0200
  • 7961493a40
    Fix ggml_init_params in quantize Slaren 2023-03-29 05:38:57 +0200
  • ef9afe1540
    Add mmap support for model files Slaren 2023-03-29 02:03:43 +0200
  • 13b4c05d66
    Some more code cleanup InconsolableCellist 2023-03-28 16:59:27 -0600
  • 1041ddb2cd
    apply PR suggestions Tristan Carel 2023-03-28 22:36:56 +0200
  • 08121f3aa8
    parallelize the quantization process Tristan Carel 2023-03-28 17:46:43 +0200
  • 9a1ded757b
    plain __cpuid is enough here anzz1 2023-03-29 01:24:48 +0300
  • bb54708e40
    CI: Re-enable AVX512 testing (Windows-MSVC) anzz1 2023-03-29 01:01:06 +0300
  • c9c820ff36
    Added support for _POSIX_MAPPED_FILES if defined in source (#564) mmap CoderRC 2023-03-28 17:26:25 -0400
  • 88c6535377
    spelling... Tobias Lütke 2023-03-28 23:20:11 +0200
  • c6e8014062
    add example of re-act pattern Tobias Lütke 2023-03-28 23:06:33 +0200
  • 5a5f8b1501
    Enable Fused-Multiply-Add (FMA) and F16C/CVT16 vector extensions on MSVC (#375) master-5a5f8b1 anzz1 2023-03-28 22:44:29 +0300
  • f1217055ea
    CI: fix subdirectory path globbing (#546) master-f121705 anzz1 2023-03-28 22:43:25 +0300
  • 13addf2a78
    Merge branch 'concedo' of github.com:InconsolableCellist/llamacpp-for-kobold into concedo InconsolableCellist 2023-03-28 13:43:19 -0600
  • f7c905b0d0
    Minor overhaul of code: InconsolableCellist 2023-03-28 13:39:34 -0600
  • 003365907d
    updating to version 17 of embedded koboldAI, and adding host address support InconsolableCellist 2023-03-28 13:39:10 -0600
  • 8765be59a9
    Update build.yml anzz1 2023-03-28 22:07:08 +0300
  • 38bc9cef4e
    Merge branch 'mmap' into mmap CoderRC 2023-03-28 15:03:48 -0400
  • 7f4c5c6651
    llama : fix linkage with mingw (#551) master-7f4c5c6 anzz1 2023-03-28 21:23:09 +0300
  • 2a98bc18ea
    ggml : add AVX2 implementation of quantize_row_q4_1 (#515) master-2a98bc1 slaren 2023-03-28 20:06:03 +0200
  • 6ab328d88c
    Make quantize_row_q4_1 static slaren 2023-03-28 20:05:37 +0200
  • d0aaff571c
    py : add temporary script to convert old ggml files to newer version (#539) master-d0aaff5 thement 2023-03-28 19:55:42 +0200
  • d0330fd783
    py : add capabiliy to convert from ggml back to torch or hf format for further consumption/training/finetuning (#403) Tai Duc Nguyen 2023-03-28 13:51:29 -0400
  • 41669f67d8
    Actually use AVX2 Slaren 2023-03-28 19:45:59 +0200
  • e29652996b
    Add AVX2 implementation of quantize_row_q4_1 Slaren 2023-03-28 19:41:28 +0200
  • 1a5ee11377
    Restore old -std= flags Justine Tunney 2023-03-28 10:36:25 -0700
  • 1631298475
    Remove -std=foo compiler flags Justine Tunney 2023-03-28 10:23:34 -0700
  • 99c5b27654
    ggml : refactor quantized processing functions (#509) master-99c5b27 Stephan Walter 2023-03-28 17:13:01 +0000
  • 1229722c61
    Merge branch 'master' into q-refactor Georgi Gerganov 2023-03-28 20:11:56 +0300
  • a0c2401359
    ggml : minor Georgi Gerganov 2023-03-28 20:10:14 +0300
  • cbddf4661b
    Get mmap() working with WIN32 MSVC Justine Tunney 2023-03-28 09:27:41 -0700
  • 692ce3164e
    py : removed unused model variable and verified that the code functions correctly with vocab_only setting. Also confirmed that the code works as expected after running with reduced memory usage due to deletion of no-longer-needed variable. (#547) DooWoong Lee (David) 2023-03-29 02:02:34 +0900
  • 96f9c0506f
    ci : make ctest verbose, hopefully we see what is wrong with the sanitizer master-96f9c05 Georgi Gerganov 2023-03-28 20:01:09 +0300
  • d502bc7c9d
    tests : free llama context at the end of the test master-d502bc7 Georgi Gerganov 2023-03-28 19:51:55 +0300
  • 436e561931
    all : be more strict about converting float to double (#458) master-436e561 Stephan Walter 2023-03-28 16:48:20 +0000
  • 21e9ce7574
    perplexity : add <cmath> Georgi Gerganov 2023-03-28 19:40:01 +0300
  • 20e1e84884
    deploy : add a Package.swift for SwiftPM support (#393) master-20e1e84 Jed Fox 2023-03-28 11:39:01 -0500
  • 61733d3b49
    all : prefer float over double where appropriate Georgi Gerganov 2023-03-28 19:11:31 +0300
  • e4881686b4
    Make WIN32 mmap() improvements (#341) oKatanaaa 2023-03-21 01:46:44 +0400
  • f68345e9b1
    Fix softmax in perplexity.cpp Stephan Walter 2023-03-26 12:36:55 +0200
  • 3a42193b3d
    Test equivalence of round, SILU implementations Stephan Walter 2023-03-25 17:00:29 +0100
  • 54b75a77fb
    Be more strict about converting float to double Stephan Walter 2023-03-24 10:26:44 +0100
  • c1f885067c
    ggml : introduce structs for the q4 data blocks (#356) master-c1f8850 Stephan Walter 2023-03-28 15:56:03 +0000
  • 6a3b29a923
    ggml : rename quant struct variables + fix ARM_NEON Georgi Gerganov 2023-03-28 18:52:33 +0300
  • e0670260fb
    gitignore : add "embedding" Georgi Gerganov 2023-03-28 18:34:35 +0300
  • ce3f7adc85
    Fix linking on mingw32 anzz1 2023-03-28 18:14:04 +0300
  • 28ba975aea
    Check the existence of f16_model_path_base in quantize.py (#574) dotpy314 2023-03-28 23:06:28 +0800
  • 25248d7391
    Use the same threshold for OpenBLAS and ggml thread limiting Maël Kerbiriou 2023-03-28 16:51:45 +0200
  • 2e6c295bc7
    CMake: Add explicit F16C option (x86) anzz1 2023-03-28 17:43:32 +0300
  • a6bdc47cba
    Fix usage of F16C intrinsics in AVX code (#563) master-a6bdc47 slaren 2023-03-28 16:26:55 +0200
  • 40c8e68122
    Check the existence of f16_model_path_base in quantize.py Jincheng Miao 2023-03-28 22:13:16 +0800
  • 7b8dbcb78b
    main.cpp fixes, refactoring (#571) master-7b8dbcb anzz1 2023-03-28 17:09:55 +0300
  • 51266e4ae7
    n_keep help update anzz1 2023-03-28 16:54:29 +0300
  • ebf09a1919
    * -> & anzz1 2023-03-28 16:02:40 +0300
  • fcabe9b8b2
    found this one on the floor anzz1 2023-03-28 15:49:40 +0300
  • 021bdf237a
    main.cpp fixes, refactoring anzz1 2023-03-28 15:43:16 +0300
  • 911782cfdd
    Use more accurate function names Slaren 2023-03-28 14:29:09 +0200
  • 7c97743ea6
    Fix linker error for tests kirillsurkov 2023-03-28 13:40:25 +0300
  • bf30406f50
    Merge branch 'master' into concedo Concedo 2023-03-28 17:13:38 +0800
  • 99590bf992
    CI: github runner avx512f detection fix (windows) anzz1 2023-03-28 11:31:49 +0300