Commit Graph

  • c03ae8dca1 Add mmap support for model files Slaren 2023-03-29 02:03:43 +02:00
  • 516474b465 Introduce GGML migration tool for new file format Justine Tunney 2023-03-30 05:42:56 -07:00
  • 85e8395944 SWAP info added Jaime R 2023-03-30 21:25:21 +03:00
  • 3bcc129ba8 cmake : properly invoke CTest (#629) master-3bcc129 Stephan Walter 2023-03-30 17:56:59 +00:00
  • a4755cf288 Remove unused variable (#607) master-a4755cf Casey Primozic 2023-03-30 10:53:35 -07:00
  • 1f0414feec make : fix darwin f16c flags check (#615) master-1f0414f david raistrick 2023-03-30 13:34:45 -04:00
  • 77efdf5a50 ggml : fix NEON signs (close #620, #622) master-77efdf5 Georgi Gerganov 2023-03-30 20:27:32 +03:00
  • 44aea7752b Properly invoke CTest Stephan Walter 2023-03-30 19:06:54 +02:00
  • 80dad7923e ggml : refactor AVX part of ggml_vec_dot_q4_0() Sergey Pershukov 2023-03-30 21:44:34 +05:00
  • 93c8dcae75 Update README.md saharNooby 2023-03-30 20:37:09 +04:00
  • 56bf4fc856 Implement time mixing, fix matrix shape mismatch saharNooby 2023-03-30 20:29:41 +04:00
  • 873cb954d0 Make ln0 work correctly saharNooby 2023-03-30 20:01:26 +04:00
  • 9bbf2180a8 Merge branch 'ggerganov:master' into black Siddhesh Thakur 2023-03-30 10:15:42 -04:00
  • 2f51451561 Initial commit saharNooby 2023-03-30 17:55:30 +04:00
  • ed3c680bcd Fix GGML_F32Cx8_STORE in AVX without F16C path (#619) master-ed3c680 slaren 2023-03-30 11:16:30 +02:00
  • b223ef18b8 Fix GGML_F32Cx8_STORE in AVX without F16C path Slaren 2023-03-30 10:57:53 +02:00
  • a45e843efa Ensure --mlock works properly with mmap() support Justine Tunney 2023-03-30 01:53:36 -07:00
  • 75d1e55134 Make loading weights 10-100x faster Justine Tunney 2023-03-29 13:51:37 -07:00
  • 93a3169284 ggml : add AVX ggml_vec_dot_q4_0() Sergey Pershukov 2023-03-30 09:50:40 +05:00
  • 79e14129e1 ggml : add AVX quantize_row_q4_0() Sergey Pershukov 2023-03-30 09:43:27 +05:00
  • 354d4f232f fixed linux openblas build errors Concedo 2023-03-30 11:55:35 +08:00
  • 977a9a246f Merge remote-tracking branch 'origin/master' into concedo Concedo 2023-03-30 09:42:51 +08:00
  • 0f5b470c04 more library checks Concedo 2023-03-30 09:28:04 +08:00
  • 4bab8cb243 fix darwin f16c flags check david raistrick 2023-03-29 20:31:18 -04:00
  • 80c2178d04 Initial windows support (untested) Slaren 2023-03-29 22:22:36 +02:00
  • 812cfa1995 Always initialize mm_addr and mm_length in llama_model Slaren 2023-03-29 08:53:14 +02:00
  • 4daaa5e792 Unmap the file in llama_free Slaren 2023-03-29 08:31:26 +02:00
  • 4ae12d0824 Make mmap_file static Slaren 2023-03-29 06:18:18 +02:00
  • a1e0f17a05 Fix ggml_init_params in quantize Slaren 2023-03-29 05:38:57 +02:00
  • 2a6cef62b3 Add mmap support for model files Slaren 2023-03-29 02:03:43 +02:00
  • dc5adf173a Windows: convert prompt in system locale to UTF-8. Allows to use others languages without tambourine dancing... Dmitriy Prikhodko 2023-03-30 04:22:45 +05:00
  • 81b8748c98 blacked flake.lock Geeks-sid 2023-03-29 19:17:45 -04:00
  • 172febff3f blacked quantize Geeks-sid 2023-03-29 19:16:48 -04:00
  • 382c0c6100 blacked unversioned-ggml-to-ggml Geeks-sid 2023-03-29 19:16:32 -04:00
  • efab7f8bad blacked gptq-to-ggml Geeks-sid 2023-03-29 19:16:04 -04:00
  • eec105a86a blacked gpt4all-to-ggml Geeks-sid 2023-03-29 19:15:37 -04:00
  • dfa2d707e9 apply black to ggml-to-pth Geeks-sid 2023-03-29 19:15:06 -04:00
  • 842abc7da9 Remove unused variable Casey Primozic 2023-03-29 14:11:29 -07:00
  • 9cbc404ba6 ci : re-enable AVX512 testing (Windows-MSVC) (#584) master-9cbc404 anzz1 2023-03-29 23:44:39 +03:00
  • f789d8d50a Initial windows support (untested) Slaren 2023-03-29 22:22:36 +02:00
  • b51c717d5c ggml : init time on first ggml_init() call master-b51c717 Georgi Gerganov 2023-03-29 22:15:34 +03:00
  • 0ba76c1e73 llama : fix compile warnings when reading the vocab master-0ba76c1 Georgi Gerganov 2023-03-29 22:13:12 +03:00
  • cea1c85948 ggml : add ARM_NEON dequantize_row_q4_1() master-cea1c85 Georgi Gerganov 2023-03-29 22:10:01 +03:00
  • f202ada131 ggml : add ARM_NEON quantize_row_q4_1() master-f202ada Georgi Gerganov 2023-03-29 22:03:02 +03:00
  • 3b44d30d9b ggml : add ARM_NEON ggml_vec_dot_q4_1() Georgi Gerganov 2023-03-29 21:47:33 +03:00
  • 61cbfff5c9 rename convert_ggml_to_pth.py -> convert-ggml-to-pth.py (#600) Pavol Rusnak 2023-03-29 20:09:25 +02:00
  • d9ad104440 Create chat-13B.bat (#592) Thérence 2023-03-29 19:21:09 +02:00
  • 536971bade Apply suggestions from code review anzz1 2023-03-29 20:20:02 +03:00
  • c447de6937 rename convert_ggml_to_pth.py -> convert-ggml-to-pth.py Pavol Rusnak 2023-03-29 19:06:19 +02:00
  • d8febc8653 renamed main python script Concedo 2023-03-30 00:48:44 +08:00
  • 664b277c27 integrated libopenblas for greatly accelerated prompt processing. Windows binaries are included - feel free to build your own or to build for other platforms, but that is beyond the scope of this repo. Will fall back to non-blas if libopenblas is removed. Concedo 2023-03-30 00:43:52 +08:00
  • b467702b87 readme : fix typos Georgi Gerganov 2023-03-29 19:38:31 +03:00
  • 516d88e75c readme : add GPT4All instructions (close #588) Georgi Gerganov 2023-03-29 19:37:20 +03:00
  • 53635c081c py : add GPT4All conversion script Georgi Gerganov 2023-03-29 19:29:26 +03:00
  • 41318d708e llama : use the same threshold for OpenBLAS and ggml thread limiting (#577) Maël Kerbiriou 2023-03-29 18:10:07 +02:00
  • a6956b25a1 add example of re-act pattern (#583) Tobias Lütke 2023-03-29 17:10:24 +02:00
  • 83df5639eb Fix GCC warning about binary literal (#595) master-83df563 anzz1 2023-03-29 16:20:07 +03:00
  • a5c42c4b13 Fix typo in llama.h (#593) master-a5c42c4 anzz1 2023-03-29 16:19:29 +03:00
  • 49c4c225b5 Merge branch 'master' into concedo Concedo 2023-03-29 21:08:03 +08:00
  • 271307232c Merged PR with a few changes: Concedo 2023-03-29 20:38:57 +08:00
  • b7a3365f4a Fix GCC warning about binary literal anzz1 2023-03-29 15:28:26 +03:00
  • 73071045c9 Fix typo in llama.h anzz1 2023-03-29 14:22:50 +03:00
  • b3a360d80c Create chat-13B.bat Thérence 2023-03-29 10:06:56 +02:00
  • ff9b824a2a fixed whitespace in reverse prompt issue Tobias Lütke 2023-03-29 09:56:11 +02:00
  • 3f5f4286dd Always initialize mm_addr and mm_length in llama_model Slaren 2023-03-29 08:53:14 +02:00
  • baa529e9c0 Unmap the file in llama_free Slaren 2023-03-29 08:31:26 +02:00
  • e6f1c19937 Make mmap_file static Slaren 2023-03-29 06:18:18 +02:00
  • 7961493a40 Fix ggml_init_params in quantize Slaren 2023-03-29 05:38:57 +02:00
  • ef9afe1540 Add mmap support for model files Slaren 2023-03-29 02:03:43 +02:00
  • 13b4c05d66 Some more code cleanup InconsolableCellist 2023-03-28 16:59:27 -06:00
  • 1041ddb2cd apply PR suggestions Tristan Carel 2023-03-28 22:36:56 +02:00
  • 08121f3aa8 parallelize the quantization process Tristan Carel 2023-03-28 17:46:43 +02:00
  • 9a1ded757b plain __cpuid is enough here anzz1 2023-03-29 01:24:48 +03:00
  • bb54708e40 CI: Re-enable AVX512 testing (Windows-MSVC) anzz1 2023-03-29 01:01:06 +03:00
  • c9c820ff36 Added support for _POSIX_MAPPED_FILES if defined in source (#564) mmap CoderRC 2023-03-28 17:26:25 -04:00
  • 88c6535377 spelling... Tobias Lütke 2023-03-28 23:20:11 +02:00
  • c6e8014062 add example of re-act pattern Tobias Lütke 2023-03-28 23:06:33 +02:00
  • 5a5f8b1501 Enable Fused-Multiply-Add (FMA) and F16C/CVT16 vector extensions on MSVC (#375) master-5a5f8b1 anzz1 2023-03-28 22:44:29 +03:00
  • f1217055ea CI: fix subdirectory path globbing (#546) master-f121705 anzz1 2023-03-28 22:43:25 +03:00
  • 13addf2a78 Merge branch 'concedo' of github.com:InconsolableCellist/llamacpp-for-kobold into concedo InconsolableCellist 2023-03-28 13:43:19 -06:00
  • f7c905b0d0 Minor overhaul of code: InconsolableCellist 2023-03-28 13:39:34 -06:00
  • 003365907d updating to version 17 of embedded koboldAI, and adding host address support InconsolableCellist 2023-03-28 13:39:10 -06:00
  • 8765be59a9 Update build.yml anzz1 2023-03-28 22:07:08 +03:00
  • 38bc9cef4e Merge branch 'mmap' into mmap CoderRC 2023-03-28 15:03:48 -04:00
  • 7f4c5c6651 llama : fix linkage with mingw (#551) master-7f4c5c6 anzz1 2023-03-28 21:23:09 +03:00
  • 2a98bc18ea ggml : add AVX2 implementation of quantize_row_q4_1 (#515) master-2a98bc1 slaren 2023-03-28 20:06:03 +02:00
  • 6ab328d88c Make quantize_row_q4_1 static slaren 2023-03-28 20:05:37 +02:00
  • d0aaff571c py : add temporary script to convert old ggml files to newer version (#539) master-d0aaff5 thement 2023-03-28 19:55:42 +02:00
  • d0330fd783 py : add capabiliy to convert from ggml back to torch or hf format for further consumption/training/finetuning (#403) Tai Duc Nguyen 2023-03-28 13:51:29 -04:00
  • 41669f67d8 Actually use AVX2 Slaren 2023-03-28 19:45:59 +02:00
  • e29652996b Add AVX2 implementation of quantize_row_q4_1 Slaren 2023-03-28 19:41:28 +02:00
  • 1a5ee11377 Restore old -std= flags Justine Tunney 2023-03-28 10:36:25 -07:00
  • 1631298475 Remove -std=foo compiler flags Justine Tunney 2023-03-28 10:23:34 -07:00
  • 99c5b27654 ggml : refactor quantized processing functions (#509) master-99c5b27 Stephan Walter 2023-03-28 17:13:01 +00:00
  • 1229722c61 Merge branch 'master' into q-refactor Georgi Gerganov 2023-03-28 20:11:56 +03:00
  • a0c2401359 ggml : minor Georgi Gerganov 2023-03-28 20:10:14 +03:00
  • cbddf4661b Get mmap() working with WIN32 MSVC Justine Tunney 2023-03-28 09:27:41 -07:00
  • 692ce3164e py : removed unused model variable and verified that the code functions correctly with vocab_only setting. Also confirmed that the code works as expected after running with reduced memory usage due to deletion of no-longer-needed variable. (#547) DooWoong Lee (David) 2023-03-29 02:02:34 +09:00
  • 96f9c0506f ci : make ctest verbose, hopefully we see what is wrong with the sanitizer master-96f9c05 Georgi Gerganov 2023-03-28 20:01:09 +03:00
  • d502bc7c9d tests : free llama context at the end of the test master-d502bc7 Georgi Gerganov 2023-03-28 19:51:55 +03:00