Commit Graph

  • 445c0cd75a Update README.md Shreyas T 2023-04-02 09:53:31 +0530
  • 5b70e7de4c fix default params for examples/main (#697) master-5b70e7d Murilo Santana 2023-04-01 23:41:12 -0300
  • cd12f6e7be Change sys.exit(1) to raise SystemExit(1) mrcichon 2023-04-02 04:11:39 +0200
  • f87d539a6d fix default params for examples/main Murilo Santana 2023-04-01 22:29:16 -0300
  • 452b6ab115 Use safe loading for .pth checkpoint albanD 2023-04-01 20:25:28 -0400
  • d282143a87 Add a missing step to the gpt4all instructions Thatcher Chamberlin 2023-04-01 17:04:52 -0400
  • 9523d72b56 top_k = 1 since it is an integer Fabio Rossini Sluzala 2023-04-01 17:21:37 -0300
  • 3300247e97 Fix for temp == 0 Fabio Rossini Sluzala 2023-04-01 17:01:12 -0300
  • f370a670be Reviewer suggestion: Moved to examples Sebastian Apel 2023-04-01 21:18:42 +0200
  • 5833baeeec Feature: Param for number of iterations, Bugfix for use of parameter threads Sebastian Apel 2023-04-01 21:06:47 +0200
  • 100dc551e1 Review comment: Removed set_locale Sebastian Apel 2023-04-01 20:33:11 +0200
  • 6e691af997 Reviewer input: removed rtsc, use epsilon for check Sebastian Apel 2023-04-01 20:27:19 +0200
  • d3bc4df97d fix windows build Vladimir 2023-04-01 19:45:33 +0200
  • a65d37ad36 using github Pithikos/C-Thread-Pool for threading Vladimir 2023-03-31 21:03:48 +0200
  • 21e88c8b0f run sanitizers in release, otherwise too slow (#5) Vladimir 2023-04-01 20:16:36 +0200
  • b1f08813e3 added support for gpt4all original format Concedo 2023-04-02 00:53:46 +0800
  • a463fb7668 Update llama.h Christian Falch 2023-04-01 18:46:47 +0200
  • 17f463a083 Update llama.h Christian Falch 2023-04-01 18:46:37 +0200
  • f411251bcf Update llama.cpp Christian Falch 2023-04-01 18:46:24 +0200
  • a0c895c087 Update llama.cpp Christian Falch 2023-04-01 18:46:14 +0200
  • a717cba844 py: huggingface -> Hugging Face (#686) Ikko Eltociear Ashimine 2023-04-02 01:38:18 +0900
  • c928ab8a38 Allow larger tensor sizes. Marian Cepok 2023-04-01 18:36:47 +0200
  • 38f9d02d52 Fix quantization from FP16 saharNooby 2023-04-01 20:01:06 +0400
  • 458f7cd7ab update convert-ggml-to-pth.py Ikko Eltociear Ashimine 2023-04-02 00:44:40 +0900
  • 14804b7978 Added api for retrieving and setting the kv cache chrfalch 2023-04-01 17:39:17 +0200
  • 972e28d48d Implement INT4 conversion and inference saharNooby 2023-04-01 19:22:01 +0400
  • 3ef747808a Be nice to CI machines by not allocating buffers Stephan Walter 2023-04-01 17:00:01 +0200
  • d0a7f742e7 readme: replace termux links with homepage, play store is deprecated (#680) rimoliga 2023-04-01 11:57:30 -0300
  • 889eaac8f8 Update README.md rimoliga 2023-04-01 11:47:34 -0300
  • 0d054e292e Show error message when -f fails master-0d054e2 Slaren 2023-03-31 20:03:48 +0200
  • b164bf4e27 Allocate memory as needed for specific configuration of model saharNooby 2023-04-01 17:15:23 +0400
  • ab9ad077c2 Update README.md rimoliga 2023-04-01 10:03:28 -0300
  • d5349f8735 Fix Windows build by not using variable array sizes Stephan Walter 2023-04-01 14:36:28 +0200
  • a1e1d34c93 Add Python wrapper for C library saharNooby 2023-04-01 16:02:22 +0400
  • 39f91e3f6e Clean up QK and file and tensor types Stephan Walter 2023-04-01 14:00:24 +0200
  • 7130a89d1f [FILE FORMAT CHANGED] Reverse dimensions in ggml file (makes it more similar to llama.cpp format) saharNooby 2023-04-01 14:41:30 +0400
  • ac03019fcf Move model to separate C library file saharNooby 2023-04-01 14:38:50 +0400
  • f6d45baec0 Support FP16 inference saharNooby 2023-04-01 11:53:49 +0400
  • fe98c94a63 [FILE FORMAT CHANGED] Use ggml_get_rows to get embedding saharNooby 2023-04-01 11:28:32 +0400
  • 16ec7a5c18 Add fail-fast version of the test saharNooby 2023-04-01 11:15:15 +0400
  • 0fcb7c64c6 Remove reference implementation code and test against pre-created logits saharNooby 2023-04-01 11:09:24 +0400
  • 0fb5433e97 miss some changes in the previous commit Howard Su 2023-04-01 15:09:06 +0800
  • bf88e8a246 Update README.md saharNooby 2023-04-01 10:12:10 +0400
  • 6fe9486cee Finally, FP32 inference saharNooby 2023-04-01 10:06:39 +0400
  • 085a9f90a7 still refactoring Concedo 2023-04-01 11:56:34 +0800
  • 6e6125ebdb updated pyinstaller to clean temp dir,removed warning flags from makefile because they are just clutter. Concedo 2023-04-01 09:25:41 +0800
  • 9ab6e87b58 Merge branch 'master' into concedo Concedo 2023-04-01 09:05:45 +0800
  • 801b178f2a still refactoring, but need a checkpoint to prepare build for 1.0.7 Concedo 2023-04-01 08:55:14 +0800
  • 6a498f0d79 Remove torch GPU dependencies bsilvereagle 2023-03-31 15:51:41 -0700
  • b09da81d52 Use getopts for example scripts Ben Siraphob 2023-03-31 17:37:33 -0500
  • 3525899277 Enable -std= for cmake builds, fix warnings (#598) master-3525899 Stephan Walter 2023-03-31 19:19:16 +0000
  • e80b06305d Enable -std= for cmake builds, fix warnings Stephan Walter 2023-03-29 17:44:04 +0200
  • 7e30f52600 examples: add gpt4all script Leonardo Neumann 2023-03-31 15:51:20 -0300
  • 2d2d61568c Show error message when -f fails Slaren 2023-03-31 20:03:48 +0200
  • 41e8d2b434 Move constant out of loop Howard Su 2023-04-01 01:51:44 +0800
  • 8febfc73af Fix inplace version of operators Howard Su 2023-04-01 01:26:48 +0800
  • 6b86f5ea22 halfway refactoring, wip adding other model types Concedo 2023-04-01 01:13:05 +0800
  • 61c6b1a4e0 Add comparison against reference implementation script, implement state & logits saving saharNooby 2023-03-31 20:23:42 +0400
  • d00f28581a Add reference implementation of RWKV RNN saharNooby 2023-03-31 19:57:16 +0400
  • 1d08882afa Optimize AVX2 ggml_vec_dot_q4_0 (#642) master-1d08882 slaren 2023-03-31 17:55:52 +0200
  • fd2f59a03d Reviewer requests: added parameter for threads, switched to ggml_time_us() Sebastian Apel 2023-03-31 17:38:19 +0200
  • 02c9946b57 Update README.md saharNooby 2023-03-31 19:06:31 +0400
  • 01d667f066 Implement exp, max, 1_minus_x, sigmoid operators in ggml saharNooby 2023-03-31 19:04:35 +0400
  • 3b7dcc0fc8 Bugfix: Added dependency to ggml.o to benchmark Sebastian Apel 2023-03-31 16:58:54 +0200
  • 2877517fe8 Merge branch 'ggerganov:master' into master SebastianApel 2023-03-31 16:44:48 +0200
  • bcf363cb53 Optimize model to leverage inplace to avoid create new tensor Howard Su 2023-03-31 22:42:09 +0800
  • ed5f4fe00e Initial version of q4_0 matrix multiplication benchmark Sebastian Apel 2023-03-31 16:39:39 +0200
  • fed6b5da76 Fix memory bugs in loading code Justine Tunney 2023-03-30 19:43:41 -0700
  • 02c5b27e91 Add AVX acceleration (#617) master-02c5b27 perserk 2023-03-31 16:55:44 +0500
  • 56949197fe added HF converter base Concedo 2023-03-31 19:10:21 +0800
  • 17044257a0 Merge branch 'master' into concedo Concedo 2023-03-31 19:04:47 +0800
  • 559a1967f7 Backwards compatibility formats all done Concedo 2023-03-31 19:01:33 +0800
  • 9eab39fe6d prepare legacy functions (+1 squashed commits) Concedo 2023-03-31 16:37:39 +0800
  • 40c7dd19e3 Use -march=native -mtune=native on x86. Also enables AVX512 on macOS. Fabian 2023-03-30 00:08:23 +0200
  • cbef542879 py : cleanup the code Pavol Rusnak 2023-03-29 21:31:24 +0200
  • 79f9743347 improved console info, fixed utf encoding bugs Concedo 2023-03-31 15:38:38 +0800
  • fe272dc3d3 Minor changes saharNooby 2023-03-31 10:24:12 +0400
  • 3e90f37626 Optimize AVX2 ggml_vec_dot_q4_0 Slaren 2023-03-31 02:51:49 +0200
  • 1604abdad2 py : cleanup the code Pavol Rusnak 2023-03-29 21:31:24 +0200
  • 9733104be5 drop quantize.py (now that models are using a single file) Pavol Rusnak 2023-03-31 00:52:06 +0200
  • f4c4d29d72 drop quantize.py (now that models are using a single file) Pavol Rusnak 2023-03-31 00:52:06 +0200
  • e968c80f5d Link with cblas when LLAMA_OPENBLAS is enabled. KerfuffleV2 2023-03-30 13:42:15 -0600
  • 3df890aef4 readme : update supported models Georgi Gerganov 2023-03-30 22:31:54 +0300
  • ee0c40dd6d Introduce GGML migration tool for new file format master-ee0c40d Justine Tunney 2023-03-30 05:42:56 -0700
  • 6f23ba5ee2 Ensure --mlock works properly with mmap() support Justine Tunney 2023-03-30 01:53:36 -0700
  • 78ca9838ee Make loading weights 10-100x faster Justine Tunney 2023-03-29 13:51:37 -0700
  • a017390358 Initial windows support (untested) Slaren 2023-03-29 22:22:36 +0200
  • ac184d5147 Always initialize mm_addr and mm_length in llama_model Slaren 2023-03-29 08:53:14 +0200
  • 276e5b7811 Unmap the file in llama_free Slaren 2023-03-29 08:31:26 +0200
  • d68c5dc435 Make mmap_file static Slaren 2023-03-29 06:18:18 +0200
  • 64bde3ffd4 Fix ggml_init_params in quantize Slaren 2023-03-29 05:38:57 +0200
  • c03ae8dca1 Add mmap support for model files Slaren 2023-03-29 02:03:43 +0200
  • 516474b465 Introduce GGML migration tool for new file format Justine Tunney 2023-03-30 05:42:56 -0700
  • 85e8395944 SWAP info added Jaime R 2023-03-30 21:25:21 +0300
  • 3bcc129ba8 cmake : properly invoke CTest (#629) master-3bcc129 Stephan Walter 2023-03-30 17:56:59 +0000
  • a4755cf288 Remove unused variable (#607) master-a4755cf Casey Primozic 2023-03-30 10:53:35 -0700
  • 1f0414feec make : fix darwin f16c flags check (#615) master-1f0414f david raistrick 2023-03-30 13:34:45 -0400
  • 77efdf5a50 ggml : fix NEON signs (close #620, #622) master-77efdf5 Georgi Gerganov 2023-03-30 20:27:32 +0300
  • 44aea7752b Properly invoke CTest Stephan Walter 2023-03-30 19:06:54 +0200
  • 80dad7923e ggml : refactor AVX part of ggml_vec_dot_q4_0() Sergey Pershukov 2023-03-30 21:44:34 +0500
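
A listing in this shape can be regenerated from the repository itself. The sketch below is an assumption about the formatting, not the tool that produced this graph: it uses git log's `--pretty=format:` placeholders (`%h` abbreviated hash, `%s` subject, `%an` author name, `%ad` author date) to emit one bulleted line per commit.

```shell
# Sketch: print commits as "  • hash subject author date",
# matching the single-line entry style used above.
git log --pretty=format:'  • %h %s %an %ad' --date=iso
```

Note that the branch labels interleaved in some entries (e.g. `master-5b70e7d`) come from tags or decorations and would need `%d` or `--decorate` to reproduce; they are not part of the plain subject line.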