Slaren
c03ae8dca1
Add mmap support for model files
2023-03-30 12:28:25 -07:00
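Note (illustrative sketch, not the code from this commit): memory-mapping lets a process read a large file without copying it into a buffer first, because the OS loads pages on demand. The sketch below shows the general idea with Python's standard `mmap` module. The file contents, the magic value, and the helper name `read_magic_mmap` are made up for the example.

```python
import mmap
import os
import struct
import tempfile


def read_magic_mmap(path):
    """Map a file read-only and decode its first 4 bytes as a little-endian uint32."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the OS pages data in lazily on access.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            (magic,) = struct.unpack_from("<I", mm, 0)
            return magic, len(mm)


# Build a small stand-in "model" file: a 4-byte header followed by padding.
# The magic value is hypothetical and chosen only for this example.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(struct.pack("<I", 0x12345678) + b"\x00" * 1024)

magic, size = read_magic_mmap(demo_path)
os.remove(demo_path)
print(hex(magic), size)
```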
Stephan Walter
3bcc129ba8
cmake : properly invoke CTest ( #629 )
master-3bcc129
2023-03-30 20:56:59 +03:00
Casey Primozic
a4755cf288
Remove unused variable ( #607 )
...
* It seems some new warnings were added recently that exposed this. I wrote the code that included this unused variable originally and it is indeed not needed.
master-a4755cf
2023-03-30 17:53:35 +00:00
david raistrick
1f0414feec
make : fix darwin f16c flags check ( #615 )
...
...there was no check. Ported upstream from https://github.com/zanussbaum/gpt4all.cpp/pull/2 (I don't see any clean path for upstream patches)
master-1f0414f
2023-03-30 20:34:45 +03:00
Georgi Gerganov
77efdf5a50
ggml : fix NEON signs ( close #620 , #622 )
master-77efdf5
2023-03-30 20:27:32 +03:00
slaren
ed3c680bcd
Fix GGML_F32Cx8_STORE in AVX without F16C path ( #619 )
master-ed3c680
2023-03-30 11:16:30 +02:00
anzz1
9cbc404ba6
ci : re-enable AVX512 testing (Windows-MSVC) ( #584 )
...
* CI: Re-enable AVX512 testing (Windows-MSVC)
Now with 100% less base64 encoding
* plain __cpuid is enough here
master-9cbc404
2023-03-29 23:44:39 +03:00
Georgi Gerganov
b51c717d5c
ggml : init time on first ggml_init() call
master-b51c717
2023-03-29 22:15:34 +03:00
Georgi Gerganov
0ba76c1e73
llama : fix compile warnings when reading the vocab
master-0ba76c1
2023-03-29 22:13:12 +03:00
Georgi Gerganov
cea1c85948
ggml : add ARM_NEON dequantize_row_q4_1()
master-cea1c85
2023-03-29 22:10:01 +03:00
Georgi Gerganov
f202ada131
ggml : add ARM_NEON quantize_row_q4_1()
master-f202ada
2023-03-29 22:03:07 +03:00
Georgi Gerganov
3b44d30d9b
ggml : add ARM_NEON ggml_vec_dot_q4_1()
2023-03-29 22:03:07 +03:00
Pavol Rusnak
61cbfff5c9
rename convert_ggml_to_pth.py -> convert-ggml-to-pth.py ( #600 )
...
to match filenames of other converters
2023-03-29 20:09:25 +02:00
Thérence
d9ad104440
Create chat-13B.bat ( #592 )
...
* Create chat-13B.bat
Same script as chat-13B.sh, but for Windows users.
Tested and working on Windows 10/11 version 22H2
* Apply suggestions from code review
---------
Co-authored-by: anzz1 <anzz1@live.com>
2023-03-29 20:21:09 +03:00
Georgi Gerganov
b467702b87
readme : fix typos
2023-03-29 19:38:31 +03:00
Georgi Gerganov
516d88e75c
readme : add GPT4All instructions ( close #588 )
2023-03-29 19:37:20 +03:00
Georgi Gerganov
53635c081c
py : add GPT4All conversion script
...
For now: copy-paste
Too much time for me to deduplicate the python code
2023-03-29 19:29:52 +03:00
Maël Kerbiriou
41318d708e
llama : use the same threshold for OpenBLAS and ggml thread limiting ( #577 )
2023-03-29 19:10:07 +03:00
Tobias Lütke
a6956b25a1
add example of re-act pattern ( #583 )
...
* add example of re-act pattern
* spelling...
* fixed whitespace in reverse prompt issue
2023-03-29 10:10:24 -05:00
anzz1
83df5639eb
Fix GCC warning about binary literal ( #595 )
...
0b10101010 -> 0xAA /* 0b10101010 */
master-83df563
2023-03-29 13:20:07 +00:00
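Note (illustrative sketch, not part of the commit): the rewrite above replaces a binary literal with the equivalent hexadecimal literal and keeps the bit pattern in a comment. Binary literals were a compiler extension in C before C23, which is why a compiler may warn about them. The helper name `to_hex_with_comment` below is hypothetical. It only demonstrates that the two spellings denote the same value.

```python
def to_hex_with_comment(binary_literal):
    """Rewrite a '0b...' literal as a hex literal, keeping the bits in a C comment."""
    value = int(binary_literal, 2)  # base 2 accepts the optional '0b' prefix
    return f"0x{value:X} /* {binary_literal} */"


# Both spellings denote the same integer.
same_value = 0b10101010 == 0xAA
rewritten = to_hex_with_comment("0b10101010")
print(same_value, rewritten)
```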
anzz1
a5c42c4b13
Fix typo in llama.h ( #593 )
master-a5c42c4
2023-03-29 13:19:29 +00:00
anzz1
5a5f8b1501
Enable Fused-Multiply-Add (FMA) and F16C/CVT16 vector extensions on MSVC ( #375 )
...
* Enable Fused-Multiply-Add (FMA) instructions on MSVC
__FMA__ macro does not exist in MSVC
* Enable F16C/CVT16 vector extensions on MSVC
__F16C__ macro does not exist in MSVC, but is implied with AVX2/AVX512
* MSVC cvt intrinsics
* Add __SSE3__ macro for MSVC too because why not
even though it's not currently used for anything when AVX is defined
master-5a5f8b1
2023-03-28 22:44:29 +03:00
anzz1
f1217055ea
CI: fix subdirectory path globbing ( #546 )
...
- Changes in subdirectories will now be detected properly
- (Windows-MSVC) AVX512 tests temporarily disabled
master-f121705
2023-03-28 22:43:25 +03:00
anzz1
7f4c5c6651
llama : fix linkage with mingw ( #551 )
...
* Revert 7e53955 (#542 )
Still needs to be fixed properly
* Fix linking on mingw32
master-7f4c5c6
2023-03-28 21:23:09 +03:00
slaren
2a98bc18ea
ggml : add AVX2 implementation of quantize_row_q4_1 ( #515 )
...
* Add AVX2 implementation of quantize_row_q4_1
* Actually use AVX2
* Make quantize_row_q4_1 static
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
master-2a98bc1
2023-03-28 21:06:03 +03:00
thement
d0aaff571c
py : add temporary script to convert old ggml files to newer version ( #539 )
...
Co-authored-by: Jakub Horak <jakub.horak@ibawizard.net>
master-d0aaff5
2023-03-28 20:55:42 +03:00
Tai Duc Nguyen
d0330fd783
py : add capability to convert from ggml back to torch or hf format for further consumption/training/finetuning ( #403 )
2023-03-28 20:51:29 +03:00
Stephan Walter
99c5b27654
ggml : refactor quantized processing functions ( #509 )
...
* Refactor quantized processing functions
* ggml : minor
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
master-99c5b27
2023-03-28 20:13:01 +03:00
DooWoong Lee (David)
692ce3164e
py : removed unused model variable and verified that the code functions correctly with vocab_only setting. Also confirmed that the code works as expected after running with reduced memory usage due to deletion of no-longer-needed variable. ( #547 )
2023-03-28 20:02:34 +03:00
Georgi Gerganov
96f9c0506f
ci : make ctest verbose, hopefully we see what is wrong with the sanitizer
master-96f9c05
2023-03-28 20:01:09 +03:00
Georgi Gerganov
d502bc7c9d
tests : free llama context at the end of the test
master-d502bc7
2023-03-28 19:51:55 +03:00
Stephan Walter
436e561931
all : be more strict about converting float to double ( #458 )
...
* Be more strict about converting float to double
* Test equivalence of round, SILU implementations
Test module is commented out in CMakeLists.txt because the tests may
take a long time, depending on how much the compiler optimizes.
* Fix softmax in perplexity.cpp
* all : prefer float over double where appropriate
* perplexity : add <cmath>
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
master-436e561
2023-03-28 19:48:20 +03:00
Jed Fox
20e1e84884
deploy : add a Package.swift for SwiftPM support ( #393 )
...
* Add a Package.swift for SwiftPM support
* Swap from exclusions to allowlist
master-20e1e84
2023-03-28 19:39:01 +03:00
Stephan Walter
c1f885067c
ggml : introduce structs for the q4 data blocks ( #356 )
...
* Introduce structs for the q4 data blocks
* ggml : rename quant struct variables + fix ARM_NEON
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
master-c1f8850
2023-03-28 18:56:03 +03:00
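Note (illustrative sketch, not the code from this commit): grouping a quantization block's fields into one record makes the byte layout explicit instead of relying on manual pointer offsets. The sketch below assumes a block of 32 values stored as one float32 scale followed by 16 bytes of packed 4-bit values. That layout, the constant `QK`, and both function names are assumptions made for illustration.

```python
import struct

QK = 32  # assumed number of values per block


def quantize_block(values):
    """Pack QK floats into 4 bytes of scale plus QK/2 bytes of 4-bit codes."""
    assert len(values) == QK
    amax = max(abs(v) for v in values)
    d = amax / 7 if amax else 0.0
    inv = 1.0 / d if d else 0.0
    # Shift the signed range [-7, 7] by 8 so every code fits in an unsigned nibble.
    codes = [min(15, max(0, round(v * inv) + 8)) for v in values]
    packed = bytes(codes[i] | (codes[i + 1] << 4) for i in range(0, QK, 2))
    return struct.pack("<f", d) + packed


def dequantize_block(block):
    """Invert quantize_block: unpack the scale, then expand each nibble."""
    (d,) = struct.unpack_from("<f", block, 0)
    out = []
    for byte in block[4:]:
        out.append(((byte & 0x0F) - 8) * d)
        out.append(((byte >> 4) - 8) * d)
    return out


sample = [float(i - 16) for i in range(QK)]
block = quantize_block(sample)
restored = dequantize_block(block)
print(len(block), max(abs(a - b) for a, b in zip(sample, restored)))
```

Under these assumptions each block takes 20 bytes for 32 values, and the round-trip error per value stays within half of the scale.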
Georgi Gerganov
e0670260fb
gitignore : add "embedding"
2023-03-28 18:34:35 +03:00
dotpy314
28ba975aea
Check the existence of f16_model_path_base in quantize.py ( #574 )
...
Co-authored-by: Jincheng Miao <jincheng.miao@gmail.com>
2023-03-28 18:06:28 +03:00
slaren
a6bdc47cba
Fix usage of F16C intrinsics in AVX code ( #563 )
...
* Fix usage of F16C intrinsics in AVX code when F16C is not defined
master-a6bdc47
2023-03-28 17:26:55 +03:00
anzz1
7b8dbcb78b
main.cpp fixes, refactoring ( #571 )
...
- main: entering empty line passes back control without new input in interactive/instruct modes
- instruct mode: keep prompt fix
- instruct mode: duplicate instruct prompt fix
- refactor: move common console code from main->common
master-7b8dbcb
2023-03-28 17:09:55 +03:00
RJ Adriaansen
4b8efff0e3
Add embedding example to Makefile ( #540 )
master-4b8efff
2023-03-28 09:11:09 +03:00
Marco Matthies
7e5395575a
Fix missing ggml link in cmake for examples/* on w64-mingw32 ( #542 )
2023-03-27 07:55:26 +03:00
Erik Scholz
34c1072e49
ci: add debug build to sanitizer build matrix ( #527 )
master-34c1072
2023-03-26 15:48:40 +00:00
Stephan Walter
939ad2d3a5
Fix undefined variables in debug build, remove unused variables ( #531 )
master-939ad2d
2023-03-26 15:34:02 +00:00
Juan Calderon-Perez
8c2ec5e21d
Add support for linux/arm64 platform during Docker Builds ( #514 )
...
* Add support for linux/arm64 platform
* Add platform to versioned builds
master-8c2ec5e
2023-03-26 14:48:42 +00:00
Stephan Walter
b391579db9
Update README and comments for standalone perplexity tool ( #525 )
master-b391579
2023-03-26 16:14:01 +03:00
anzz1
7a87d31f4f
[main] fix infinite generation (-n == -1) ( #523 )
master-7a87d31
2023-03-26 16:06:10 +03:00
Georgi Gerganov
348d6926ee
Add logo to README.md
2023-03-26 10:20:49 +03:00
Harald Fernengel
33e35b8fe8
Exit from interactive mode if input stream is bad ( #491 )
...
Allow exiting the interactive prompt also with CTRL-D on Unix and CTRL-Z
on Windows.
master-33e35b8
2023-03-26 08:25:46 +03:00
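Note (illustrative sketch, not the code from this commit): an interactive loop should stop when the input stream reaches end of file, which is what CTRL-D on Unix or CTRL-Z on Windows signals. In Python, `readline()` returns an empty string only at end of input, while a blank line entered by the user arrives as "\n", so the two cases can be told apart. The function name `read_interactive` is hypothetical.

```python
import io


def read_interactive(stream):
    """Collect lines until the stream reports end of input."""
    lines = []
    while True:
        line = stream.readline()
        if line == "":  # end of input, not merely a blank line
            break
        lines.append(line.rstrip("\n"))
    return lines


# A StringIO stands in for a terminal whose user pressed CTRL-D after two lines.
collected = read_interactive(io.StringIO("hello\n\n"))
print(collected)
```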
anzz1
19726169b3
CI: Run other sanitizer builds even if one fails ( #511 )
...
applies only to sanitizer builds so they won't be cancelled
master-1972616
2023-03-26 00:13:28 +02:00
jp-x-g
f732695cd5
Clarify console output in convert-pth-to-ggml.py ( #512 )
...
"Processing part 1 of 3" instead of "Processing part 0"
2023-03-25 23:53:55 +02:00
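Note (illustrative sketch, not the code from this commit): the clearer message above comes from counting parts from one and including the total. A minimal way to produce such messages is `enumerate` with `start=1`. The part names and the function name `progress_messages` are made up for the example.

```python
def progress_messages(parts):
    """Return one human-readable progress line per part, counted from 1."""
    total = len(parts)
    return [
        f"Processing part {index} of {total}"
        for index, _ in enumerate(parts, start=1)
    ]


# Hypothetical part file names, used only to have something to iterate over.
messages = progress_messages(["part-a", "part-b", "part-c"])
for message in messages:
    print(message)
```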
anzz1
2f7bf7dd7c
CMake / CI additions ( #497 )
...
* CMake: Add AVX512 option
* CI: Add AVX/AVX512 builds (Windows)
(AVX512 tests can only be run when the worker happens to support it, building works anyway)
* CMake: Fix sanitizer linkage ( merged #468 )
* CI: Add sanitizer builds (Ubuntu)
* CI: Fix release tagging
(change @zendesk/action-create-release to @anzz1/action-create-release until the upstream PR "Added commitish as input" (zendesk/action-create-release#32) is merged)
master-2f7bf7d
2023-03-25 23:38:11 +02:00