Author | Commit | Message | Date
oobabooga | 923c8e25fb | Bump llama-cpp-python to 0.2.18 (#4611) | 2023-11-16 22:55:14 -03:00
oobabooga | 1b69694fe9 | Add types to the encode/decode/token-count endpoints | 2023-11-07 19:32:14 -08:00
tdrussell | 72f6fc6923 | Rename additive_repetition_penalty to presence_penalty, add frequency_penalty (#4376) | 2023-10-25 12:10:28 -03:00
oobabooga | df90d03e0b | Replace --mul_mat_q with --no_mul_mat_q | 2023-10-22 12:23:03 -07:00
oobabooga | b6fe6acf88 | Add threads_batch parameter | 2023-10-01 21:28:00 -07:00
jllllll | 41a2de96e5 | Bump llama-cpp-python to 0.2.11 | 2023-10-01 18:08:10 -05:00
StoyanStAtanasov | 7e6ff8d1f0 | Enable NUMA feature for llama_cpp_python (#4040) | 2023-09-26 22:05:00 -03:00
oobabooga | 019371c0b6 | Lint | 2023-09-25 20:31:11 -07:00
oobabooga | 08cf150c0c | Add a grammar editor to the UI (#4061) | 2023-09-24 18:05:24 -03:00
oobabooga | 3edac43426 | Remove print statement | 2023-09-24 07:13:00 -07:00
oobabooga | b227e65d86 | Add grammar to llama.cpp loader (closes #4019) | 2023-09-24 07:10:45 -07:00
oobabooga | 2e7b6b0014 | Create alternative requirements.txt with AMD and Metal wheels (#4052) | 2023-09-24 09:58:29 -03:00
Cebtenzzre | 8466cf229a | llama.cpp: fix ban_eos_token (#3987) | 2023-09-18 12:15:02 -03:00
oobabooga | d9b0f2c9c3 | Fix llama.cpp double decoding | 2023-09-17 13:07:48 -07:00
oobabooga | ad8ac545a5 | Tokenization improvements | 2023-09-17 07:02:00 -07:00
saltacc | cd08eb0753 | token probs for non HF loaders (#3957) | 2023-09-17 10:42:32 -03:00
saltacc | f01b9aa71f | Add customizable ban tokens (#3899) | 2023-09-15 18:27:27 -03:00
oobabooga | ed86878f02 | Remove GGML support | 2023-09-11 07:44:00 -07:00
oobabooga | 7f5370a272 | Minor fixes/cosmetics | 2023-08-26 22:11:07 -07:00
jllllll | 4d61a7d9da | Account for deprecated GGML parameters | 2023-08-26 14:07:46 -05:00
jllllll | 4a999e3bcd | Use separate llama-cpp-python packages for GGML support | 2023-08-26 10:40:08 -05:00
oobabooga | 52ab2a6b9e | Add rope_freq_base parameter for CodeLlama | 2023-08-25 06:55:15 -07:00
Cebtenzzre | 942ad6067d | llama.cpp: make Stop button work with streaming disabled (#3620) | 2023-08-19 00:17:27 -03:00
oobabooga | 7cba000421 | Bump llama-cpp-python, +tensor_split by @shouyiwang, +mul_mat_q (#3610) | 2023-08-18 12:03:34 -03:00
oobabooga | 65aa11890f | Refactor everything (#3481) | 2023-08-06 21:49:27 -03:00
Pete | f4005164f4 | Fix llama.cpp truncation (#3400) (Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>) | 2023-08-03 20:01:15 -03:00
oobabooga | 87dab03dc0 | Add the --cpu option for llama.cpp to prevent CUDA from being used (#3432) | 2023-08-03 11:00:36 -03:00
oobabooga | b17893a58f | Revert "Add tensor split support for llama.cpp (#3171)" (reverts commit 031fe7225e) | 2023-07-26 07:06:01 -07:00
Shouyi | 031fe7225e | Add tensor split support for llama.cpp (#3171) | 2023-07-25 18:59:26 -03:00
oobabooga | a07d070b6c | Add llama-2-70b GGML support (#3285) | 2023-07-24 16:37:03 -03:00
jllllll | 1141987a0d | Add checks for ROCm and unsupported architectures to llama_cpp_cuda loading (#3225) | 2023-07-24 11:25:36 -03:00
oobabooga | 4b19b74e6c | Add CUDA wheels for llama-cpp-python by jllllll | 2023-07-19 19:33:43 -07:00
randoentity | a69955377a | [GGML] Support for customizable RoPE (#3083) (Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>) | 2023-07-17 22:32:37 -03:00
Gabriel Pena | eedb3bf023 | Add low vram mode on llama cpp (#3076) | 2023-07-12 11:05:13 -03:00
oobabooga | b6643e5039 | Add decode functions to llama.cpp/exllama | 2023-07-07 09:11:30 -07:00
EugeoSynthesisThirtyTwo | 7625c6de89 | fix usage of self in classmethod (#2781) | 2023-06-20 16:18:42 -03:00
Cebtenzzre | 59e7ecb198 | llama.cpp: implement ban_eos_token via logits_processor (#2765) | 2023-06-19 21:31:19 -03:00
oobabooga | 05a743d6ad | Make llama.cpp use tfs parameter | 2023-06-17 19:08:25 -03:00
oobabooga | 9f40032d32 | Add ExLlama support (#2444) | 2023-06-16 20:35:38 -03:00
oobabooga | 6015616338 | Style changes | 2023-06-06 13:06:05 -03:00
DGdev91 | cf088566f8 | Make llama.cpp read prompt size and seed from settings (#2299) | 2023-05-25 10:29:31 -03:00
oobabooga | c0fd7f3257 | Add mirostat parameters for llama.cpp (#2287) | 2023-05-22 19:37:24 -03:00
oobabooga | e116d31180 | Prevent unwanted log messages from modules | 2023-05-21 22:42:34 -03:00
Andrei | e657dd342d | Add in-memory cache support for llama.cpp (#1936) | 2023-05-15 20:19:55 -03:00
Jakub Strnad | 0227e738ed | Add settings UI for llama.cpp and fixed reloading of llama.cpp models (#2087) | 2023-05-15 19:51:23 -03:00
AlphaAtlas | 071f0776ad | Add llama.cpp GPU offload option (#2060) | 2023-05-14 22:58:11 -03:00
Ahmed Said | fbcd32988e | added no_mmap & mlock parameters to llama.cpp and removed llamacpp_model_alternative (#1649) (Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>) | 2023-05-02 18:25:28 -03:00
oobabooga | ea6e77df72 | Make the code more like PEP8 for readability (#862) | 2023-04-07 00:15:45 -03:00
oobabooga | 2c52310642 | Add --threads flag for llama.cpp | 2023-03-31 21:18:05 -03:00
oobabooga | 52065ae4cd | Add repetition_penalty | 2023-03-31 19:01:34 -03:00