| Author | Commit | Message | Date |
|---|---|---|---|
| cal066 | cc632c3f33 | AutoAWQ: initial support (#3999) | 2023-10-05 13:19:18 -03:00 |
| oobabooga | ae4ba3007f | Add grammar to transformers and _HF loaders (#4091) | 2023-10-05 10:01:36 -03:00 |
| oobabooga | b6fe6acf88 | Add threads_batch parameter | 2023-10-01 21:28:00 -07:00 |
| jllllll | 41a2de96e5 | Bump llama-cpp-python to 0.2.11 | 2023-10-01 18:08:10 -05:00 |
| oobabooga | 56b5a4af74 | exllamav2 typical_p | 2023-09-28 20:10:12 -07:00 |
| StoyanStAtanasov | 7e6ff8d1f0 | Enable NUMA feature for llama_cpp_python (#4040) | 2023-09-26 22:05:00 -03:00 |
| oobabooga | d0d221df49 | Add --use_fast option (closes #3741) | 2023-09-25 12:19:43 -07:00 |
| oobabooga | 36c38d7561 | Add disable_exllama to Transformers loader (for GPTQ LoRA training) | 2023-09-24 20:03:11 -07:00 |
| oobabooga | 08cf150c0c | Add a grammar editor to the UI (#4061) | 2023-09-24 18:05:24 -03:00 |
| oobabooga | eb0b7c1053 | Fix a minor UI bug | 2023-09-24 07:17:33 -07:00 |
| oobabooga | b227e65d86 | Add grammar to llama.cpp loader (closes #4019) | 2023-09-24 07:10:45 -07:00 |
| saltacc | f01b9aa71f | Add customizable ban tokens (#3899) | 2023-09-15 18:27:27 -03:00 |
| oobabooga | 2f935547c8 | Minor changes | 2023-09-12 15:05:21 -07:00 |
| oobabooga | 18e6b275f3 | Add alpha_value/compress_pos_emb to ExLlama-v2 | 2023-09-12 15:02:47 -07:00 |
| oobabooga | c2a309f56e | Add ExLlamaV2 and ExLlamav2_HF loaders (#3881) | 2023-09-12 14:33:07 -03:00 |
| oobabooga | ed86878f02 | Remove GGML support | 2023-09-11 07:44:00 -07:00 |
| Ravindra Marella | e4c3e1bdd2 | Fix ctransformers model unload (#3711). Add missing comma in model types list. Fixes marella/ctransformers#111 | 2023-08-27 10:53:48 -03:00 |
| oobabooga | 52ab2a6b9e | Add rope_freq_base parameter for CodeLlama | 2023-08-25 06:55:15 -07:00 |
| oobabooga | 3320accfdc | Add CFG to llamacpp_HF (second attempt) (#3678) | 2023-08-24 20:32:21 -03:00 |
| oobabooga | d6934bc7bc | Implement CFG for ExLlama_HF (#3666) | 2023-08-24 16:27:36 -03:00 |
| cal066 | e042bf8624 | ctransformers: add mlock and no-mmap options (#3649) | 2023-08-22 16:51:34 -03:00 |
| oobabooga | 7cba000421 | Bump llama-cpp-python, +tensor_split by @shouyiwang, +mul_mat_q (#3610) | 2023-08-18 12:03:34 -03:00 |
| cal066 | 991bb57e43 | ctransformers: Fix up model_type name consistency (#3567) | 2023-08-14 15:17:24 -03:00 |
| oobabooga | ccfc02a28d | Add the --disable_exllama option for AutoGPTQ (#3545 from clefever/disable-exllama) | 2023-08-14 15:15:55 -03:00 |
| Eve | 66c04c304d | Various ctransformers fixes (#3556). Co-authored-by: cal066 | 2023-08-13 23:09:03 -03:00 |
| cal066 | bf70c19603 | ctransformers: move thread and seed parameters (#3543) | 2023-08-13 00:04:03 -03:00 |
| Chris Lefever | 0230fa4e9c | Add the --disable_exllama option for AutoGPTQ | 2023-08-12 02:26:58 -04:00 |
| oobabooga | 2f918ccf7c | Remove unused parameter | 2023-08-11 11:15:22 -07:00 |
| oobabooga | 28c8df337b | Add repetition_penalty_range to ctransformers | 2023-08-11 11:04:19 -07:00 |
| cal066 | 7a4fcee069 | Add ctransformers support (#3313). Co-authored-by: cal066, oobabooga, randoentity | 2023-08-11 14:41:33 -03:00 |
| oobabooga | d8fb506aff | Add RoPE scaling support for transformers (including dynamic NTK). See https://github.com/huggingface/transformers/pull/24653 | 2023-08-08 21:25:48 -07:00 |
| oobabooga | 0af10ab49b | Add Classifier Free Guidance (CFG) for Transformers/ExLlama (#3325) | 2023-08-06 17:22:48 -03:00 |
| oobabooga | 87dab03dc0 | Add the --cpu option for llama.cpp to prevent CUDA from being used (#3432) | 2023-08-03 11:00:36 -03:00 |
| oobabooga | 32a2bbee4a | Implement auto_max_new_tokens for ExLlama | 2023-08-02 11:03:56 -07:00 |
| oobabooga | e931844fe2 | Add auto_max_new_tokens parameter (#3419) | 2023-08-02 14:52:20 -03:00 |
| oobabooga | 84297d05c4 | Add a "Filter by loader" menu to the Parameters tab | 2023-07-31 19:09:02 -07:00 |
| oobabooga | b17893a58f | Revert "Add tensor split support for llama.cpp (#3171)". This reverts commit 031fe7225e | 2023-07-26 07:06:01 -07:00 |
| Shouyi | 031fe7225e | Add tensor split support for llama.cpp (#3171) | 2023-07-25 18:59:26 -03:00 |
| oobabooga | a07d070b6c | Add llama-2-70b GGML support (#3285) | 2023-07-24 16:37:03 -03:00 |
| randoentity | a69955377a | [GGML] Support for customizable RoPE (#3083). Co-authored-by: oobabooga | 2023-07-17 22:32:37 -03:00 |
| oobabooga | 5e3f7e00a9 | Create llamacpp_HF loader (#3062) | 2023-07-16 02:21:13 -03:00 |
| oobabooga | e202190c4f | lint | 2023-07-12 11:33:25 -07:00 |
| Gabriel Pena | eedb3bf023 | Add low vram mode on llama cpp (#3076) | 2023-07-12 11:05:13 -03:00 |
| Panchovix | 10c8c197bf | Add Support for Static NTK RoPE scaling for exllama/exllama_hf (#2955) | 2023-07-04 01:13:16 -03:00 |
| oobabooga | c52290de50 | ExLlama with long context (#2875) | 2023-06-25 22:49:26 -03:00 |
| oobabooga | 3ae9af01aa | Add --no_use_cuda_fp16 param for AutoGPTQ | 2023-06-23 12:22:56 -03:00 |
| LarryVRH | 580c1ee748 | Implement a demo HF wrapper for exllama to utilize existing HF transformers decoding (#2777) | 2023-06-21 15:31:42 -03:00 |
| oobabooga | 5f392122fd | Add gpu_split param to ExLlama. Adapted from code created by Ph0rk0z. Thank you Ph0rk0z | 2023-06-16 20:49:36 -03:00 |
| oobabooga | 9f40032d32 | Add ExLlama support (#2444) | 2023-06-16 20:35:38 -03:00 |
| oobabooga | dea43685b0 | Add some clarifications | 2023-06-16 19:10:53 -03:00 |
| oobabooga | 7ef6a50e84 | Reorganize model loading UI completely (#2720) | 2023-06-16 19:00:37 -03:00 |