oobabooga | c0b5c09860 | Minor change | 2023-04-22 15:15:31 -03:00
oobabooga | fcb594b90e | Don't require llama.cpp models to be placed in subfolders | 2023-04-22 14:56:48 -03:00
oobabooga | c4f4f41389 | Add an "Evaluate" tab to calculate the perplexities of models (#1322) | 2023-04-21 00:20:33 -03:00
oobabooga | 7bb9036ac9 | Add universal LLaMA tokenizer support | 2023-04-19 21:23:51 -03:00
catalpaaa | 07de7d0426 | Load llamacpp before quantized model (#1307) | 2023-04-17 10:47:26 -03:00
oobabooga | 39099663a0 | Add 4-bit LoRA support (#1200) | 2023-04-16 23:26:52 -03:00
Forkoz | c6fe1ced01 | Add ChatGLM support (#1256) | 2023-04-16 19:15:03 -03:00
    Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
oobabooga | ac189011cb | Add "Save current settings for this model" button | 2023-04-15 12:54:02 -03:00
oobabooga | cacbcda208 | Two new options: truncation length and ban eos token | 2023-04-11 18:46:06 -03:00
oobabooga | 1911504f82 | Minor bug fix | 2023-04-09 23:45:41 -03:00
oobabooga | dba2000d2b | Do things that I am not proud of | 2023-04-09 23:40:49 -03:00
MarkovInequality | 992663fa20 | Added xformers support to Llama (#950) | 2023-04-09 23:08:40 -03:00
oobabooga | a3085dba07 | Fix LlamaTokenizer eos_token (attempt) | 2023-04-09 21:19:39 -03:00
oobabooga | 0b458bf82d | Simplify a function | 2023-04-07 21:37:41 -03:00
Φφ | ffd102e5c0 | SD Api Pics extension, v.1.1 (#596) | 2023-04-07 21:36:04 -03:00
oobabooga | ea6e77df72 | Make the code more like PEP8 for readability (#862) | 2023-04-07 00:15:45 -03:00
oobabooga | 113f94b61e | Bump transformers (16-bit llama must be reconverted/redownloaded) | 2023-04-06 16:04:03 -03:00
oobabooga | 03cb44fc8c | Add new llama.cpp library (2048 context, temperature, etc now work) | 2023-04-06 13:12:14 -03:00
catalpaaa | 4ab679480e | allow quantized model to be loaded from model dir (#760) | 2023-04-04 23:19:38 -03:00
oobabooga | 3a47a602a3 | Detect ggml*.bin files automatically | 2023-03-31 17:18:21 -03:00
oobabooga | 4c27562157 | Minor changes | 2023-03-31 14:33:46 -03:00
Thomas Antony | 79fa2b6d7e | Add support for alpaca | 2023-03-30 11:23:04 +01:00
Thomas Antony | 7745faa7bb | Add llamacpp to models.py | 2023-03-30 11:22:37 +01:00
oobabooga | 1cb9246160 | Adapt to the new model names | 2023-03-29 21:47:36 -03:00
oobabooga | 53da672315 | Fix FlexGen | 2023-03-27 23:44:21 -03:00
oobabooga | ee95e55df6 | Fix RWKV tokenizer | 2023-03-27 23:42:29 -03:00
oobabooga | fde92048af | Merge branch 'main' into catalpaaa-lora-and-model-dir | 2023-03-27 23:16:44 -03:00
oobabooga | 49c10c5570 | Add support for the latest GPTQ models with group-size (#530) | 2023-03-26 00:11:33 -03:00
    Warning: old 4-bit weights will not work anymore! See here how to get up-to-date weights: https://github.com/oobabooga/text-generation-webui/wiki/LLaMA-model#step-2-get-the-pre-converted-weights
catalpaaa | b37c54edcf | lora-dir, model-dir and login auth | 2023-03-24 17:30:18 -07:00
    Added lora-dir and model-dir arguments, plus a login auth argument that points to a file containing usernames and passwords in the format "u:pw,u:pw,..."
oobabooga | a6bf54739c | Revert models.py (accident) | 2023-03-24 19:56:45 -03:00
oobabooga | a80aa65986 | Update models.py | 2023-03-24 19:53:20 -03:00
oobabooga | ddb62470e9 | --no-cache and --gpu-memory in MiB for fine VRAM control | 2023-03-19 19:21:41 -03:00
oobabooga | e26763a510 | Minor changes | 2023-03-17 22:56:46 -03:00
Wojtek Kowaluk | 7994b580d5 | clean up duplicated code | 2023-03-18 02:27:26 +01:00
Wojtek Kowaluk | 30939e2aee | add mps support on apple silicon | 2023-03-18 00:56:23 +01:00
oobabooga | ee164d1821 | Don't split the layers in 8-bit mode by default | 2023-03-16 18:22:16 -03:00
oobabooga | e085cb4333 | Small changes | 2023-03-16 13:34:23 -03:00
awoo | 83cb20aad8 | Add support for --gpu-memory with --load-in-8bit | 2023-03-16 18:42:53 +03:00
oobabooga | 1c378965e1 | Remove unused imports | 2023-03-16 10:18:34 -03:00
oobabooga | 66256ac1dd | Make the "no GPU has been detected" message more descriptive | 2023-03-15 19:31:27 -03:00
oobabooga | 265ba384b7 | Rename a file, add deprecation warning for --load-in-4bit | 2023-03-14 07:56:31 -03:00
Ayanami Rei | 8778b756e6 | use updated load_quantized | 2023-03-13 22:11:40 +03:00
Ayanami Rei | e1c952c41c | make argument non case-sensitive | 2023-03-13 20:22:38 +03:00
Ayanami Rei | 3c9afd5ca3 | rename method | 2023-03-13 20:14:40 +03:00
Ayanami Rei | edbc61139f | use new quant loader | 2023-03-13 20:00:38 +03:00
oobabooga | 65dda28c9d | Rename --llama-bits to --gptq-bits | 2023-03-12 11:19:07 -03:00
oobabooga | fed3617f07 | Move LLaMA 4-bit into a separate file | 2023-03-12 11:12:34 -03:00
draff | 001e638b47 | Make it actually work | 2023-03-10 23:28:19 +00:00
draff | 804486214b | Re-implement --load-in-4bit and update --llama-bits arg description | 2023-03-10 23:21:01 +00:00
ItsLogic | 9ba8156a70 | remove unnecessary Path() | 2023-03-10 22:33:58 +00:00