oobabooga
|
9331ab4798
|
Read GGUF metadata (#3873)
|
2023-09-11 18:49:30 -03:00 |
|
oobabooga
|
ed86878f02
|
Remove GGML support
|
2023-09-11 07:44:00 -07:00 |
|
jllllll
|
4a999e3bcd
|
Use separate llama-cpp-python packages for GGML support
|
2023-08-26 10:40:08 -05:00 |
|
oobabooga
|
83640d6f43
|
Replace ggml occurrences with gguf
|
2023-08-26 01:06:59 -07:00 |
|
cal066
|
960980247f
|
ctransformers: gguf support (#3685)
|
2023-08-25 11:33:04 -03:00 |
|
oobabooga
|
52ab2a6b9e
|
Add rope_freq_base parameter for CodeLlama
|
2023-08-25 06:55:15 -07:00 |
|
cal066
|
7a4fcee069
|
Add ctransformers support (#3313)
---------
Co-authored-by: cal066 <cal066@users.noreply.github.com>
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
Co-authored-by: randoentity <137087500+randoentity@users.noreply.github.com>
|
2023-08-11 14:41:33 -03:00 |
|
oobabooga
|
d8fb506aff
|
Add RoPE scaling support for transformers (including dynamic NTK)
https://github.com/huggingface/transformers/pull/24653
|
2023-08-08 21:25:48 -07:00 |
|
oobabooga
|
65aa11890f
|
Refactor everything (#3481)
|
2023-08-06 21:49:27 -03:00 |
|
oobabooga
|
75c2dd38cf
|
Remove flexgen support
|
2023-07-25 15:15:29 -07:00 |
|
appe233
|
89e0d15cf5
|
Use 'torch.backends.mps.is_available' to check if mps is supported (#3164)
|
2023-07-17 21:27:18 -03:00 |
|
oobabooga
|
5e3f7e00a9
|
Create llamacpp_HF loader (#3062)
|
2023-07-16 02:21:13 -03:00 |
|
oobabooga
|
e202190c4f
|
lint
|
2023-07-12 11:33:25 -07:00 |
|
FartyPants
|
9b55d3a9f9
|
More robust and less error-prone training (#3058)
|
2023-07-12 15:29:43 -03:00 |
|
oobabooga
|
5ac4e4da8b
|
Make --model work with argument like models/folder_name
|
2023-07-08 10:22:54 -07:00 |
|
Xiaojian "JJ" Deng
|
ff45317032
|
Update models.py (#3020)
Hopefully fixed error with "ValueError: Tokenizer class GPTNeoXTokenizer does not exist or is not currently imported."
|
2023-07-05 21:40:43 -03:00 |
|
oobabooga
|
8705eba830
|
Remove universal llama tokenizer support
Instead replace it with a warning if the tokenizer files look off
|
2023-07-04 19:43:19 -07:00 |
|
FartyPants
|
33f56fd41d
|
Update models.py to clear LORA names after unload (#2951)
|
2023-07-03 17:39:06 -03:00 |
|
oobabooga
|
f0fcd1f697
|
Sort some imports
|
2023-06-25 01:44:36 -03:00 |
|
Panchovix
|
5646690769
|
Fix some models not loading on exllama_hf (#2835)
|
2023-06-23 11:31:02 -03:00 |
|
LarryVRH
|
580c1ee748
|
Implement a demo HF wrapper for exllama to utilize existing HF transformers decoding. (#2777)
|
2023-06-21 15:31:42 -03:00 |
|
ThisIsPIRI
|
def3b69002
|
Fix loading condition for universal llama tokenizer (#2753)
|
2023-06-18 18:14:06 -03:00 |
|
oobabooga
|
9f40032d32
|
Add ExLlama support (#2444)
|
2023-06-16 20:35:38 -03:00 |
|
oobabooga
|
7ef6a50e84
|
Reorganize model loading UI completely (#2720)
|
2023-06-16 19:00:37 -03:00 |
|
oobabooga
|
00b94847da
|
Remove softprompt support
|
2023-06-06 07:42:23 -03:00 |
|
oobabooga
|
f276d88546
|
Use AutoGPTQ by default for GPTQ models
|
2023-06-05 15:41:48 -03:00 |
|
oobabooga
|
3578dd3611
|
Change a warning message
|
2023-05-29 22:40:54 -03:00 |
|
Luis Lopez
|
9e7204bef4
|
Add tail-free and top-a sampling (#2357)
|
2023-05-29 21:40:01 -03:00 |
|
Forkoz
|
60ae80cf28
|
Fix hang in tokenizer for AutoGPTQ llama models. (#2399)
|
2023-05-28 23:10:10 -03:00 |
|
oobabooga
|
361451ba60
|
Add --load-in-4bit parameter (#2320)
|
2023-05-25 01:14:13 -03:00 |
|
oobabooga
|
cd3618d7fb
|
Add support for RWKV in Hugging Face format
|
2023-05-23 02:07:28 -03:00 |
|
oobabooga
|
e116d31180
|
Prevent unwanted log messages from modules
|
2023-05-21 22:42:34 -03:00 |
|
oobabooga
|
05593a7834
|
Minor bug fix
|
2023-05-20 23:22:36 -03:00 |
|
oobabooga
|
9d5025f531
|
Improve error handling while loading GPTQ models
|
2023-05-19 11:20:08 -03:00 |
|
oobabooga
|
ef10ffc6b4
|
Add various checks to model loading functions
|
2023-05-17 16:14:54 -03:00 |
|
oobabooga
|
abd361b3a0
|
Minor change
|
2023-05-17 11:33:43 -03:00 |
|
oobabooga
|
21ecc3701e
|
Avoid a name conflict
|
2023-05-17 11:23:13 -03:00 |
|
oobabooga
|
1a8151a2b6
|
Add AutoGPTQ support (basic) (#2132)
|
2023-05-17 11:12:12 -03:00 |
|
oobabooga
|
7584d46c29
|
Refactor models.py (#2113)
|
2023-05-16 19:52:22 -03:00 |
|
oobabooga
|
4e66f68115
|
Create get_max_memory_dict() function
|
2023-05-15 19:38:27 -03:00 |
|
oobabooga
|
2eeb27659d
|
Fix bug in --cpu-memory
|
2023-05-12 06:17:07 -03:00 |
|
oobabooga
|
3316e33d14
|
Remove unused code
|
2023-05-10 11:59:59 -03:00 |
|
oobabooga
|
3913155c1f
|
Style improvements (#1957)
|
2023-05-09 22:49:39 -03:00 |
|
Wesley Pyburn
|
a2b25322f0
|
Fix trust_remote_code in wrong location (#1953)
|
2023-05-09 19:22:10 -03:00 |
|
EgrorBs
|
d3ea70f453
|
More trust_remote_code=trust_remote_code (#1899)
|
2023-05-07 23:48:20 -03:00 |
|
oobabooga
|
97a6a50d98
|
Use oasst tokenizer instead of universal tokenizer
|
2023-05-04 15:55:39 -03:00 |
|
Mylo
|
bd531c2dc2
|
Make --trust-remote-code work for all models (#1772)
|
2023-05-04 02:01:28 -03:00 |
|
oobabooga
|
9c77ab4fc2
|
Improve some warnings
|
2023-05-03 22:06:46 -03:00 |
|
oobabooga
|
95d04d6a8d
|
Better warning messages
|
2023-05-03 21:43:17 -03:00 |
|
Ahmed Said
|
fbcd32988e
|
added no_mmap & mlock parameters to llama.cpp and removed llamacpp_model_alternative (#1649)
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
|
2023-05-02 18:25:28 -03:00 |
|
oobabooga
|
9c2e7c0fab
|
Fix path on models.py
|
2023-04-26 03:29:09 -03:00 |
|
oobabooga
|
a8409426d7
|
Fix bug in models.py
|
2023-04-26 01:55:40 -03:00 |
|
oobabooga
|
f642135517
|
Make universal tokenizer, xformers, sdp-attention apply to monkey patch
|
2023-04-25 23:18:11 -03:00 |
|
Vincent Brouwers
|
92cdb4f22b
|
Seq2Seq support (including FLAN-T5) (#1535)
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
|
2023-04-25 22:39:04 -03:00 |
|
Wojtab
|
12212cf6be
|
LLaVA support (#1487)
|
2023-04-23 20:32:22 -03:00 |
|
oobabooga
|
c0b5c09860
|
Minor change
|
2023-04-22 15:15:31 -03:00 |
|
oobabooga
|
fcb594b90e
|
Don't require llama.cpp models to be placed in subfolders
|
2023-04-22 14:56:48 -03:00 |
|
oobabooga
|
c4f4f41389
|
Add an "Evaluate" tab to calculate the perplexities of models (#1322)
|
2023-04-21 00:20:33 -03:00 |
|
oobabooga
|
7bb9036ac9
|
Add universal LLaMA tokenizer support
|
2023-04-19 21:23:51 -03:00 |
|
catalpaaa
|
07de7d0426
|
Load llamacpp before quantized model (#1307)
|
2023-04-17 10:47:26 -03:00 |
|
oobabooga
|
39099663a0
|
Add 4-bit LoRA support (#1200)
|
2023-04-16 23:26:52 -03:00 |
|
Forkoz
|
c6fe1ced01
|
Add ChatGLM support (#1256)
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
|
2023-04-16 19:15:03 -03:00 |
|
oobabooga
|
ac189011cb
|
Add "Save current settings for this model" button
|
2023-04-15 12:54:02 -03:00 |
|
oobabooga
|
cacbcda208
|
Two new options: truncation length and ban eos token
|
2023-04-11 18:46:06 -03:00 |
|
oobabooga
|
1911504f82
|
Minor bug fix
|
2023-04-09 23:45:41 -03:00 |
|
oobabooga
|
dba2000d2b
|
Do things that I am not proud of
|
2023-04-09 23:40:49 -03:00 |
|
MarkovInequality
|
992663fa20
|
Added xformers support to Llama (#950)
|
2023-04-09 23:08:40 -03:00 |
|
oobabooga
|
a3085dba07
|
Fix LlamaTokenizer eos_token (attempt)
|
2023-04-09 21:19:39 -03:00 |
|
oobabooga
|
0b458bf82d
|
Simplify a function
|
2023-04-07 21:37:41 -03:00 |
|
Φφ
|
ffd102e5c0
|
SD Api Pics extension, v.1.1 (#596)
|
2023-04-07 21:36:04 -03:00 |
|
oobabooga
|
ea6e77df72
|
Make the code more like PEP8 for readability (#862)
|
2023-04-07 00:15:45 -03:00 |
|
oobabooga
|
113f94b61e
|
Bump transformers (16-bit llama must be reconverted/redownloaded)
|
2023-04-06 16:04:03 -03:00 |
|
oobabooga
|
03cb44fc8c
|
Add new llama.cpp library (2048 context, temperature, etc now work)
|
2023-04-06 13:12:14 -03:00 |
|
catalpaaa
|
4ab679480e
|
allow quantized model to be loaded from model dir (#760)
|
2023-04-04 23:19:38 -03:00 |
|
oobabooga
|
3a47a602a3
|
Detect ggml*.bin files automatically
|
2023-03-31 17:18:21 -03:00 |
|
oobabooga
|
4c27562157
|
Minor changes
|
2023-03-31 14:33:46 -03:00 |
|
Thomas Antony
|
79fa2b6d7e
|
Add support for alpaca
|
2023-03-30 11:23:04 +01:00 |
|
Thomas Antony
|
7745faa7bb
|
Add llamacpp to models.py
|
2023-03-30 11:22:37 +01:00 |
|
oobabooga
|
1cb9246160
|
Adapt to the new model names
|
2023-03-29 21:47:36 -03:00 |
|
oobabooga
|
53da672315
|
Fix FlexGen
|
2023-03-27 23:44:21 -03:00 |
|
oobabooga
|
ee95e55df6
|
Fix RWKV tokenizer
|
2023-03-27 23:42:29 -03:00 |
|
oobabooga
|
fde92048af
|
Merge branch 'main' into catalpaaa-lora-and-model-dir
|
2023-03-27 23:16:44 -03:00 |
|
oobabooga
|
49c10c5570
|
Add support for the latest GPTQ models with group-size (#530)
**Warning: old 4-bit weights will not work anymore!**
See here for how to get up-to-date weights: https://github.com/oobabooga/text-generation-webui/wiki/LLaMA-model#step-2-get-the-pre-converted-weights
|
2023-03-26 00:11:33 -03:00 |
|
catalpaaa
|
b37c54edcf
|
lora-dir, model-dir and login auth
Added lora-dir, model-dir, and login auth arguments. The login auth argument points to a file that contains usernames and passwords in the format "u:pw,u:pw,..."
|
2023-03-24 17:30:18 -07:00 |
|
oobabooga
|
a6bf54739c
|
Revert models.py (accident)
|
2023-03-24 19:56:45 -03:00 |
|
oobabooga
|
a80aa65986
|
Update models.py
|
2023-03-24 19:53:20 -03:00 |
|
oobabooga
|
ddb62470e9
|
--no-cache and --gpu-memory in MiB for fine VRAM control
|
2023-03-19 19:21:41 -03:00 |
|
oobabooga
|
e26763a510
|
Minor changes
|
2023-03-17 22:56:46 -03:00 |
|
Wojtek Kowaluk
|
7994b580d5
|
clean up duplicated code
|
2023-03-18 02:27:26 +01:00 |
|
Wojtek Kowaluk
|
30939e2aee
|
add mps support on apple silicon
|
2023-03-18 00:56:23 +01:00 |
|
oobabooga
|
ee164d1821
|
Don't split the layers in 8-bit mode by default
|
2023-03-16 18:22:16 -03:00 |
|
oobabooga
|
e085cb4333
|
Small changes
|
2023-03-16 13:34:23 -03:00 |
|
awoo
|
83cb20aad8
|
Add support for --gpu-memory with --load-in-8bit
|
2023-03-16 18:42:53 +03:00 |
|
oobabooga
|
1c378965e1
|
Remove unused imports
|
2023-03-16 10:18:34 -03:00 |
|
oobabooga
|
66256ac1dd
|
Make the "no GPU has been detected" message more descriptive
|
2023-03-15 19:31:27 -03:00 |
|
oobabooga
|
265ba384b7
|
Rename a file, add deprecation warning for --load-in-4bit
|
2023-03-14 07:56:31 -03:00 |
|
Ayanami Rei
|
8778b756e6
|
use updated load_quantized
|
2023-03-13 22:11:40 +03:00 |
|
Ayanami Rei
|
e1c952c41c
|
make argument non case-sensitive
|
2023-03-13 20:22:38 +03:00 |
|
Ayanami Rei
|
3c9afd5ca3
|
rename method
|
2023-03-13 20:14:40 +03:00 |
|
Ayanami Rei
|
edbc61139f
|
use new quant loader
|
2023-03-13 20:00:38 +03:00 |
|