Commit Graph

572 Commits

Author SHA1 Message Date
oobabooga
f54256e348 Rename no_mmap to no-mmap 2023-05-03 09:50:31 -03:00
practicaldreamer
e3968f7dd0
Fix Training Pad Token (#1678)
Currently padding with the character "0" instead of the token id 0 (<unk> in the case of llama)
2023-05-02 23:16:08 -03:00
Wojtab
80c2f25131
LLaVA: small fixes (#1664)
* change multimodal projector to the correct one

* remove reference to custom stopping strings from readme

* fix stopping strings if tokenizer extension adds/removes tokens

* add API example

* LLaVA 7B just dropped, add to readme that there is no support for it currently
2023-05-02 23:12:22 -03:00
oobabooga
4e09df4034 Only show extension in UI if it has an ui() function 2023-05-02 19:20:02 -03:00
Ahmed Said
fbcd32988e
added no_mmap & mlock parameters to llama.cpp and removed llamacpp_model_alternative (#1649)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-05-02 18:25:28 -03:00
Carl Kenner
2f1a2846d1
Verbose should always print special tokens in input (#1707) 2023-05-02 01:24:56 -03:00
Alex "mcmonkey" Goodwin
0df0b2d0f9
optimize stopping strings processing (#1625) 2023-05-02 01:21:54 -03:00
oobabooga
c83210c460 Move the rstrips 2023-04-26 17:17:22 -03:00
oobabooga
1d8b8222e9 Revert #1579, apply the proper fix
Apparently models dislike trailing spaces.
2023-04-26 16:47:50 -03:00
oobabooga
9c2e7c0fab Fix path on models.py 2023-04-26 03:29:09 -03:00
oobabooga
a777c058af
Precise prompts for instruct mode 2023-04-26 03:21:53 -03:00
oobabooga
a8409426d7
Fix bug in models.py 2023-04-26 01:55:40 -03:00
oobabooga
f642135517 Make universal tokenizer, xformers, sdp-attention apply to monkey patch 2023-04-25 23:18:11 -03:00
oobabooga
f39c99fa14 Load more than one LoRA with --lora, fix a bug 2023-04-25 22:58:48 -03:00
oobabooga
15940e762e Fix missing initial space for LlamaTokenizer 2023-04-25 22:47:23 -03:00
Vincent Brouwers
92cdb4f22b
Seq2Seq support (including FLAN-T5) (#1535)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-04-25 22:39:04 -03:00
Alex "mcmonkey" Goodwin
312cb7dda6
LoRA trainer improvements part 5 (#1546)
* full dynamic model type support on modern peft

* remove shuffle option
2023-04-25 21:27:30 -03:00
oobabooga
9b272bc8e5 Monkey patch fixes 2023-04-25 21:20:26 -03:00
oobabooga
da812600f4 Apply settings regardless of setup() function 2023-04-25 01:16:23 -03:00
da3dsoul
ebca3f86d5
Apply the settings for extensions after import, but before setup() (#1484) 2023-04-25 00:23:11 -03:00
oobabooga
b0ce750d4e Add spaces 2023-04-25 00:10:21 -03:00
oobabooga
1a0c12c6f2
Refactor text-generation.py a bit 2023-04-24 19:24:12 -03:00
oobabooga
2f4f124132 Remove obsolete function 2023-04-24 13:27:24 -03:00
oobabooga
b6af2e56a2 Add --character flag, add character to settings.json 2023-04-24 13:19:42 -03:00
oobabooga
0c32ae27cc Only load the default history if it's empty 2023-04-24 11:50:51 -03:00
eiery
78d1977ebf
add n_batch support for llama.cpp (#1115) 2023-04-24 03:46:18 -03:00
oobabooga
b1ee674d75 Make interface state (mostly) persistent on page reload 2023-04-24 03:05:47 -03:00
oobabooga
435f8cc0e7
Simplify some chat functions 2023-04-24 00:47:40 -03:00
Wojtab
12212cf6be
LLaVA support (#1487) 2023-04-23 20:32:22 -03:00
Andy Salerno
654933c634
New universal API with streaming/blocking endpoints (#990)
Previous title: Add api_streaming extension and update api-example-stream to use it

* Merge with latest main

* Add parameter capturing encoder_repetition_penalty

* Change some defaults, minor fixes

* Add --api, --public-api flags

* remove unneeded/broken comment from blocking API startup. The comment is already correctly emitted in try_start_cloudflared by calling the lambda we pass in.

* Update on_start message for blocking_api, it should say 'non-streaming' and not 'streaming'

* Update the API examples

* Change a comment

* Update README

* Remove the gradio API

* Remove unused import

* Minor change

* Remove unused import

---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-04-23 15:52:43 -03:00
Alex "mcmonkey" Goodwin
459e725af9
Lora trainer docs (#1493) 2023-04-23 12:54:41 -03:00
oobabooga
c0b5c09860 Minor change 2023-04-22 15:15:31 -03:00
oobabooga
fcb594b90e Don't require llama.cpp models to be placed in subfolders 2023-04-22 14:56:48 -03:00
oobabooga
7438f4f6ba Change GPTQ triton default settings 2023-04-22 12:27:30 -03:00
USBhost
e1aa9d5173
Support upstream GPTQ once again. (#1451) 2023-04-21 12:43:56 -03:00
oobabooga
eddd016449 Minor deletion 2023-04-21 12:41:27 -03:00
oobabooga
d46b9b7c50 Fix evaluate comment saving 2023-04-21 12:34:08 -03:00
oobabooga
5e023ae64d Change dropdown menu highlight color 2023-04-21 02:47:18 -03:00
oobabooga
c4f4f41389
Add an "Evaluate" tab to calculate the perplexities of models (#1322) 2023-04-21 00:20:33 -03:00
oobabooga
7bb9036ac9 Add universal LLaMA tokenizer support 2023-04-19 21:23:51 -03:00
Alex "mcmonkey" Goodwin
ee30625cd1
4-Bit LoRA training + several new training options and fixes 2023-04-19 19:39:03 -03:00
oobabooga
702fe92d42 Increase truncation_length_max value 2023-04-19 17:35:38 -03:00
oobabooga
9d9ae62938 Fix stopping strings in the gradio API 2023-04-19 13:52:21 -03:00
oobabooga
649e4017a5 Style improvements 2023-04-19 00:36:28 -03:00
oobabooga
000f65a2ef
Delete unused file 2023-04-18 04:01:14 -03:00
oobabooga
36f7c022f2
Rename a file 2023-04-18 01:38:33 -03:00
oobabooga
b069bb1f2e
Update monkey_patch_gradio.py 2023-04-18 01:32:42 -03:00
oobabooga
00186f76f4
Monkey patch gradio to prevent it from calling home 2023-04-18 01:13:16 -03:00
Tynan Burke
6a810b16b2
typo in training.py (#1329) 2023-04-17 21:40:46 -03:00
oobabooga
ac2973ffc6 Add a warning for --share 2023-04-17 19:34:28 -03:00
oobabooga
c544386824 Reset your name when choosing a character 2023-04-17 13:56:40 -03:00
oobabooga
c3dc348d1c Don't show 'None' in the LoRA list 2023-04-17 13:52:23 -03:00
oobabooga
89bc540557 Update README 2023-04-17 10:55:35 -03:00
catalpaaa
07de7d0426
Load llamacpp before quantized model (#1307) 2023-04-17 10:47:26 -03:00
sgsdxzy
b57ffc2ec9
Update to support GPTQ triton commit c90adef (#1229) 2023-04-17 01:11:18 -03:00
oobabooga
39099663a0
Add 4-bit LoRA support (#1200) 2023-04-16 23:26:52 -03:00
oobabooga
46a8aa8c09 Readability 2023-04-16 21:26:19 -03:00
Forkoz
c6fe1ced01
Add ChatGLM support (#1256)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-04-16 19:15:03 -03:00
oobabooga
6a03ad0824 Remove fix_newlines() calls from chat.py 2023-04-16 18:25:44 -03:00
oobabooga
5342f72968 Properly handle blockquote blocks 2023-04-16 18:00:12 -03:00
oobabooga
27f3a78834 Better detect when no model is loaded 2023-04-16 17:35:54 -03:00
oobabooga
c8ad960018 Add defaults to the gradio API 2023-04-16 17:33:28 -03:00
oobabooga
beb95f5fe2 Add a style for the "chat" mode 2023-04-16 16:44:50 -03:00
oobabooga
b937c9d8c2
Add skip_special_tokens checkbox for Dolly model (#1218) 2023-04-16 14:24:49 -03:00
oobabooga
b705b4210c Minor changes to training.py 2023-04-16 03:08:37 -03:00
oobabooga
5c513a5f5c Make training.py more readable 2023-04-16 02:46:27 -03:00
Alex "mcmonkey" Goodwin
a3eec62b50
Lora trainer improvements part 3 (#1098)
* add support for other model types

dependent on future-peft-changes but with fallback to function now

* use encoding=utf8 for training format

* make shuffling optional

and describe dropout a bit more

* add eval_steps to control evaluation

* make callbacks not depend on globals

* make save steps controllable

* placeholder of initial loading-existing-model support

and var name cleanup

* save/load parameters

* last bit of cleanup

* remove `gptq_bits` ref as main branch removed that setting

* add higher_rank_limit option

2048 is basically unreachable due to VRAM, but I trained at 1536 with batch size = 1 on a 7B model.
Note that it's in the do_train input just to save as a parameter

* fix math on save_steps
2023-04-16 02:35:13 -03:00
kernyan
ac19d5101f
revert incorrect eos_token_id change from #814 (#1261)
- fixes #1054
2023-04-16 01:47:01 -03:00
oobabooga
a2127239de Fix a bug 2023-04-16 01:41:37 -03:00
oobabooga
9d3c6d2dc3 Fix a bug 2023-04-16 01:40:47 -03:00
Mikel Bober-Irizar
16a3a5b039
Merge pull request from GHSA-hv5m-3rp9-xcpf
* Remove eval of API input

* Remove unnecessary eval/exec for security

* Use ast.literal_eval

* Use ast.literal_eval

---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-04-16 01:36:50 -03:00
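
Note on the entry above: the bullets describe replacing `eval`/`exec` on API input with `ast.literal_eval`. The following is a minimal sketch of why that matters, not the repository's actual code; the helper name `parse_param` and the sample inputs are assumptions made for illustration. `ast.literal_eval` accepts only Python literals (strings, numbers, tuples, lists, dicts, sets, booleans, `None`) and raises on anything else, whereas `eval` would execute an arbitrary expression.

```python
import ast


def parse_param(raw: str):
    """Parse an untrusted string as a Python literal.

    Hypothetical helper for illustration; not taken from the repository.
    """
    try:
        # literal_eval never calls functions or resolves names,
        # so a payload such as "__import__('os')..." is not executed.
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError, TypeError):
        # Normalize every rejection into a single error type for callers.
        raise ValueError(f"not a plain literal: {raw!r}") from None


# A plain literal parses into the corresponding Python object.
print(parse_param("[1, 2, {'temperature': 0.7}]"))

# An expression that eval() would happily run is rejected instead.
try:
    parse_param("__import__('os').getcwd()")
except ValueError as exc:
    print("rejected:", exc)
```

Very large or deeply nested inputs can still exhaust memory or the recursion limit in `ast.literal_eval`, so a size limit on untrusted input remains advisable.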
oobabooga
d2ea925fa5 Bump llama-cpp-python to use LlamaCache 2023-04-16 00:53:40 -03:00
oobabooga
ac189011cb Add "Save current settings for this model" button 2023-04-15 12:54:02 -03:00
oobabooga
abef355ed0 Remove deprecated flag 2023-04-15 01:21:19 -03:00
oobabooga
c3aa79118e Minor generate_chat_prompt simplification 2023-04-14 23:02:08 -03:00
oobabooga
3a337cfded Use argparse defaults 2023-04-14 15:35:06 -03:00
Alex "mcmonkey" Goodwin
64e3b44e0f
initial multi-lora support (#1103)
---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-04-14 14:52:06 -03:00
oobabooga
1901d238e1 Minor change to API code 2023-04-14 12:11:47 -03:00
oobabooga
8e31f2bad4
Automatically set wbits/groupsize/instruct based on model name (#1167) 2023-04-14 11:07:28 -03:00
v0xie
9d66957207
Add --listen-host launch option (#1122) 2023-04-13 21:35:08 -03:00
oobabooga
a75e02de4d Simplify GPTQ_loader.py 2023-04-13 12:13:07 -03:00
oobabooga
ca293bb713 Show a warning if two quantized models are found 2023-04-13 12:04:27 -03:00
oobabooga
8b482b4127
Merge #1073 from sgsdxzy/triton
* Multi-GPU support for triton
* Better quantized model filename detection
2023-04-13 11:31:21 -03:00
oobabooga
fde6d06167 Prioritize names with the groupsize in them 2023-04-13 11:27:03 -03:00
oobabooga
f2bf1a2c9e Add some comments, remove obsolete code 2023-04-13 11:17:32 -03:00
Light
da74cd7c44 Generalized weight search path. 2023-04-13 21:43:32 +08:00
oobabooga
04866dc4fc Add a warning for when no model is loaded 2023-04-13 10:35:08 -03:00
Light
cf58058c33 Change warmup_autotune to a negative switch. 2023-04-13 20:59:49 +08:00
Light
15d5a043f2 Merge remote-tracking branch 'origin/main' into triton 2023-04-13 19:38:51 +08:00
oobabooga
7dfbe54f42 Add --model-menu option 2023-04-12 21:24:26 -03:00
oobabooga
388038fb8e Update settings-template.json 2023-04-12 18:30:43 -03:00
oobabooga
10e939c9b4 Merge branch 'main' of github.com:oobabooga/text-generation-webui 2023-04-12 17:21:59 -03:00
oobabooga
1566d8e344 Add model settings to the Models tab 2023-04-12 17:20:18 -03:00
Light
a405064ceb Better dispatch. 2023-04-13 01:48:17 +08:00
Light
f3591ccfa1 Keep minimal change. 2023-04-12 23:26:06 +08:00
Lukas
5ad92c940e
lora training fixes: (#970)
Fix wrong input format being picked
Fix crash when an entry in the dataset has an attribute of value None
2023-04-12 11:38:01 -03:00
oobabooga
80f4eabb2a Fix send_pictures extension 2023-04-12 10:27:06 -03:00
oobabooga
8265d45db8 Add send dummy message/reply buttons
Useful for starting a new reply.
2023-04-11 22:21:41 -03:00
oobabooga
37d52c96bc Fix Continue in chat mode 2023-04-11 21:46:17 -03:00
oobabooga
cacbcda208
Two new options: truncation length and ban eos token 2023-04-11 18:46:06 -03:00