Andrei
e657dd342d
Add in-memory cache support for llama.cpp ( #1936 )
2023-05-15 20:19:55 -03:00
Jakub Strnad
0227e738ed
Add settings UI for llama.cpp and fixed reloading of llama.cpp models ( #2087 )
2023-05-15 19:51:23 -03:00
oobabooga
c07215cc08
Improve the default Assistant character
2023-05-15 19:39:08 -03:00
oobabooga
4e66f68115
Create get_max_memory_dict() function
2023-05-15 19:38:27 -03:00
AlphaAtlas
071f0776ad
Add llama.cpp GPU offload option ( #2060 )
2023-05-14 22:58:11 -03:00
oobabooga
3b886f9c9f
Add chat-instruct mode ( #2049 )
2023-05-14 10:43:55 -03:00
oobabooga
df37ba5256
Update impersonate_wrapper
2023-05-12 12:59:48 -03:00
oobabooga
e283ddc559
Change how spaces are handled in continue/generation attempts
2023-05-12 12:50:29 -03:00
oobabooga
2eeb27659d
Fix bug in --cpu-memory
2023-05-12 06:17:07 -03:00
oobabooga
5eaa914e1b
Fix settings.json being ignored because of config.yaml
2023-05-12 06:09:45 -03:00
oobabooga
71693161eb
Better handle spaces in LlamaTokenizer
2023-05-11 17:55:50 -03:00
oobabooga
7221d1389a
Fix a bug
2023-05-11 17:11:10 -03:00
oobabooga
0d36c18f5d
Always return only the new tokens in generation functions
2023-05-11 17:07:20 -03:00
oobabooga
394bb253db
Syntax improvement
2023-05-11 16:27:50 -03:00
oobabooga
f7dbddfff5
Add a variable for tts extensions to use
2023-05-11 16:12:46 -03:00
oobabooga
638c6a65a2
Refactor chat functions ( #2003 )
2023-05-11 15:37:04 -03:00
oobabooga
b7a589afc8
Improve the Metharme prompt
2023-05-10 16:09:32 -03:00
oobabooga
b01c4884cb
Better stopping strings for instruct mode
2023-05-10 14:22:38 -03:00
oobabooga
6a4783afc7
Add markdown table rendering
2023-05-10 13:41:23 -03:00
oobabooga
3316e33d14
Remove unused code
2023-05-10 11:59:59 -03:00
Alexander Dibrov
ec14d9b725
Fix custom_generate_chat_prompt ( #1965 )
2023-05-10 11:29:59 -03:00
oobabooga
32481ec4d6
Fix prompt order in the dropdown
2023-05-10 02:24:09 -03:00
oobabooga
dfd9ba3e90
Remove duplicate code
2023-05-10 02:07:22 -03:00
oobabooga
bdf1274b5d
Remove duplicate code
2023-05-10 01:34:04 -03:00
oobabooga
3913155c1f
Style improvements ( #1957 )
2023-05-09 22:49:39 -03:00
minipasila
334486f527
Added instruct-following template for Metharme ( #1679 )
2023-05-09 22:29:22 -03:00
Carl Kenner
814f754451
Support for MPT, INCITE, WizardLM, StableLM, Galactica, Vicuna, Guanaco, and Baize instruction following ( #1596 )
2023-05-09 20:37:31 -03:00
Wojtab
e9e75a9ec7
Generalize multimodality (llava/minigpt4 7b and 13b now supported) ( #1741 )
2023-05-09 20:18:02 -03:00
Wesley Pyburn
a2b25322f0
Fix trust_remote_code in wrong location ( #1953 )
2023-05-09 19:22:10 -03:00
LaaZa
218bd64bd1
Add the option to not automatically load the selected model ( #1762 )
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-05-09 15:52:35 -03:00
Maks
cf6caf1830
Make the RWKV model cache the RNN state between messages ( #1354 )
2023-05-09 11:12:53 -03:00
Kamil Szurant
641500dcb9
Use current input for Impersonate (continue impersonate feature) ( #1147 )
2023-05-09 02:37:42 -03:00
IJumpAround
020fe7b50b
Remove mutable defaults from function signature. ( #1663 )
2023-05-08 22:55:41 -03:00
Matthew McAllister
d78b04f0b4
Add error message when GPTQ-for-LLaMa import fails ( #1871 )
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-05-08 22:29:09 -03:00
oobabooga
68dcbc7ebd
Fix chat history handling in instruct mode
2023-05-08 16:41:21 -03:00
Clay Shoaf
79ac94cc2f
fixed LoRA loading issue ( #1865 )
2023-05-08 16:21:55 -03:00
oobabooga
b5260b24f1
Add support for custom chat styles ( #1917 )
2023-05-08 12:35:03 -03:00
EgrorBs
d3ea70f453
More trust_remote_code=trust_remote_code ( #1899 )
2023-05-07 23:48:20 -03:00
oobabooga
56a5969658
Improve the separation between instruct/chat modes ( #1896 )
2023-05-07 23:47:02 -03:00
oobabooga
9754d6a811
Fix an error message
2023-05-07 17:44:05 -03:00
camenduru
ba65a48ec8
trust_remote_code=shared.args.trust_remote_code ( #1891 )
2023-05-07 17:42:44 -03:00
oobabooga
6b67cb6611
Generalize superbooga to chat mode
2023-05-07 15:05:26 -03:00
oobabooga
56f6b7052a
Sort dropdowns numerically
2023-05-05 23:14:56 -03:00
oobabooga
8aafb1f796
Refactor text_generation.py, add support for custom generation functions ( #1817 )
2023-05-05 18:53:03 -03:00
oobabooga
c728f2b5f0
Better handle new line characters in code blocks
2023-05-05 11:22:36 -03:00
oobabooga
00e333d790
Add MOSS support
2023-05-04 23:20:34 -03:00
oobabooga
f673f4a4ca
Change --verbose behavior
2023-05-04 15:56:06 -03:00
oobabooga
97a6a50d98
Use oasst tokenizer instead of universal tokenizer
2023-05-04 15:55:39 -03:00
oobabooga
b6ff138084
Add --checkpoint argument for GPTQ
2023-05-04 15:17:20 -03:00
Mylo
bd531c2dc2
Make --trust-remote-code work for all models ( #1772 )
2023-05-04 02:01:28 -03:00
oobabooga
0e6d17304a
Clearer syntax for instruction-following characters
2023-05-03 22:50:39 -03:00
oobabooga
9c77ab4fc2
Improve some warnings
2023-05-03 22:06:46 -03:00
oobabooga
057b1b2978
Add credits
2023-05-03 21:49:55 -03:00
oobabooga
95d04d6a8d
Better warning messages
2023-05-03 21:43:17 -03:00
oobabooga
f54256e348
Rename no_mmap to no-mmap
2023-05-03 09:50:31 -03:00
practicaldreamer
e3968f7dd0
Fix Training Pad Token ( #1678 )
Currently padding with 0 the character vs 0 the token id (<unk> in the case of llama)
2023-05-02 23:16:08 -03:00
Wojtab
80c2f25131
LLaVA: small fixes ( #1664 )
* change multimodal projector to the correct one
* remove reference to custom stopping strings from readme
* fix stopping strings if tokenizer extension adds/removes tokens
* add API example
* LLaVA 7B just dropped, add to readme that there is no support for it currently
2023-05-02 23:12:22 -03:00
oobabooga
4e09df4034
Only show extension in UI if it has an ui() function
2023-05-02 19:20:02 -03:00
Ahmed Said
fbcd32988e
added no_mmap & mlock parameters to llama.cpp and removed llamacpp_model_alternative ( #1649 )
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-05-02 18:25:28 -03:00
Carl Kenner
2f1a2846d1
Verbose should always print special tokens in input ( #1707 )
2023-05-02 01:24:56 -03:00
Alex "mcmonkey" Goodwin
0df0b2d0f9
optimize stopping strings processing ( #1625 )
2023-05-02 01:21:54 -03:00
oobabooga
c83210c460
Move the rstrips
2023-04-26 17:17:22 -03:00
oobabooga
1d8b8222e9
Revert #1579, apply the proper fix
Apparently models dislike trailing spaces.
2023-04-26 16:47:50 -03:00
oobabooga
9c2e7c0fab
Fix path on models.py
2023-04-26 03:29:09 -03:00
oobabooga
a777c058af
Precise prompts for instruct mode
2023-04-26 03:21:53 -03:00
oobabooga
a8409426d7
Fix bug in models.py
2023-04-26 01:55:40 -03:00
oobabooga
f642135517
Make universal tokenizer, xformers, sdp-attention apply to monkey patch
2023-04-25 23:18:11 -03:00
oobabooga
f39c99fa14
Load more than one LoRA with --lora, fix a bug
2023-04-25 22:58:48 -03:00
oobabooga
15940e762e
Fix missing initial space for LlamaTokenizer
2023-04-25 22:47:23 -03:00
Vincent Brouwers
92cdb4f22b
Seq2Seq support (including FLAN-T5) ( #1535 )
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-04-25 22:39:04 -03:00
Alex "mcmonkey" Goodwin
312cb7dda6
LoRA trainer improvements part 5 ( #1546 )
* full dynamic model type support on modern peft
* remove shuffle option
2023-04-25 21:27:30 -03:00
oobabooga
9b272bc8e5
Monkey patch fixes
2023-04-25 21:20:26 -03:00
oobabooga
da812600f4
Apply settings regardless of setup() function
2023-04-25 01:16:23 -03:00
da3dsoul
ebca3f86d5
Apply the settings for extensions after import, but before setup() ( #1484 )
2023-04-25 00:23:11 -03:00
oobabooga
b0ce750d4e
Add spaces
2023-04-25 00:10:21 -03:00
oobabooga
1a0c12c6f2
Refactor text-generation.py a bit
2023-04-24 19:24:12 -03:00
oobabooga
2f4f124132
Remove obsolete function
2023-04-24 13:27:24 -03:00
oobabooga
b6af2e56a2
Add --character flag, add character to settings.json
2023-04-24 13:19:42 -03:00
oobabooga
0c32ae27cc
Only load the default history if it's empty
2023-04-24 11:50:51 -03:00
eiery
78d1977ebf
add n_batch support for llama.cpp ( #1115 )
2023-04-24 03:46:18 -03:00
oobabooga
b1ee674d75
Make interface state (mostly) persistent on page reload
2023-04-24 03:05:47 -03:00
oobabooga
435f8cc0e7
Simplify some chat functions
2023-04-24 00:47:40 -03:00
Wojtab
12212cf6be
LLaVA support ( #1487 )
2023-04-23 20:32:22 -03:00
Andy Salerno
654933c634
New universal API with streaming/blocking endpoints ( #990 )
Previous title: Add api_streaming extension and update api-example-stream to use it
* Merge with latest main
* Add parameter capturing encoder_repetition_penalty
* Change some defaults, minor fixes
* Add --api, --public-api flags
* remove unneeded/broken comment from blocking API startup. The comment is already correctly emitted in try_start_cloudflared by calling the lambda we pass in.
* Update on_start message for blocking_api, it should say 'non-streaming' and not 'streaming'
* Update the API examples
* Change a comment
* Update README
* Remove the gradio API
* Remove unused import
* Minor change
* Remove unused import
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-04-23 15:52:43 -03:00
Alex "mcmonkey" Goodwin
459e725af9
Lora trainer docs ( #1493 )
2023-04-23 12:54:41 -03:00
oobabooga
c0b5c09860
Minor change
2023-04-22 15:15:31 -03:00
oobabooga
fcb594b90e
Don't require llama.cpp models to be placed in subfolders
2023-04-22 14:56:48 -03:00
oobabooga
7438f4f6ba
Change GPTQ triton default settings
2023-04-22 12:27:30 -03:00
USBhost
e1aa9d5173
Support upstream GPTQ once again. ( #1451 )
2023-04-21 12:43:56 -03:00
oobabooga
eddd016449
Minor deletion
2023-04-21 12:41:27 -03:00
oobabooga
d46b9b7c50
Fix evaluate comment saving
2023-04-21 12:34:08 -03:00
oobabooga
5e023ae64d
Change dropdown menu highlight color
2023-04-21 02:47:18 -03:00
oobabooga
c4f4f41389
Add an "Evaluate" tab to calculate the perplexities of models ( #1322 )
2023-04-21 00:20:33 -03:00
oobabooga
7bb9036ac9
Add universal LLaMA tokenizer support
2023-04-19 21:23:51 -03:00
Alex "mcmonkey" Goodwin
ee30625cd1
4-Bit LoRA training + several new training options and fixes
2023-04-19 19:39:03 -03:00
oobabooga
702fe92d42
Increase truncation_length_max value
2023-04-19 17:35:38 -03:00
oobabooga
9d9ae62938
Fix stopping strings in the gradio API
2023-04-19 13:52:21 -03:00
oobabooga
649e4017a5
Style improvements
2023-04-19 00:36:28 -03:00
oobabooga
000f65a2ef
Delete unused file
2023-04-18 04:01:14 -03:00
oobabooga
36f7c022f2
Rename a file
2023-04-18 01:38:33 -03:00
oobabooga
b069bb1f2e
Update monkey_patch_gradio.py
2023-04-18 01:32:42 -03:00
oobabooga
00186f76f4
Monkey patch gradio to prevent it from calling home
2023-04-18 01:13:16 -03:00
Tynan Burke
6a810b16b2
typo in training.py ( #1329 )
2023-04-17 21:40:46 -03:00
oobabooga
ac2973ffc6
Add a warning for --share
2023-04-17 19:34:28 -03:00
oobabooga
c544386824
Reset your name when choosing a character
2023-04-17 13:56:40 -03:00
oobabooga
c3dc348d1c
Don't show 'None' in the LoRA list
2023-04-17 13:52:23 -03:00
oobabooga
89bc540557
Update README
2023-04-17 10:55:35 -03:00
catalpaaa
07de7d0426
Load llamacpp before quantized model ( #1307 )
2023-04-17 10:47:26 -03:00
sgsdxzy
b57ffc2ec9
Update to support GPTQ triton commit c90adef ( #1229 )
2023-04-17 01:11:18 -03:00
oobabooga
39099663a0
Add 4-bit LoRA support ( #1200 )
2023-04-16 23:26:52 -03:00
oobabooga
46a8aa8c09
Readability
2023-04-16 21:26:19 -03:00
Forkoz
c6fe1ced01
Add ChatGLM support ( #1256 )
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-04-16 19:15:03 -03:00
oobabooga
6a03ad0824
Remove fix_newlines() calls from chat.py
2023-04-16 18:25:44 -03:00
oobabooga
5342f72968
Properly handle blockquote blocks
2023-04-16 18:00:12 -03:00
oobabooga
27f3a78834
Better detect when no model is loaded
2023-04-16 17:35:54 -03:00
oobabooga
c8ad960018
Add defaults to the gradio API
2023-04-16 17:33:28 -03:00
oobabooga
beb95f5fe2
Add a style for the "chat" mode
2023-04-16 16:44:50 -03:00
oobabooga
b937c9d8c2
Add skip_special_tokens checkbox for Dolly model ( #1218 )
2023-04-16 14:24:49 -03:00
oobabooga
b705b4210c
Minor changes to training.py
2023-04-16 03:08:37 -03:00
oobabooga
5c513a5f5c
Make training.py more readable
2023-04-16 02:46:27 -03:00
Alex "mcmonkey" Goodwin
a3eec62b50
Lora trainer improvements part 3 ( #1098 )
* add support for other model types
dependent on future-peft-changes but with fallback to function now
* use encoding=utf8 for training format
* make shuffling optional
and describe dropout a bit more
* add eval_steps to control evaluation
* make callbacks not depend on globals
* make save steps controllable
* placeholder of initial loading-existing-model support
and var name cleanup
* save/load parameters
* last bit of cleanup
* remove `gptq_bits` ref as main branch removed that setting
* add higher_rank_limit option
2048 is basically unreachable due to VRAM, but i trained at 1536 with batch size = 1 on a 7B model.
Note that it's in the do_train input just to save as a parameter
* fix math on save_steps
2023-04-16 02:35:13 -03:00
kernyan
ac19d5101f
revert incorrect eos_token_id change from #814 ( #1261 )
- fixes #1054
2023-04-16 01:47:01 -03:00
oobabooga
a2127239de
Fix a bug
2023-04-16 01:41:37 -03:00
oobabooga
9d3c6d2dc3
Fix a bug
2023-04-16 01:40:47 -03:00
Mikel Bober-Irizar
16a3a5b039
Merge pull request from GHSA-hv5m-3rp9-xcpf
* Remove eval of API input
* Remove unnecessary eval/exec for security
* Use ast.literal_eval
* Use ast.literal_eval
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-04-16 01:36:50 -03:00
oobabooga
d2ea925fa5
Bump llama-cpp-python to use LlamaCache
2023-04-16 00:53:40 -03:00
oobabooga
ac189011cb
Add "Save current settings for this model" button
2023-04-15 12:54:02 -03:00
oobabooga
abef355ed0
Remove deprecated flag
2023-04-15 01:21:19 -03:00
oobabooga
c3aa79118e
Minor generate_chat_prompt simplification
2023-04-14 23:02:08 -03:00
oobabooga
3a337cfded
Use argparse defaults
2023-04-14 15:35:06 -03:00
Alex "mcmonkey" Goodwin
64e3b44e0f
initial multi-lora support ( #1103 )
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-04-14 14:52:06 -03:00
oobabooga
1901d238e1
Minor change to API code
2023-04-14 12:11:47 -03:00
oobabooga
8e31f2bad4
Automatically set wbits/groupsize/instruct based on model name ( #1167 )
2023-04-14 11:07:28 -03:00
v0xie
9d66957207
Add --listen-host launch option ( #1122 )
2023-04-13 21:35:08 -03:00
oobabooga
a75e02de4d
Simplify GPTQ_loader.py
2023-04-13 12:13:07 -03:00
oobabooga
ca293bb713
Show a warning if two quantized models are found
2023-04-13 12:04:27 -03:00
oobabooga
8b482b4127
Merge #1073 from sgsdxzy/triton
* Multi-GPU support for triton
* Better quantized model filename detection
2023-04-13 11:31:21 -03:00
oobabooga
fde6d06167
Prioritize names with the groupsize in them
2023-04-13 11:27:03 -03:00
oobabooga
f2bf1a2c9e
Add some comments, remove obsolete code
2023-04-13 11:17:32 -03:00
Light
da74cd7c44
Generalized weight search path.
2023-04-13 21:43:32 +08:00
oobabooga
04866dc4fc
Add a warning for when no model is loaded
2023-04-13 10:35:08 -03:00
Light
cf58058c33
Change warmup_autotune to a negative switch.
2023-04-13 20:59:49 +08:00
Light
15d5a043f2
Merge remote-tracking branch 'origin/main' into triton
2023-04-13 19:38:51 +08:00
oobabooga
7dfbe54f42
Add --model-menu option
2023-04-12 21:24:26 -03:00
oobabooga
388038fb8e
Update settings-template.json
2023-04-12 18:30:43 -03:00
oobabooga
10e939c9b4
Merge branch 'main' of github.com:oobabooga/text-generation-webui
2023-04-12 17:21:59 -03:00
oobabooga
1566d8e344
Add model settings to the Models tab
2023-04-12 17:20:18 -03:00
Light
a405064ceb
Better dispatch.
2023-04-13 01:48:17 +08:00
Light
f3591ccfa1
Keep minimal change.
2023-04-12 23:26:06 +08:00
Lukas
5ad92c940e
lora training fixes: ( #970 )
Fix wrong input format being picked
Fix crash when an entry in the dataset has an attribute of value None
2023-04-12 11:38:01 -03:00