Ayanami Rei
1b99ed61bc
add argument --gptq-model-type and remove duplicate arguments
2023-03-13 20:01:34 +03:00
Ayanami Rei
edbc61139f
use new quant loader
2023-03-13 20:00:38 +03:00
Ayanami Rei
345b6dee8c
refactor quant model loader and add support for OPT
2023-03-13 19:59:57 +03:00
oobabooga
66b6971b61
Update README
2023-03-13 12:44:18 -03:00
oobabooga
ddea518e0f
Document --auto-launch
2023-03-13 12:43:33 -03:00
oobabooga
372363bc3d
Fix GPTQ load_quant call on Windows
2023-03-13 12:07:02 -03:00
oobabooga
0c224cf4f4
Fix GALACTICA (#285)
2023-03-13 10:32:28 -03:00
oobabooga
2c4699a7e9
Change a comment
2023-03-13 00:20:02 -03:00
oobabooga
0a7acb3bd9
Remove redundant comments
2023-03-13 00:12:21 -03:00
oobabooga
77294b27dd
Use str(Path) instead of os.path.abspath(Path)
2023-03-13 00:08:01 -03:00
oobabooga
b9e0712b92
Fix Open Assistant
2023-03-12 23:58:25 -03:00
oobabooga
1ddcd4d0ba
Clean up silero_tts
...
This should only be used with --no-stream.
The shared.still_streaming implementation was faulty by design:
output_modifier should never be called when streaming is already over.
2023-03-12 23:42:49 -03:00
HideLord
683556f411
Adding markdown support and slight refactoring.
2023-03-12 21:34:09 +02:00
oobabooga
cebe8b390d
Remove useless "substring_found" variable
2023-03-12 15:50:38 -03:00
oobabooga
4bcd675ccd
Add *Is typing...* to regenerate as well
2023-03-12 15:23:33 -03:00
oobabooga
c7aa51faa6
Use a list of eos_tokens instead of just a number
...
This might be the cause of LLaMA ramblings that some people have experienced.
2023-03-12 14:54:58 -03:00
oobabooga
d8bea766d7
Merge pull request #192 from xanthousm/main
...
Add text generation stream status to shared module, use for better TTS with auto-play
2023-03-12 13:40:16 -03:00
oobabooga
fda376d9c3
Use os.path.abspath() instead of str()
2023-03-12 12:41:04 -03:00
HideLord
8403152257
Fixing compatibility with GPTQ repo commit 2f667f7da051967566a5fb0546f8614bcd3a1ccd. Expects string and breaks on
2023-03-12 17:28:15 +02:00
oobabooga
f3b00dd165
Merge pull request #224 from ItsLogic/llama-bits
...
Allow users to load 2, 3 and 4 bit llama models
2023-03-12 11:23:50 -03:00
oobabooga
65dda28c9d
Rename --llama-bits to --gptq-bits
2023-03-12 11:19:07 -03:00
oobabooga
fed3617f07
Move LLaMA 4-bit into a separate file
2023-03-12 11:12:34 -03:00
oobabooga
0ac562bdba
Add a default prompt for OpenAssistant oasst-sft-1-pythia-12b #253
2023-03-12 10:46:16 -03:00
oobabooga
78901d522b
Remove unused imports
2023-03-12 08:59:05 -03:00
Xan
b3e10e47c0
Fix merge conflict in text_generation
...
- `shared.still_streaming = False` needs to be set before the final `yield formatted_outputs`, so the position of some yields was shifted.
2023-03-12 18:56:35 +11:00
oobabooga
ad14f0e499
Fix regenerate (provisional way)
2023-03-12 03:42:29 -03:00
oobabooga
6e12068ba2
Merge pull request #258 from lxe/lxe/utf8
...
Load and save character files and chat history in UTF-8
2023-03-12 03:28:49 -03:00
oobabooga
e2da6b9685
Fix You You You appearing in chat mode
2023-03-12 03:25:56 -03:00
oobabooga
bcf0075278
Merge pull request #235 from xanthousm/Quality_of_life-main
...
--auto-launch and "Is typing..."
2023-03-12 03:12:56 -03:00
Aleksey Smolenchuk
3f7c3d6559
No need to set encoding on binary read
2023-03-11 22:10:57 -08:00
oobabooga
341e135036
Various fixes in chat mode
2023-03-12 02:53:08 -03:00
Aleksey Smolenchuk
3baf5fc700
Load and save chat history in utf-8
2023-03-11 21:40:01 -08:00
oobabooga
b0e8cb8c88
Various fixes in chat mode
2023-03-12 02:31:45 -03:00
unknown
433f6350bc
Load and save character files in UTF-8
2023-03-11 21:23:05 -08:00
oobabooga
0bd5430988
Use 'with' statement to better handle streaming memory
2023-03-12 02:04:28 -03:00
oobabooga
37f0166b2d
Fix memory leak in new streaming (second attempt)
2023-03-11 23:14:49 -03:00
oobabooga
92fe947721
Merge branch 'main' into new-streaming
2023-03-11 19:59:45 -03:00
oobabooga
2743dd736a
Add *Is typing...* to impersonate as well
2023-03-11 10:50:18 -03:00
Xan
96c51973f9
--auto-launch and "Is typing..."
...
- Added `--auto-launch` arg to open web UI in the default browser when ready.
- Changed chat.py to display user input immediately and "*Is typing...*" as a temporary reply while generating text. Most noticeable when using `--no-stream`.
2023-03-11 22:50:59 +11:00
Xan
33df4bd91f
Merge remote-tracking branch 'upstream/main'
2023-03-11 22:40:47 +11:00
draff
28fd4fc970
Change wording to be consistent with other args
2023-03-10 23:34:13 +00:00
draff
001e638b47
Make it actually work
2023-03-10 23:28:19 +00:00
draff
804486214b
Re-implement --load-in-4bit and update --llama-bits arg description
2023-03-10 23:21:01 +00:00
ItsLogic
9ba8156a70
remove unnecessary Path()
2023-03-10 22:33:58 +00:00
draff
e6c631aea4
Replace --load-in-4bit with --llama-bits
...
Replaces --load-in-4bit with a more flexible --llama-bits arg to allow for 2-bit and 3-bit models as well. This commit also fixes a loading issue with .pt files that are not in the root of the models folder.
2023-03-10 21:36:45 +00:00
oobabooga
026d60bd34
Remove default preset that didn't do anything
2023-03-10 14:01:02 -03:00
oobabooga
e9dbdafb14
Merge branch 'main' into pt-path-changes
2023-03-10 11:03:42 -03:00
oobabooga
706a03b2cb
Minor changes
2023-03-10 11:02:25 -03:00
oobabooga
de7dd8b6aa
Add comments
2023-03-10 10:54:08 -03:00
oobabooga
e461c0b7a0
Move the import to the top
2023-03-10 10:51:12 -03:00
deepdiffuser
9fbd60bf22
add no_split_module_classes to prevent tensor split error
2023-03-10 05:30:47 -08:00
deepdiffuser
ab47044459
add multi-gpu support for 4bit gptq LLaMA
2023-03-10 04:52:45 -08:00
rohvani
2ac2913747
fix reference issue
2023-03-09 20:13:23 -08:00
rohvani
826e297b0e
add llama-65b-4bit support & multiple pt paths
2023-03-09 18:31:32 -08:00
oobabooga
9849aac0f1
Don't show .pt models in the list
2023-03-09 21:54:50 -03:00
oobabooga
74102d5ee4
Insert to the path instead of appending
2023-03-09 20:51:22 -03:00
oobabooga
2965aa1625
Check if the .pt file exists
2023-03-09 20:48:51 -03:00
oobabooga
828a524f9a
Add LLaMA 4-bit support
2023-03-09 15:50:26 -03:00
oobabooga
59b5f7a4b7
Improve usage of stopping_criteria
2023-03-08 12:13:40 -03:00
oobabooga
add9330e5e
Bug fixes
2023-03-08 11:26:29 -03:00
Xan
5648a41a27
Merge branch 'main' of https://github.com/xanthousm/text-generation-webui
2023-03-08 22:08:54 +11:00
Xan
ad6b699503
Better TTS with autoplay
...
- Adds "still_streaming" to shared module for extensions to know if generation is complete
- Changed TTS extension with new options:
  - Show text under the audio widget
  - Automatically play the audio once text generation finishes
  - Manage the generated wav files (only keep files for finished generations, optional max file limit)
  - [wip] Ability to change voice pitch and speed
- added 'tensorboard' to requirements, since python sent "tensorboard not found" errors after a fresh installation.
2023-03-08 22:02:17 +11:00
oobabooga
33fb6aed74
Minor bug fix
2023-03-08 03:08:16 -03:00
oobabooga
ad2970374a
Readability improvements
2023-03-08 03:00:06 -03:00
oobabooga
72d539dbff
Better separate the FlexGen case
2023-03-08 02:54:47 -03:00
oobabooga
0e16c0bacb
Remove redeclaration of a function
2023-03-08 02:50:49 -03:00
oobabooga
ab50f80542
New text streaming method (much faster)
2023-03-08 02:46:35 -03:00
oobabooga
8e89bc596b
Fix encode() for RWKV
2023-03-07 23:15:46 -03:00
oobabooga
19a34941ed
Add proper streaming to RWKV
2023-03-07 18:17:56 -03:00
oobabooga
8660227e1b
Add top_k to RWKV
2023-03-07 17:24:28 -03:00
oobabooga
153dfeb4dd
Add --rwkv-cuda-on parameter, bump rwkv version
2023-03-06 20:12:54 -03:00
oobabooga
6904a507c6
Change some parameters
2023-03-06 16:29:43 -03:00
oobabooga
20bd645f6a
Fix bug in multigpu setups (attempt 3)
2023-03-06 15:58:18 -03:00
oobabooga
09a7c36e1b
Minor improvement while running custom models
2023-03-06 15:36:35 -03:00
oobabooga
24c4c20391
Fix bug in multigpu setups (attempt #2)
2023-03-06 15:23:29 -03:00
oobabooga
d88b7836c6
Fix bug in multigpu setups
2023-03-06 14:58:30 -03:00
oobabooga
5bed607b77
Increase repetition frequency/penalty for RWKV
2023-03-06 14:25:48 -03:00
oobabooga
bf56b6c1fb
Load settings.json without the need for --settings settings.json
...
This is for setting UI defaults
2023-03-06 10:57:45 -03:00
oobabooga
e91f4bc25a
Add RWKV tokenizer
2023-03-06 08:45:49 -03:00
oobabooga
c855b828fe
Better handle <USER>
2023-03-05 17:01:47 -03:00
oobabooga
2af66a4d4c
Fix <USER> in pygmalion replies
2023-03-05 16:08:50 -03:00
oobabooga
a54b91af77
Improve readability
2023-03-05 10:21:15 -03:00
oobabooga
8e706df20e
Fix a memory leak when text streaming is on
2023-03-05 10:12:43 -03:00
oobabooga
c33715ad5b
Move towards HF LLaMA implementation
2023-03-05 01:20:31 -03:00
oobabooga
bd8aac8fa4
Add LLaMA 8-bit support
2023-03-04 13:28:42 -03:00
oobabooga
c93f1fa99b
Count the tokens more conservatively
2023-03-04 03:10:21 -03:00
oobabooga
ed8b35efd2
Add --pin-weight parameter for FlexGen
2023-03-04 01:04:02 -03:00
oobabooga
05e703b4a4
Print the performance information more reliably
2023-03-03 21:24:32 -03:00
oobabooga
5a79863df3
Increase the sequence length, decrease batch size
...
I have no idea what I am doing
2023-03-03 15:54:13 -03:00
oobabooga
a345a2acd2
Add a tokenizer placeholder
2023-03-03 15:16:55 -03:00
oobabooga
5b354817f6
Make chat minimally work with LLaMA
2023-03-03 15:04:41 -03:00
oobabooga
ea5c5eb3da
Add LLaMA support
2023-03-03 14:39:14 -03:00
oobabooga
2bff646130
Stop chat from flashing dark when processing
2023-03-03 13:19:13 -03:00
oobabooga
169209805d
Model-aware prompts and presets
2023-03-02 11:25:04 -03:00
oobabooga
7bbe32f618
Don't return a value in an iterator function
2023-03-02 00:48:46 -03:00
oobabooga
ff9f649c0c
Remove some unused imports
2023-03-02 00:36:20 -03:00
oobabooga
1a05860ca3
Ensure proper no-streaming with generation_attempts > 1
2023-03-02 00:10:10 -03:00
oobabooga
a2a3e8f797
Add --rwkv-strategy parameter
2023-03-01 20:02:48 -03:00
oobabooga
449116a510
Fix RWKV paths on Windows (attempt)
2023-03-01 19:17:16 -03:00
oobabooga
955cf431e8
Minor consistency fix
2023-03-01 19:11:26 -03:00
oobabooga
f3da6dcc8f
Merge pull request #149 from oobabooga/RWKV
...
Add RWKV support
2023-03-01 16:57:45 -03:00
oobabooga
831ac7ed3f
Add top_p
2023-03-01 16:45:48 -03:00
oobabooga
7c4d5ca8cc
Improve the text generation call a bit
2023-03-01 16:40:25 -03:00
oobabooga
2f16ce309a
Rename a variable
2023-03-01 12:33:09 -03:00
oobabooga
9e9cfc4b31
Parameters
2023-03-01 12:19:37 -03:00
oobabooga
0f6708c471
Sort the imports
2023-03-01 12:18:17 -03:00
oobabooga
e735806c51
Add a generate() function for RWKV
2023-03-01 12:16:11 -03:00
oobabooga
659bb76722
Add RWKVModel class
2023-03-01 12:08:55 -03:00
oobabooga
9c86a1cd4a
Add RWKV pip package
2023-03-01 11:42:49 -03:00
oobabooga
6837d4d72a
Load the model by name
2023-02-28 02:52:29 -03:00
oobabooga
a1429d1607
Add default extensions to the settings
2023-02-28 02:20:11 -03:00
oobabooga
19ccb2aaf5
Handle <USER> and <BOT>
2023-02-28 01:05:43 -03:00
oobabooga
626da6c731
Handle {{user}} and {{char}} in example dialogue
2023-02-28 00:59:05 -03:00
oobabooga
e861e68e38
Move the chat example dialogue to the prompt
2023-02-28 00:50:46 -03:00
oobabooga
f871971de1
Trying to get the chat to work
2023-02-28 00:25:30 -03:00
oobabooga
67ee7bead7
Add cpu, bf16 options
2023-02-28 00:09:11 -03:00
oobabooga
ebd698905c
Add streaming to RWKV
2023-02-28 00:04:04 -03:00
oobabooga
70e522732c
Move RWKV loader into a separate file
2023-02-27 23:50:16 -03:00
oobabooga
ebc64a408c
RWKV support prototype
2023-02-27 23:03:35 -03:00
oobabooga
021bd55886
Better format the prompt when generation attempts > 1
2023-02-27 21:37:03 -03:00
oobabooga
43b6ab8673
Store thumbnails as files instead of base64 strings
...
This improves the UI responsiveness for large histories.
2023-02-27 13:41:00 -03:00
oobabooga
f24b6e78a3
Fix clear history
2023-02-26 23:58:04 -03:00
oobabooga
8e3e8a070f
Make FlexGen work with the newest API
2023-02-26 16:53:41 -03:00
oobabooga
3333f94c30
Make the gallery extension work on colab
2023-02-26 12:37:26 -03:00
oobabooga
633a2b6be2
Don't regenerate/remove last message if the chat is empty
2023-02-26 00:43:12 -03:00
oobabooga
6e843a11d6
Fix FlexGen in chat mode
2023-02-26 00:36:04 -03:00
oobabooga
4548227fb5
Downgrade gradio version (file uploads are broken in 3.19.1)
2023-02-25 22:59:02 -03:00
oobabooga
9456c1d6ed
Prevent streaming with no_stream + generation attempts > 1
2023-02-25 17:45:03 -03:00
oobabooga
32f40f3b42
Bump gradio version to 3.19.1
2023-02-25 17:20:03 -03:00
oobabooga
fa58fd5559
Proper way to free the cuda cache
2023-02-25 15:50:29 -03:00
oobabooga
b585e382c0
Rename the custom prompt generator function
2023-02-25 15:13:14 -03:00
oobabooga
700311ce40
Empty the cuda cache at model.generate()
2023-02-25 14:39:13 -03:00
oobabooga
1878acd9f3
Minor bug fix in chat
2023-02-25 09:30:59 -03:00
oobabooga
e71ff959f5
Clean up some unused code
2023-02-25 09:23:02 -03:00
oobabooga
91f5852245
Move bot_picture.py inside the extension
2023-02-25 03:00:19 -03:00
oobabooga
5ac24b019e
Minor fix in the extensions implementation
2023-02-25 02:53:18 -03:00
oobabooga
85f914b9b9
Disable the hijack after using it
2023-02-25 02:36:01 -03:00
oobabooga
7e9f13e29f
Rename a variable
2023-02-25 01:55:32 -03:00
oobabooga
1741c36092
Minor fix
2023-02-25 01:47:25 -03:00
oobabooga
7c2babfe39
Rename greed to "generation attempts"
2023-02-25 01:42:19 -03:00
oobabooga
2dfb999bf1
Add greed parameter
2023-02-25 01:31:01 -03:00
oobabooga
13f2688134
Better way to generate custom prompts
2023-02-25 01:08:17 -03:00
oobabooga
67623a52b7
Allow for permanent hijacking
2023-02-25 00:55:19 -03:00
oobabooga
111b5d42e7
Add prompt hijack option for extensions
2023-02-25 00:49:18 -03:00
oobabooga
7a527a5581
Move "send picture" into an extension
...
I am not proud of how I did it for now.
2023-02-25 00:23:51 -03:00
oobabooga
e51ece21c0
Add ui() function to extensions
2023-02-24 19:00:11 -03:00
oobabooga
78ad55641b
Remove duplicate max_new_tokens parameter
2023-02-24 17:19:42 -03:00
oobabooga
65326b545a
Move all gradio elements to shared (so that extensions can use them)
2023-02-24 16:46:50 -03:00
oobabooga
0817fe1beb
Move code back into the chatbot wrapper
2023-02-24 14:10:32 -03:00
oobabooga
8a7563ae84
Reorder the imports
2023-02-24 12:42:43 -03:00
oobabooga
ace74a557a
Add some comments
2023-02-24 12:41:27 -03:00
oobabooga
fe5057f932
Simplify the extensions implementation
2023-02-24 10:01:21 -03:00
oobabooga
2fb6ae6970
Move chat preprocessing into a separate function
2023-02-24 09:40:48 -03:00
oobabooga
f6f792363b
Separate command-line params by spaces instead of commas
2023-02-24 08:55:09 -03:00
oobabooga
e260e84e5a
Merge branch 'max_memory' of https://github.com/elwolf6/text-generation-webui into elwolf6-max_memory
2023-02-24 08:47:01 -03:00
oobabooga
146f786c57
Reorganize a bit
2023-02-24 08:44:54 -03:00
oobabooga
c2f4c395b9
Clean up some chat functions
2023-02-24 08:31:30 -03:00
luis
5abdc99a7c
gpu-memory arg change
2023-02-23 18:43:55 -05:00
oobabooga
9ae063e42b
Fix softprompts when deepspeed is active (#112)
2023-02-23 20:22:47 -03:00
oobabooga
dac6fe0ff4
Reset the history if no default history exists on reload
2023-02-23 19:53:50 -03:00
oobabooga
3b8cecbab7
Reload the default chat on page refresh
2023-02-23 19:50:23 -03:00
oobabooga
f1914115d3
Fix minor issue with chat logs
2023-02-23 16:04:47 -03:00
oobabooga
b78561fba6
Minor bug fix
2023-02-23 15:26:41 -03:00
oobabooga
2e86a1ec04
Move chat history into shared module
2023-02-23 15:11:18 -03:00
oobabooga
c87800341c
Move function to extensions module
2023-02-23 14:55:21 -03:00
oobabooga
2048b403a5
Reorder functions
2023-02-23 14:49:02 -03:00
oobabooga
7224343a70
Improve the imports
2023-02-23 14:41:42 -03:00
oobabooga
e46c43afa6
Move some stuff from server.py to modules
2023-02-23 13:42:23 -03:00
oobabooga
1dacd34165
Further refactor
2023-02-23 13:28:30 -03:00
oobabooga
ce7feb3641
Further refactor
2023-02-23 13:03:52 -03:00
oobabooga
98af4bfb0d
Refactor the code to make it more modular
2023-02-23 12:05:25 -03:00
oobabooga
bc856eb962
Add some more margin
2023-02-20 20:49:21 -03:00
oobabooga
f867285e3d
Make the circle a bit less red
2023-02-20 18:41:38 -03:00
oobabooga
e4440cd984
Make highlighted text gray in cai-chat mode
2023-02-20 16:43:32 -03:00
oobabooga
995bcfcf5e
Minor style change
2023-02-18 22:14:57 -03:00
oobabooga
d58544a420
Some minor formatting changes
2023-02-18 11:07:55 -03:00
oobabooga
3e6a8ccdce
Fix galactica latex css
2023-02-18 00:18:39 -03:00
oobabooga
14f49bbe9a
Fix galactica equations in dark mode
2023-02-17 23:57:09 -03:00
oobabooga
abb4667b44
Improve basic HTML style
2023-02-17 23:08:34 -03:00
oobabooga
00ca17abc9
Minor change
2023-02-17 22:52:03 -03:00
oobabooga
2fd003c044
Fix gpt4chan styles that were broken by gradio 3.18.0
2023-02-17 22:47:41 -03:00
oobabooga
0dd41e4830
Reorganize the sliders some more
2023-02-17 16:33:27 -03:00
oobabooga
6b9ac2f88e
Reorganize the generation parameters
2023-02-17 16:18:01 -03:00
oobabooga
3923ac967f
Create a cache for profile pictures (in RAM)
...
This is a performance optimization.
2023-02-17 14:30:39 -03:00
oobabooga
a6ddbbfc77
Add more fonts options
2023-02-17 11:30:04 -03:00
oobabooga
5eeb3f4e54
Make thumbnails for the profile pictures (for performance)
2023-02-17 10:58:54 -03:00
oobabooga
71c2764516
Fix the API docs in chat mode
2023-02-17 01:56:51 -03:00
oobabooga
33ad21c4f2
Make the profile pictures a bit larger
2023-02-17 00:35:17 -03:00
oobabooga
c4e87c109e
Include the bot's image as base64
...
This is needed for Colab.
2023-02-17 00:24:27 -03:00
oobabooga
aeddf902ec
Make the refresh button prettier
2023-02-16 21:55:20 -03:00
oobabooga
3746d72853
More style fixes
2023-02-15 21:13:12 -03:00
oobabooga
a55e8836f6
Bump gradio version
...
It looks uglier, but the old one was bugged and unstable.
2023-02-15 20:20:56 -03:00
oobabooga
1622059179
Move BLIP to the CPU
...
It's just as fast
2023-02-15 00:03:19 -03:00
oobabooga
8c3ef58e00
Use BLIP directly + some simplifications
2023-02-14 23:55:46 -03:00
SillyLossy
a7d98f494a
Use BLIP to send a picture to model
2023-02-15 01:38:21 +02:00
oobabooga
56bbc996a4
Minor CSS change for readability
2023-02-13 23:01:14 -03:00
oobabooga
61aed97439
Slightly increase a margin
2023-02-12 17:38:54 -03:00
oobabooga
76d3d7ddb3
Reorder the imports here too
2023-02-10 15:57:55 -03:00
oobabooga
d038963193
Rename a variable (for #59)
2023-02-07 23:26:02 -03:00
oobabooga
f38c9bf428
Fix deepspeed (oops)
2023-02-02 10:39:37 -03:00