oobabooga
0aee7341d8
Properly count tokens/s for llama.cpp in chat mode
2023-03-31 17:04:32 -03:00
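A minimal sketch of the idea behind this fix, with `generate` and `encode` passed in as assumed helpers (not this repo's exact code): throughput must be computed over the newly generated tokens only, since in chat mode the prompt carries the entire conversation history.

```python
import time

def timed_generate(generate, encode, prompt):
    # `generate` is assumed to return the full text (prompt + reply);
    # `encode` tokenizes a string. Counting only the *new* tokens matters
    # in chat mode, where the prompt holds the whole history.
    t0 = time.time()
    output = generate(prompt)
    elapsed = time.time() - t0
    new_tokens = len(encode(output)) - len(encode(prompt))
    print(f"Output generated in {elapsed:.2f} seconds "
          f"({new_tokens / elapsed:.2f} tokens/s, {new_tokens} tokens)")
    return output
```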
oobabooga
09b0a3aafb
Add repetition_penalty
2023-03-31 14:45:17 -03:00
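For reference, a sketch of the standard CTRL-style repetition penalty as implemented in HF transformers (llama.cpp's own implementation may differ in detail): logits of tokens already present in the context are shrunk so they are less likely to be sampled again.

```python
import torch

def apply_repetition_penalty(logits, input_ids, penalty):
    # Divide positive logits of already-seen tokens by the penalty and
    # multiply negative ones, so both move toward "less likely".
    scores = logits.gather(-1, input_ids)
    scores = torch.where(scores > 0, scores / penalty, scores * penalty)
    return logits.scatter(-1, input_ids, scores)
```

A penalty of 1.0 is a no-op; values above 1.0 discourage repetition.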
oobabooga
9d1dcf880a
General improvements
2023-03-31 14:27:01 -03:00
Thomas Antony
a5f5736e74
Add to text_generation.py
2023-03-30 11:22:38 +01:00
oobabooga
1cb9246160
Adapt to the new model names
2023-03-29 21:47:36 -03:00
oobabooga
48a6c9513e
Merge pull request #572 from clusterfudge/issues/571
...
Potential fix for issues/571
2023-03-27 14:06:38 -03:00
oobabooga
af65c12900
Change Stop button behavior
2023-03-27 13:23:59 -03:00
Sean Fitzgerald
0bac80d9eb
Potential fix for issues/571
2023-03-25 13:08:45 -07:00
Forkoz
b740c5b284
Add display of context when input was generated
...
Not sure if I did this right, but the display moves with the conversation and its value seems to match.
2023-03-24 08:56:07 -05:00
oobabooga
4578e88ffd
Stop the bot from talking for you in chat mode
2023-03-23 21:38:20 -03:00
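A hedged sketch of one way to implement this (the function name and stop strings are illustrative): truncate the reply at the first point where the model starts writing the user's next turn.

```python
def clip_impersonation(reply, stop_strings=("\nYou:", "\nUser:")):
    # Cut the reply at the first occurrence of a stop string so the bot
    # cannot "talk for you" in chat mode.
    for stop in stop_strings:
        idx = reply.find(stop)
        if idx != -1:
            reply = reply[:idx]
    return reply
```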
oobabooga
bfa81e105e
Fix FlexGen streaming
2023-03-23 00:22:14 -03:00
oobabooga
de6a09dc7f
Properly separate the original prompt from the reply
2023-03-23 00:12:40 -03:00
wywywywy
61346b88ea
Add "seed" menu in the Parameters tab
2023-03-22 15:40:20 -03:00
oobabooga
45b7e53565
Only catch proper Exceptions in the text generation function
2023-03-20 20:36:02 -03:00
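A sketch of the principle, with the wrapped generator passed in as an assumption rather than this repo's exact code:

```python
import traceback

def run_safely(generator_fn, *args, **kwargs):
    # `except Exception` (never a bare `except:`) lets KeyboardInterrupt
    # and SystemExit propagate, so Ctrl+C can still stop the server
    # instead of being swallowed by the generation loop.
    try:
        yield from generator_fn(*args, **kwargs)
    except Exception:
        traceback.print_exc()
```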
oobabooga
75a7a84ef2
Exception handling (#454)
...
* Update text_generation.py
* Update extensions.py
2023-03-20 13:36:52 -03:00
oobabooga
ddb62470e9
--no-cache and --gpu-memory in MiB for fine-grained VRAM control
2023-03-19 19:21:41 -03:00
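A sketch of how MiB-granular caps map onto the HF/accelerate loader, under the assumption that the flag is forwarded as a `max_memory` dict (the model name and values are placeholders):

```python
from transformers import AutoModelForCausalLM

# Per-device caps; accelerate places layers so each device stays under
# its cap. "some/model" and the values below are placeholders.
max_memory = {0: "3500MiB", "cpu": "8GiB"}
model = AutoModelForCausalLM.from_pretrained(
    "some/model",
    device_map="auto",
    max_memory=max_memory,
)
```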
oobabooga
e26763a510
Minor changes
2023-03-17 22:56:46 -03:00
Wojtek Kowaluk
30939e2aee
Add MPS support on Apple Silicon
2023-03-18 00:56:23 +01:00
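A minimal device-selection sketch using the real `torch.backends.mps` API; the fallback order is illustrative:

```python
import torch

# Prefer the Apple Silicon GPU when PyTorch was built with MPS support,
# then CUDA, then CPU.
if torch.backends.mps.is_available():
    device = torch.device("mps")
elif torch.cuda.is_available():
    device = torch.device("cuda")
else:
    device = torch.device("cpu")
```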
oobabooga
a577fb1077
Keep GALACTICA special tokens (#300)
2023-03-16 00:46:59 -03:00
oobabooga
cf2da86352
Prevent *Is typing* from disappearing instantly while streaming
2023-03-15 12:51:13 -03:00
oobabooga
9d6a625bd6
Add 'hallucinations' filter #326
...
This breaks the API, since a new parameter has been added.
Updating clients should be a one-line fix; see api-example.py.
2023-03-15 11:10:35 -03:00
oobabooga
afc5339510
Remove "eval" statements from text generation functions
2023-03-14 16:04:17 -03:00
oobabooga
0c224cf4f4
Fix GALACTICA (#285)
2023-03-13 10:32:28 -03:00
oobabooga
b9e0712b92
Fix Open Assistant
2023-03-12 23:58:25 -03:00
oobabooga
1ddcd4d0ba
Clean up silero_tts
...
This extension should only be used with --no-stream.
The shared.still_streaming implementation was faulty by design:
output_modifier should never be called when streaming is already over.
2023-03-12 23:42:49 -03:00
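A simplified sketch of the resulting contract (the TTS helper is hypothetical): with --no-stream, the extension's `output_modifier` hook fires exactly once, on the finished reply, so no streaming flag is needed.

```python
def output_modifier(string):
    # Called once on the completed reply under --no-stream, so the audio
    # can be generated right here.
    wav_path = tts_to_wav(string)  # hypothetical TTS helper
    return f'<audio src="file/{wav_path}" controls autoplay></audio>'
```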
oobabooga
c7aa51faa6
Use a list of eos_tokens instead of just a number
...
This might be the cause of the LLaMA ramblings that some people have experienced.
2023-03-12 14:54:58 -03:00
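A hedged sketch of the technique using the real `transformers.StoppingCriteria` interface (the class and variable names are illustrative):

```python
import torch
import transformers

class MultiEosCriteria(transformers.StoppingCriteria):
    # Stop when the last generated token is *any* of several
    # end-of-sequence ids, instead of comparing against a single
    # eos_token_id.
    def __init__(self, eos_token_ids):
        self.eos_token_ids = set(eos_token_ids)

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
        return int(input_ids[0, -1]) in self.eos_token_ids
```

Such a criterion would be passed to `model.generate()` inside a `transformers.StoppingCriteriaList`.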
Xan
b3e10e47c0
Fix merge conflict in text_generation
...
- `shared.still_streaming = False` needs to be set before the final `yield formatted_outputs`; shifted the position of some yields accordingly.
2023-03-12 18:56:35 +11:00
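A sketch of the ordering issue (module path and token iterator are assumed): consumers observe the last chunk only after the generator yields it, so the flag has to flip first.

```python
import modules.shared as shared  # assumed module path

def generate_stream(prompt):
    # Consumers check shared.still_streaming after each chunk, so the
    # flag must flip to False *before* the final yield, or they never
    # observe completion.
    shared.still_streaming = True
    reply = ""
    for token in stream_tokens(prompt):  # assumed token iterator
        reply += token
        yield reply
    shared.still_streaming = False  # flip first...
    yield reply                     # ...then emit the final value
```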
oobabooga
341e135036
Various fixes in chat mode
2023-03-12 02:53:08 -03:00
oobabooga
b0e8cb8c88
Various fixes in chat mode
2023-03-12 02:31:45 -03:00
oobabooga
0bd5430988
Use 'with' statement to better handle streaming memory
2023-03-12 02:04:28 -03:00
oobabooga
37f0166b2d
Fix memory leak in new streaming (second attempt)
2023-03-11 23:14:49 -03:00
oobabooga
59b5f7a4b7
Improve usage of stopping_criteria
2023-03-08 12:13:40 -03:00
oobabooga
add9330e5e
Bug fixes
2023-03-08 11:26:29 -03:00
Xan
5648a41a27
Merge branch 'main' of https://github.com/xanthousm/text-generation-webui
2023-03-08 22:08:54 +11:00
Xan
ad6b699503
Better TTS with autoplay
...
- Adds "still_streaming" to shared module for extensions to know if generation is complete
- Changed TTS extension with new options:
- Show text under the audio widget
- Automatically play the audio once text generation finishes
- manage the generated wav files (only keep files for finished generations, optional max file limit)
- [wip] ability to change voice pitch and speed
- added 'tensorboard' to requirements, since python sent "tensorboard not found" errors after a fresh installation.
2023-03-08 22:02:17 +11:00
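A hedged consumer-side sketch of how an extension might use the flag described above (module path and Silero helper are assumed):

```python
import modules.shared as shared  # assumed module path

def output_modifier(string):
    # Skip mid-stream calls; synthesize and autoplay only once the
    # reply is finished.
    if getattr(shared, "still_streaming", False):
        return string
    wav_path = synthesize_to_wav(string)  # hypothetical Silero helper
    return string + f'\n<audio src="file/{wav_path}" controls autoplay></audio>'
```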
oobabooga
33fb6aed74
Minor bug fix
2023-03-08 03:08:16 -03:00
oobabooga
ad2970374a
Readability improvements
2023-03-08 03:00:06 -03:00
oobabooga
72d539dbff
Better separate the FlexGen case
2023-03-08 02:54:47 -03:00
oobabooga
ab50f80542
New text streaming method (much faster)
2023-03-08 02:46:35 -03:00
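The usual shape of this kind of change is a callback-to-iterator adapter; a self-contained sketch of that general pattern (not necessarily this commit's exact implementation):

```python
import queue
import threading

class Iteratorize:
    # Run a callback-based producer (e.g. model.generate with a
    # per-token callback) in a worker thread and expose its values as a
    # plain Python iterator via a queue.
    def __init__(self, func):
        self.q = queue.Queue()
        self.sentinel = object()

        def _callback(value):
            self.q.put(value)

        def _run():
            try:
                func(_callback)
            finally:
                self.q.put(self.sentinel)  # always signal completion

        threading.Thread(target=_run, daemon=True).start()

    def __iter__(self):
        return self

    def __next__(self):
        item = self.q.get()
        if item is self.sentinel:
            raise StopIteration
        return item
```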
oobabooga
8e89bc596b
Fix encode() for RWKV
2023-03-07 23:15:46 -03:00
oobabooga
19a34941ed
Add proper streaming to RWKV
2023-03-07 18:17:56 -03:00
oobabooga
8660227e1b
Add top_k to RWKV
2023-03-07 17:24:28 -03:00
oobabooga
20bd645f6a
Fix bug in multi-GPU setups (attempt 3)
2023-03-06 15:58:18 -03:00
oobabooga
09a7c36e1b
Minor improvement while running custom models
2023-03-06 15:36:35 -03:00
oobabooga
24c4c20391
Fix bug in multi-GPU setups (attempt 2)
2023-03-06 15:23:29 -03:00
oobabooga
d88b7836c6
Fix bug in multi-GPU setups
2023-03-06 14:58:30 -03:00
oobabooga
e91f4bc25a
Add RWKV tokenizer
2023-03-06 08:45:49 -03:00
oobabooga
a54b91af77
Improve readability
2023-03-05 10:21:15 -03:00
oobabooga
8e706df20e
Fix a memory leak when text streaming is on
2023-03-05 10:12:43 -03:00
oobabooga
c33715ad5b
Move towards HF LLaMA implementation
2023-03-05 01:20:31 -03:00
oobabooga
c93f1fa99b
Count the tokens more conservatively
2023-03-04 03:10:21 -03:00
oobabooga
05e703b4a4
Print the performance information more reliably
2023-03-03 21:24:32 -03:00
oobabooga
a345a2acd2
Add a tokenizer placeholder
2023-03-03 15:16:55 -03:00
oobabooga
5b354817f6
Make chat minimally work with LLaMA
2023-03-03 15:04:41 -03:00
oobabooga
ea5c5eb3da
Add LLaMA support
2023-03-03 14:39:14 -03:00
oobabooga
7bbe32f618
Don't return a value in an iterator function
2023-03-02 00:48:46 -03:00
oobabooga
ff9f649c0c
Remove some unused imports
2023-03-02 00:36:20 -03:00
oobabooga
955cf431e8
Minor consistency fix
2023-03-01 19:11:26 -03:00
oobabooga
831ac7ed3f
Add top_p
2023-03-01 16:45:48 -03:00
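For reference, a sketch of standard nucleus (top-p) filtering as commonly implemented for HF-style logits (illustrative, not necessarily this repo's exact code):

```python
import torch

def top_p_filter(logits, top_p):
    # Keep the smallest set of tokens whose cumulative probability
    # exceeds top_p; everything else gets -inf before sampling.
    sorted_logits, sorted_indices = torch.sort(logits, descending=True)
    cumulative_probs = torch.cumsum(torch.softmax(sorted_logits, dim=-1), dim=-1)
    to_remove = cumulative_probs > top_p
    to_remove[..., 1:] = to_remove[..., :-1].clone()  # always keep the top token
    to_remove[..., 0] = False
    indices_to_remove = to_remove.scatter(-1, sorted_indices, to_remove)
    return logits.masked_fill(indices_to_remove, float("-inf"))
```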
oobabooga
7c4d5ca8cc
Improve the text generation call a bit
2023-03-01 16:40:25 -03:00
oobabooga
0f6708c471
Sort the imports
2023-03-01 12:18:17 -03:00
oobabooga
e735806c51
Add a generate() function for RWKV
2023-03-01 12:16:11 -03:00
oobabooga
f871971de1
Trying to get the chat to work
2023-02-28 00:25:30 -03:00
oobabooga
ebd698905c
Add streaming to RWKV
2023-02-28 00:04:04 -03:00
oobabooga
70e522732c
Move RWKV loader into a separate file
2023-02-27 23:50:16 -03:00
oobabooga
ebc64a408c
RWKV support prototype
2023-02-27 23:03:35 -03:00
oobabooga
6e843a11d6
Fix FlexGen in chat mode
2023-02-26 00:36:04 -03:00
oobabooga
fa58fd5559
Proper way to free the CUDA cache
2023-02-25 15:50:29 -03:00
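A minimal sketch of the usual recipe with real PyTorch calls (the helper name is assumed):

```python
import gc
import torch

def clear_torch_cache():
    # Drop Python-side references first, then release cached CUDA
    # blocks back to the driver.
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
```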
oobabooga
700311ce40
Empty the CUDA cache at model.generate()
2023-02-25 14:39:13 -03:00
oobabooga
78ad55641b
Remove duplicate max_new_tokens parameter
2023-02-24 17:19:42 -03:00
oobabooga
65326b545a
Move all gradio elements to shared (so that extensions can use them)
2023-02-24 16:46:50 -03:00
oobabooga
9ae063e42b
Fix softprompts when deepspeed is active (#112)
2023-02-23 20:22:47 -03:00
oobabooga
7224343a70
Improve the imports
2023-02-23 14:41:42 -03:00
oobabooga
1dacd34165
Further refactor
2023-02-23 13:28:30 -03:00
oobabooga
ce7feb3641
Further refactor
2023-02-23 13:03:52 -03:00