oobabooga
|
eb30f4441f
|
Add ExLlama+LoRA support (#2756)
|
2023-06-19 12:31:24 -03:00 |
|
oobabooga
|
5f418f6171
|
Fix a memory leak (credits for the fix: Ph0rk0z)
|
2023-06-19 01:19:28 -03:00 |
|
ThisIsPIRI
|
def3b69002
|
Fix loading condition for universal llama tokenizer (#2753)
|
2023-06-18 18:14:06 -03:00 |
|
oobabooga
|
09c781b16f
|
Add modules/block_requests.py
This has become unnecessary, but it could be useful in the future
for other libraries.
|
2023-06-18 16:31:14 -03:00 |
|
Forkoz
|
3cae1221d4
|
Update exllama.py - Respect model dir parameter (#2744)
|
2023-06-18 13:26:30 -03:00 |
|
oobabooga
|
c5641b65d3
|
Handle leading spaces properly in ExLlama
|
2023-06-17 19:35:12 -03:00 |
|
oobabooga
|
05a743d6ad
|
Make llama.cpp use tfs parameter
|
2023-06-17 19:08:25 -03:00 |
|
oobabooga
|
e19cbea719
|
Add a variable to modules/shared.py
|
2023-06-17 19:02:29 -03:00 |
|
oobabooga
|
cbd63eeeff
|
Fix repeated tokens with exllama
|
2023-06-17 19:02:08 -03:00 |
|
oobabooga
|
766c760cd7
|
Use gen_begin_reuse in exllama
|
2023-06-17 18:00:10 -03:00 |
|
oobabooga
|
b27f83c0e9
|
Make exllama stoppable
|
2023-06-16 22:03:23 -03:00 |
|
oobabooga
|
7f06d551a3
|
Fix streaming callback
|
2023-06-16 21:44:56 -03:00 |
|
oobabooga
|
5f392122fd
|
Add gpu_split param to ExLlama
Adapted from code created by Ph0rk0z. Thank you Ph0rk0z.
|
2023-06-16 20:49:36 -03:00 |
|
oobabooga
|
9f40032d32
|
Add ExLlama support (#2444)
|
2023-06-16 20:35:38 -03:00 |
|
oobabooga
|
dea43685b0
|
Add some clarifications
|
2023-06-16 19:10:53 -03:00 |
|
oobabooga
|
7ef6a50e84
|
Reorganize model loading UI completely (#2720)
|
2023-06-16 19:00:37 -03:00 |
|
Tom Jobbins
|
646b0c889f
|
AutoGPTQ: Add UI and command line support for disabling fused attention and fused MLP (#2648)
|
2023-06-15 23:59:54 -03:00 |
|
oobabooga
|
2b9a6b9259
|
Merge remote-tracking branch 'refs/remotes/origin/main'
|
2023-06-14 18:45:24 -03:00 |
|
oobabooga
|
4d508cbe58
|
Add some checks to AutoGPTQ loader
|
2023-06-14 18:44:43 -03:00 |
|
FartyPants
|
56c19e623c
|
Add LoRA name instead of "default" in PeftModel (#2689)
|
2023-06-14 18:29:42 -03:00 |
|
oobabooga
|
474dc7355a
|
Allow API requests to use parameter presets
|
2023-06-14 11:32:20 -03:00 |
|
oobabooga
|
e471919e6d
|
Make llava/minigpt-4 work with AutoGPTQ
|
2023-06-11 17:56:01 -03:00 |
|
oobabooga
|
f4defde752
|
Add a menu for installing extensions
|
2023-06-11 17:11:06 -03:00 |
|
oobabooga
|
ac122832f7
|
Make dropdown menus more similar to automatic1111
|
2023-06-11 14:20:16 -03:00 |
|
oobabooga
|
6133675e0f
|
Add menus for saving presets/characters/instruction templates/prompts (#2621)
|
2023-06-11 12:19:18 -03:00 |
|
brandonj60
|
b04e18d10c
|
Add Mirostat v2 sampling to transformer models (#2571)
|
2023-06-09 21:26:31 -03:00 |
|
oobabooga
|
6015616338
|
Style changes
|
2023-06-06 13:06:05 -03:00 |
|
oobabooga
|
f040073ef1
|
Handle the case of older autogptq install
|
2023-06-06 13:05:05 -03:00 |
|
oobabooga
|
bc58dc40bd
|
Fix a minor bug
|
2023-06-06 12:57:13 -03:00 |
|
oobabooga
|
00b94847da
|
Remove softprompt support
|
2023-06-06 07:42:23 -03:00 |
|
oobabooga
|
0aebc838a0
|
Don't save the history for 'None' character
|
2023-06-06 07:21:07 -03:00 |
|
oobabooga
|
9f215523e2
|
Remove some unused imports
|
2023-06-06 07:05:46 -03:00 |
|
oobabooga
|
0f0108ce34
|
Never load the history for default character
|
2023-06-06 07:00:11 -03:00 |
|
oobabooga
|
11f38b5c2b
|
Add AutoGPTQ LoRA support
|
2023-06-05 23:32:57 -03:00 |
|
oobabooga
|
3a5cfe96f0
|
Increase chat_prompt_size_max
|
2023-06-05 17:37:37 -03:00 |
|
oobabooga
|
f276d88546
|
Use AutoGPTQ by default for GPTQ models
|
2023-06-05 15:41:48 -03:00 |
|
oobabooga
|
9b0e95abeb
|
Fix "regenerate" when "Start reply with" is set
|
2023-06-05 11:56:03 -03:00 |
|
oobabooga
|
19f78684e6
|
Add "Start reply with" feature to chat mode
|
2023-06-02 13:58:08 -03:00 |
|
GralchemOz
|
f7b07c4705
|
Fix the missing Chinese character bug (#2497)
|
2023-06-02 13:45:41 -03:00 |
|
oobabooga
|
2f6631195a
|
Add desc_act checkbox to the UI
|
2023-06-02 01:45:46 -03:00 |
|
LaaZa
|
9c066601f5
|
Extend AutoGPTQ support for any GPTQ model (#1668)
|
2023-06-02 01:33:55 -03:00 |
|
oobabooga
|
a83f9aa65b
|
Update shared.py
|
2023-06-01 12:08:39 -03:00 |
|
oobabooga
|
b6c407f51d
|
Don't stream at more than 24 fps
This is a performance optimization
|
2023-05-31 23:41:42 -03:00 |
|
Forkoz
|
9ab90d8b60
|
Fix warning for qlora (#2438)
|
2023-05-30 11:09:18 -03:00 |
|
oobabooga
|
3578dd3611
|
Change a warning message
|
2023-05-29 22:40:54 -03:00 |
|
oobabooga
|
3a6e194bc7
|
Change a warning message
|
2023-05-29 22:39:23 -03:00 |
|
Luis Lopez
|
9e7204bef4
|
Add tail-free and top-a sampling (#2357)
|
2023-05-29 21:40:01 -03:00 |
|
oobabooga
|
1394f44e14
|
Add triton checkbox for AutoGPTQ
|
2023-05-29 15:32:45 -03:00 |
|
oobabooga
|
f34d20922c
|
Minor fix
|
2023-05-29 13:31:17 -03:00 |
|
oobabooga
|
983eef1e29
|
Attempt at evaluating falcon perplexity (failed)
|
2023-05-29 13:28:25 -03:00 |
|
Honkware
|
204731952a
|
Falcon support (trust-remote-code and autogptq checkboxes) (#2367)
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
|
2023-05-29 10:20:18 -03:00 |
|
Forkoz
|
60ae80cf28
|
Fix hang in tokenizer for AutoGPTQ llama models. (#2399)
|
2023-05-28 23:10:10 -03:00 |
|
oobabooga
|
2f811b1bdf
|
Change a warning message
|
2023-05-28 22:48:20 -03:00 |
|
oobabooga
|
9ee1e37121
|
Fix return message when no model is loaded
|
2023-05-28 22:46:32 -03:00 |
|
oobabooga
|
00ebea0b2a
|
Use YAML for presets and settings
|
2023-05-28 22:34:12 -03:00 |
|
oobabooga
|
acfd876f29
|
Some qol changes to "Perplexity evaluation"
|
2023-05-25 15:06:22 -03:00 |
|
oobabooga
|
8efdc01ffb
|
Better default for compute_dtype
|
2023-05-25 15:05:53 -03:00 |
|
oobabooga
|
37d4ad012b
|
Add a button for rendering markdown for any model
|
2023-05-25 11:59:27 -03:00 |
|
DGdev91
|
cf088566f8
|
Make llama.cpp read prompt size and seed from settings (#2299)
|
2023-05-25 10:29:31 -03:00 |
|
oobabooga
|
361451ba60
|
Add --load-in-4bit parameter (#2320)
|
2023-05-25 01:14:13 -03:00 |
|
oobabooga
|
63ce5f9c28
|
Add back a missing bos token
|
2023-05-24 13:54:36 -03:00 |
|
Alex "mcmonkey" Goodwin
|
3cd7c5bdd0
|
LoRA Trainer: train_only_after option to control which part of your input to train on (#2315)
|
2023-05-24 12:43:22 -03:00 |
|
flurb18
|
d37a28730d
|
Beginning of multi-user support (#2262)
Adds a lock to generate_reply
|
2023-05-24 09:38:20 -03:00 |
|
Gabriel Terrien
|
7aed53559a
|
Support of the --gradio-auth flag (#2283)
|
2023-05-23 20:39:26 -03:00 |
|
oobabooga
|
fb6a00f4e5
|
Small AutoGPTQ fix
|
2023-05-23 15:20:01 -03:00 |
|
oobabooga
|
cd3618d7fb
|
Add support for RWKV in Hugging Face format
|
2023-05-23 02:07:28 -03:00 |
|
oobabooga
|
75adc110d4
|
Fix "perplexity evaluation" progress messages
|
2023-05-23 01:54:52 -03:00 |
|
oobabooga
|
4d94a111d4
|
memoize load_character to speed up the chat API
|
2023-05-23 00:50:58 -03:00 |
|
Gabriel Terrien
|
0f51b64bb3
|
Add a "dark_theme" option to settings.json (#2288)
|
2023-05-22 19:45:11 -03:00 |
|
oobabooga
|
c0fd7f3257
|
Add mirostat parameters for llama.cpp (#2287)
|
2023-05-22 19:37:24 -03:00 |
|
oobabooga
|
d63ef59a0f
|
Apply LLaMA-Precise preset to Vicuna by default
|
2023-05-21 23:00:42 -03:00 |
|
oobabooga
|
dcc3e54005
|
Various "impersonate" fixes
|
2023-05-21 22:54:28 -03:00 |
|
oobabooga
|
e116d31180
|
Prevent unwanted log messages from modules
|
2023-05-21 22:42:34 -03:00 |
|
oobabooga
|
fb91406e93
|
Fix generation_attempts continuing after an empty reply
|
2023-05-21 22:14:50 -03:00 |
|
oobabooga
|
e18534fe12
|
Fix "continue" in chat-instruct mode
|
2023-05-21 22:05:59 -03:00 |
|
oobabooga
|
8ac3636966
|
Add epsilon_cutoff/eta_cutoff parameters (#2258)
|
2023-05-21 15:11:57 -03:00 |
|
oobabooga
|
1e5821bd9e
|
Fix silero tts autoplay (attempt #2)
|
2023-05-21 13:25:11 -03:00 |
|
oobabooga
|
a5d5bb9390
|
Fix silero tts autoplay
|
2023-05-21 12:11:59 -03:00 |
|
oobabooga
|
05593a7834
|
Minor bug fix
|
2023-05-20 23:22:36 -03:00 |
|
Matthew McAllister
|
ab6acddcc5
|
Add Save/Delete character buttons (#1870)
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
|
2023-05-20 21:48:45 -03:00 |
|
oobabooga
|
c5af549d4b
|
Add chat API (#2233)
|
2023-05-20 18:42:17 -03:00 |
|
Konstantin Gukov
|
1b52bddfcc
|
Mitigate UnboundLocalError (#2136)
|
2023-05-19 14:46:18 -03:00 |
|
Alex "mcmonkey" Goodwin
|
50c70e28f0
|
LoRA Trainer improvements, part 6 - slightly better raw text inputs (#2108)
|
2023-05-19 12:58:54 -03:00 |
|
oobabooga
|
9d5025f531
|
Improve error handling while loading GPTQ models
|
2023-05-19 11:20:08 -03:00 |
|
oobabooga
|
b667ffa51d
|
Simplify GPTQ_loader.py
|
2023-05-17 16:22:56 -03:00 |
|
oobabooga
|
ef10ffc6b4
|
Add various checks to model loading functions
|
2023-05-17 16:14:54 -03:00 |
|
oobabooga
|
abd361b3a0
|
Minor change
|
2023-05-17 11:33:43 -03:00 |
|
oobabooga
|
21ecc3701e
|
Avoid a name conflict
|
2023-05-17 11:23:13 -03:00 |
|
oobabooga
|
fb91c07191
|
Minor bug fix
|
2023-05-17 11:16:37 -03:00 |
|
oobabooga
|
1a8151a2b6
|
Add AutoGPTQ support (basic) (#2132)
|
2023-05-17 11:12:12 -03:00 |
|
Alex "mcmonkey" Goodwin
|
1f50dbe352
|
Experimental jank multiGPU inference that's 2x faster than native somehow (#2100)
|
2023-05-17 10:41:09 -03:00 |
|
oobabooga
|
ce21804ec7
|
Allow extensions to define a new tab
|
2023-05-17 01:31:56 -03:00 |
|
oobabooga
|
a84f499718
|
Allow extensions to define custom CSS and JS
|
2023-05-17 00:30:54 -03:00 |
|
oobabooga
|
7584d46c29
|
Refactor models.py (#2113)
|
2023-05-16 19:52:22 -03:00 |
|
oobabooga
|
5cd6dd4287
|
Fix no-mmap bug
|
2023-05-16 17:35:49 -03:00 |
|
Forkoz
|
d205ec9706
|
Fix Training fails when evaluation dataset is selected (#2099)
Fixes https://github.com/oobabooga/text-generation-webui/issues/2078 from Googulator
|
2023-05-16 13:40:19 -03:00 |
|
atriantafy
|
26cf8c2545
|
add api port options (#1990)
|
2023-05-15 20:44:16 -03:00 |
|
Andrei
|
e657dd342d
|
Add in-memory cache support for llama.cpp (#1936)
|
2023-05-15 20:19:55 -03:00 |
|
Jakub Strnad
|
0227e738ed
|
Add settings UI for llama.cpp and fix reloading of llama.cpp models (#2087)
|
2023-05-15 19:51:23 -03:00 |
|
oobabooga
|
c07215cc08
|
Improve the default Assistant character
|
2023-05-15 19:39:08 -03:00 |
|
oobabooga
|
4e66f68115
|
Create get_max_memory_dict() function
|
2023-05-15 19:38:27 -03:00 |
|
AlphaAtlas
|
071f0776ad
|
Add llama.cpp GPU offload option (#2060)
|
2023-05-14 22:58:11 -03:00 |
|
oobabooga
|
3b886f9c9f
|
Add chat-instruct mode (#2049)
|
2023-05-14 10:43:55 -03:00 |
|
oobabooga
|
df37ba5256
|
Update impersonate_wrapper
|
2023-05-12 12:59:48 -03:00 |
|
oobabooga
|
e283ddc559
|
Change how spaces are handled in continue/generation attempts
|
2023-05-12 12:50:29 -03:00 |
|
oobabooga
|
2eeb27659d
|
Fix bug in --cpu-memory
|
2023-05-12 06:17:07 -03:00 |
|
oobabooga
|
5eaa914e1b
|
Fix settings.json being ignored because of config.yaml
|
2023-05-12 06:09:45 -03:00 |
|
oobabooga
|
71693161eb
|
Better handle spaces in LlamaTokenizer
|
2023-05-11 17:55:50 -03:00 |
|
oobabooga
|
7221d1389a
|
Fix a bug
|
2023-05-11 17:11:10 -03:00 |
|
oobabooga
|
0d36c18f5d
|
Always return only the new tokens in generation functions
|
2023-05-11 17:07:20 -03:00 |
|
oobabooga
|
394bb253db
|
Syntax improvement
|
2023-05-11 16:27:50 -03:00 |
|
oobabooga
|
f7dbddfff5
|
Add a variable for tts extensions to use
|
2023-05-11 16:12:46 -03:00 |
|
oobabooga
|
638c6a65a2
|
Refactor chat functions (#2003)
|
2023-05-11 15:37:04 -03:00 |
|
oobabooga
|
b7a589afc8
|
Improve the Metharme prompt
|
2023-05-10 16:09:32 -03:00 |
|
oobabooga
|
b01c4884cb
|
Better stopping strings for instruct mode
|
2023-05-10 14:22:38 -03:00 |
|
oobabooga
|
6a4783afc7
|
Add markdown table rendering
|
2023-05-10 13:41:23 -03:00 |
|
oobabooga
|
3316e33d14
|
Remove unused code
|
2023-05-10 11:59:59 -03:00 |
|
Alexander Dibrov
|
ec14d9b725
|
Fix custom_generate_chat_prompt (#1965)
|
2023-05-10 11:29:59 -03:00 |
|
oobabooga
|
32481ec4d6
|
Fix prompt order in the dropdown
|
2023-05-10 02:24:09 -03:00 |
|
oobabooga
|
dfd9ba3e90
|
Remove duplicate code
|
2023-05-10 02:07:22 -03:00 |
|
oobabooga
|
bdf1274b5d
|
Remove duplicate code
|
2023-05-10 01:34:04 -03:00 |
|
oobabooga
|
3913155c1f
|
Style improvements (#1957)
|
2023-05-09 22:49:39 -03:00 |
|
minipasila
|
334486f527
|
Added instruct-following template for Metharme (#1679)
|
2023-05-09 22:29:22 -03:00 |
|
Carl Kenner
|
814f754451
|
Support for MPT, INCITE, WizardLM, StableLM, Galactica, Vicuna, Guanaco, and Baize instruction following (#1596)
|
2023-05-09 20:37:31 -03:00 |
|
Wojtab
|
e9e75a9ec7
|
Generalize multimodality (llava/minigpt4 7b and 13b now supported) (#1741)
|
2023-05-09 20:18:02 -03:00 |
|
Wesley Pyburn
|
a2b25322f0
|
Fix trust_remote_code in wrong location (#1953)
|
2023-05-09 19:22:10 -03:00 |
|
LaaZa
|
218bd64bd1
|
Add the option to not automatically load the selected model (#1762)
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
|
2023-05-09 15:52:35 -03:00 |
|
Maks
|
cf6caf1830
|
Make the RWKV model cache the RNN state between messages (#1354)
|
2023-05-09 11:12:53 -03:00 |
|
Kamil Szurant
|
641500dcb9
|
Use current input for Impersonate (continue impersonate feature) (#1147)
|
2023-05-09 02:37:42 -03:00 |
|
IJumpAround
|
020fe7b50b
|
Remove mutable defaults from function signature. (#1663)
|
2023-05-08 22:55:41 -03:00 |
|
Matthew McAllister
|
d78b04f0b4
|
Add error message when GPTQ-for-LLaMa import fails (#1871)
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
|
2023-05-08 22:29:09 -03:00 |
|
oobabooga
|
68dcbc7ebd
|
Fix chat history handling in instruct mode
|
2023-05-08 16:41:21 -03:00 |
|
Clay Shoaf
|
79ac94cc2f
|
fixed LoRA loading issue (#1865)
|
2023-05-08 16:21:55 -03:00 |
|
oobabooga
|
b5260b24f1
|
Add support for custom chat styles (#1917)
|
2023-05-08 12:35:03 -03:00 |
|
EgrorBs
|
d3ea70f453
|
More trust_remote_code=trust_remote_code (#1899)
|
2023-05-07 23:48:20 -03:00 |
|
oobabooga
|
56a5969658
|
Improve the separation between instruct/chat modes (#1896)
|
2023-05-07 23:47:02 -03:00 |
|
oobabooga
|
9754d6a811
|
Fix an error message
|
2023-05-07 17:44:05 -03:00 |
|
camenduru
|
ba65a48ec8
|
trust_remote_code=shared.args.trust_remote_code (#1891)
|
2023-05-07 17:42:44 -03:00 |
|
oobabooga
|
6b67cb6611
|
Generalize superbooga to chat mode
|
2023-05-07 15:05:26 -03:00 |
|
oobabooga
|
56f6b7052a
|
Sort dropdowns numerically
|
2023-05-05 23:14:56 -03:00 |
|
oobabooga
|
8aafb1f796
|
Refactor text_generation.py, add support for custom generation functions (#1817)
|
2023-05-05 18:53:03 -03:00 |
|
oobabooga
|
c728f2b5f0
|
Better handle new line characters in code blocks
|
2023-05-05 11:22:36 -03:00 |
|
oobabooga
|
00e333d790
|
Add MOSS support
|
2023-05-04 23:20:34 -03:00 |
|
oobabooga
|
f673f4a4ca
|
Change --verbose behavior
|
2023-05-04 15:56:06 -03:00 |
|
oobabooga
|
97a6a50d98
|
Use oasst tokenizer instead of universal tokenizer
|
2023-05-04 15:55:39 -03:00 |
|
oobabooga
|
b6ff138084
|
Add --checkpoint argument for GPTQ
|
2023-05-04 15:17:20 -03:00 |
|
Mylo
|
bd531c2dc2
|
Make --trust-remote-code work for all models (#1772)
|
2023-05-04 02:01:28 -03:00 |
|
oobabooga
|
0e6d17304a
|
Clearer syntax for instruction-following characters
|
2023-05-03 22:50:39 -03:00 |
|
oobabooga
|
9c77ab4fc2
|
Improve some warnings
|
2023-05-03 22:06:46 -03:00 |
|
oobabooga
|
057b1b2978
|
Add credits
|
2023-05-03 21:49:55 -03:00 |
|