LarryVRH
|
580c1ee748
|
Implement a demo HF wrapper for exllama to utilize existing HF transformers decoding. (#2777)
|
2023-06-21 15:31:42 -03:00 |
|
oobabooga
|
e19cbea719
|
Add a variable to modules/shared.py
|
2023-06-17 19:02:29 -03:00 |
|
oobabooga
|
5f392122fd
|
Add gpu_split param to ExLlama
Adapted from code created by Ph0rk0z. Thank you Ph0rk0z.
|
2023-06-16 20:49:36 -03:00 |
|
oobabooga
|
9f40032d32
|
Add ExLlama support (#2444)
|
2023-06-16 20:35:38 -03:00 |
|
oobabooga
|
7ef6a50e84
|
Reorganize model loading UI completely (#2720)
|
2023-06-16 19:00:37 -03:00 |
|
Tom Jobbins
|
646b0c889f
|
AutoGPTQ: Add UI and command line support for disabling fused attention and fused MLP (#2648)
|
2023-06-15 23:59:54 -03:00 |
|
oobabooga
|
00b94847da
|
Remove softprompt support
|
2023-06-06 07:42:23 -03:00 |
|
oobabooga
|
3a5cfe96f0
|
Increase chat_prompt_size_max
|
2023-06-05 17:37:37 -03:00 |
|
oobabooga
|
f276d88546
|
Use AutoGPTQ by default for GPTQ models
|
2023-06-05 15:41:48 -03:00 |
|
oobabooga
|
19f78684e6
|
Add "Start reply with" feature to chat mode
|
2023-06-02 13:58:08 -03:00 |
|
LaaZa
|
9c066601f5
|
Extend AutoGPTQ support for any GPTQ model (#1668)
|
2023-06-02 01:33:55 -03:00 |
|
oobabooga
|
a83f9aa65b
|
Update shared.py
|
2023-06-01 12:08:39 -03:00 |
|
Honkware
|
204731952a
|
Falcon support (trust-remote-code and autogptq checkboxes) (#2367)
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
|
2023-05-29 10:20:18 -03:00 |
|
oobabooga
|
2f811b1bdf
|
Change a warning message
|
2023-05-28 22:48:20 -03:00 |
|
oobabooga
|
00ebea0b2a
|
Use YAML for presets and settings
|
2023-05-28 22:34:12 -03:00 |
|
oobabooga
|
8efdc01ffb
|
Better default for compute_dtype
|
2023-05-25 15:05:53 -03:00 |
|
DGdev91
|
cf088566f8
|
Make llama.cpp read prompt size and seed from settings (#2299)
|
2023-05-25 10:29:31 -03:00 |
|
oobabooga
|
361451ba60
|
Add --load-in-4bit parameter (#2320)
|
2023-05-25 01:14:13 -03:00 |
|
flurb18
|
d37a28730d
|
Beginning of multi-user support (#2262)
Adds a lock to generate_reply
|
2023-05-24 09:38:20 -03:00 |
|
Gabriel Terrien
|
7aed53559a
|
Support of the --gradio-auth flag (#2283)
|
2023-05-23 20:39:26 -03:00 |
|
oobabooga
|
cd3618d7fb
|
Add support for RWKV in Hugging Face format
|
2023-05-23 02:07:28 -03:00 |
|
Gabriel Terrien
|
0f51b64bb3
|
Add a "dark_theme" option to settings.json (#2288)
|
2023-05-22 19:45:11 -03:00 |
|
oobabooga
|
d63ef59a0f
|
Apply LLaMA-Precise preset to Vicuna by default
|
2023-05-21 23:00:42 -03:00 |
|
oobabooga
|
e116d31180
|
Prevent unwanted log messages from modules
|
2023-05-21 22:42:34 -03:00 |
|
oobabooga
|
1a8151a2b6
|
Add AutoGPTQ support (basic) (#2132)
|
2023-05-17 11:12:12 -03:00 |
|
Alex "mcmonkey" Goodwin
|
1f50dbe352
|
Experimental jank multiGPU inference that's 2x faster than native somehow (#2100)
|
2023-05-17 10:41:09 -03:00 |
|
atriantafy
|
26cf8c2545
|
add api port options (#1990)
|
2023-05-15 20:44:16 -03:00 |
|
Andrei
|
e657dd342d
|
Add in-memory cache support for llama.cpp (#1936)
|
2023-05-15 20:19:55 -03:00 |
|
oobabooga
|
c07215cc08
|
Improve the default Assistant character
|
2023-05-15 19:39:08 -03:00 |
|
AlphaAtlas
|
071f0776ad
|
Add llama.cpp GPU offload option (#2060)
|
2023-05-14 22:58:11 -03:00 |
|
oobabooga
|
3b886f9c9f
|
Add chat-instruct mode (#2049)
|
2023-05-14 10:43:55 -03:00 |
|
oobabooga
|
e283ddc559
|
Change how spaces are handled in continue/generation attempts
|
2023-05-12 12:50:29 -03:00 |
|
oobabooga
|
5eaa914e1b
|
Fix settings.json being ignored because of config.yaml
|
2023-05-12 06:09:45 -03:00 |
|
oobabooga
|
f7dbddfff5
|
Add a variable for tts extensions to use
|
2023-05-11 16:12:46 -03:00 |
|
oobabooga
|
bdf1274b5d
|
Remove duplicate code
|
2023-05-10 01:34:04 -03:00 |
|
minipasila
|
334486f527
|
Added instruct-following template for Metharme (#1679)
|
2023-05-09 22:29:22 -03:00 |
|
Carl Kenner
|
814f754451
|
Support for MPT, INCITE, WizardLM, StableLM, Galactica, Vicuna, Guanaco, and Baize instruction following (#1596)
|
2023-05-09 20:37:31 -03:00 |
|
Wojtab
|
e9e75a9ec7
|
Generalize multimodality (llava/minigpt4 7b and 13b now supported) (#1741)
|
2023-05-09 20:18:02 -03:00 |
|
LaaZa
|
218bd64bd1
|
Add the option to not automatically load the selected model (#1762)
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
|
2023-05-09 15:52:35 -03:00 |
|
oobabooga
|
b5260b24f1
|
Add support for custom chat styles (#1917)
|
2023-05-08 12:35:03 -03:00 |
|
oobabooga
|
00e333d790
|
Add MOSS support
|
2023-05-04 23:20:34 -03:00 |
|
oobabooga
|
b6ff138084
|
Add --checkpoint argument for GPTQ
|
2023-05-04 15:17:20 -03:00 |
|
oobabooga
|
95d04d6a8d
|
Better warning messages
|
2023-05-03 21:43:17 -03:00 |
|
oobabooga
|
f54256e348
|
Rename no_mmap to no-mmap
|
2023-05-03 09:50:31 -03:00 |
|
Ahmed Said
|
fbcd32988e
|
added no_mmap & mlock parameters to llama.cpp and removed llamacpp_model_alternative (#1649)
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
|
2023-05-02 18:25:28 -03:00 |
|
oobabooga
|
a777c058af
|
Precise prompts for instruct mode
|
2023-04-26 03:21:53 -03:00 |
|
oobabooga
|
f39c99fa14
|
Load more than one LoRA with --lora, fix a bug
|
2023-04-25 22:58:48 -03:00 |
|
oobabooga
|
b6af2e56a2
|
Add --character flag, add character to settings.json
|
2023-04-24 13:19:42 -03:00 |
|
eiery
|
78d1977ebf
|
add n_batch support for llama.cpp (#1115)
|
2023-04-24 03:46:18 -03:00 |
|
oobabooga
|
b1ee674d75
|
Make interface state (mostly) persistent on page reload
|
2023-04-24 03:05:47 -03:00 |
|