text-generation-webui

mirror of https://github.com/oobabooga/text-generation-webui.git synced 2024-11-25 09:19:23 +01:00

Author	SHA1	Message	Date
oobabooga	a687f950ba	Remove the tensorcores llama.cpp wheels. They are not faster than the default wheels anymore and they use a lot of space.	2024-07-22 11:54:35 -07:00
oobabooga	e9d4bff7d0	Update the --tensor_split description	2024-07-20 22:04:48 -07:00
Alberto Cano	a14c510afb	Customize the subpath for gradio, use with reverse proxy (#5106)	2024-07-20 19:10:39 -03:00
oobabooga	aa7c14a463	Use chat-instruct mode by default	2024-07-19 21:43:52 -07:00
oobabooga	e436d69e2b	Add --no_xformers and --no_sdpa flags for ExllamaV2	2024-07-11 15:47:37 -07:00
GralchemOz	8a39f579d8	transformers: Add eager attention option to make Gemma-2 work properly (#6188)	2024-07-01 12:08:08 -03:00
oobabooga	577a8cd3ee	Add TensorRT-LLM support (#5715)	2024-06-24 02:30:03 -03:00
oobabooga	bd7cc4234d	Backend cleanup (#6025)	2024-05-21 13:32:02 -03:00
oobabooga	9f77ed1b98	--idle-timeout flag to unload the model if unused for N minutes (#6026)	2024-05-19 23:29:39 -03:00
oobabooga	e61055253c	Bump llama-cpp-python to 0.2.69, add --flash-attn option	2024-05-03 04:31:22 -07:00
oobabooga	51fb766bea	Add back my llama-cpp-python wheels, bump to 0.2.65 (#5964)	2024-04-30 09:11:31 -03:00
oobabooga	70845c76fb	Add back the max_updates_second parameter (#5937)	2024-04-26 10:14:51 -03:00
oobabooga	9b623b8a78	Bump llama-cpp-python to 0.2.64, use official wheels (#5921)	2024-04-23 23:17:05 -03:00
oobabooga	cbd65ba767	Add a simple min_p preset, make it the default (#5836)	2024-04-09 12:50:16 -03:00
oobabooga	168a0f4f67	UI: do not load the "gallery" extension by default	2024-04-06 12:43:21 -07:00
oobabooga	d423021a48	Remove CTransformers support (#5807)	2024-04-04 20:23:58 -03:00
oobabooga	2a92a842ce	Bump gradio to 4.23 (#5758)	2024-03-26 16:32:20 -03:00
oobabooga	28076928ac	UI: Add a new "User description" field for user personality/biography (#5691)	2024-03-11 23:41:57 -03:00
oobabooga	056717923f	Document StreamingLLM	2024-03-10 19:15:23 -07:00
oobabooga	afb51bd5d6	Add StreamingLLM for llamacpp & llamacpp_HF (2nd attempt) (#5669)	2024-03-09 00:25:33 -03:00
Bartowski	104573f7d4	Update cache_4bit documentation (#5649). Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>	2024-03-07 13:08:21 -03:00
oobabooga	2ec1d96c91	Add cache_4bit option for ExLlamaV2 (#5645)	2024-03-06 23:02:25 -03:00
oobabooga	2174958362	Revert gradio to 3.50.2 (#5640)	2024-03-06 11:52:46 -03:00
oobabooga	63a1d4afc8	Bump gradio to 4.19 (#5522)	2024-03-05 07:32:28 -03:00
oobabooga	a6730f88f7	Add --autosplit flag for ExLlamaV2 (#5524)	2024-02-16 15:26:10 -03:00
oobabooga	76d28eaa9e	Add a menu for customizing the instruction template for the model (#5521)	2024-02-16 14:21:17 -03:00
oobabooga	080f7132c0	Revert gradio to 3.50.2 (#5513)	2024-02-15 20:40:23 -03:00
oobabooga	7123ac3f77	Remove "Maximum UI updates/second" parameter (#5507)	2024-02-14 23:34:30 -03:00
oobabooga	acfbe6b3b3	Minor doc changes	2024-02-06 06:35:01 -08:00
oobabooga	8a6d9abb41	Small fixes	2024-02-06 06:26:27 -08:00
oobabooga	2a1063eff5	Revert "Remove non-HF ExLlamaV2 loader (#5431)". This reverts commit `cde000d478`.	2024-02-06 06:21:36 -08:00
oobabooga	8c35fefb3b	Add custom sampler order support (#5443)	2024-02-06 11:20:10 -03:00
Forkoz	2a45620c85	Split by rows instead of layers for llama.cpp multi-gpu (#5435)	2024-02-04 23:36:40 -03:00
oobabooga	cde000d478	Remove non-HF ExLlamaV2 loader (#5431)	2024-02-04 01:15:51 -03:00
oobabooga	e055967974	Add prompt_lookup_num_tokens parameter (#5296)	2024-01-17 17:09:36 -03:00
oobabooga	53dc1d8197	UI: Do not save unchanged settings to settings.yaml	2024-01-09 18:59:04 -08:00
oobabooga	2aad91f3c9	Remove deprecated command-line flags (#5131)	2023-12-31 02:07:48 -03:00
oobabooga	2734ce3e4c	Remove RWKV loader (#5130)	2023-12-31 02:01:40 -03:00
oobabooga	0e54a09bcb	Remove exllamav1 loaders (#5128)	2023-12-31 01:57:06 -03:00
oobabooga	8e397915c9	Remove --sdp-attention, --xformers flags (#5126)	2023-12-31 01:36:51 -03:00
oobabooga	8c60495878	UI: add "Maximum UI updates/second" parameter	2023-12-24 09:17:40 -08:00
oobabooga	2706149c65	Organize the CMD arguments by group (#5027)	2023-12-21 00:33:55 -03:00
oobabooga	9992f7d8c0	Improve several log messages	2023-12-19 20:54:32 -08:00
oobabooga	de138b8ba6	Add llama-cpp-python wheels with tensor cores support (#5003)	2023-12-19 17:30:53 -03:00
oobabooga	0a299d5959	Bump llama-cpp-python to 0.2.24 (#5001)	2023-12-19 15:22:21 -03:00
oobabooga	a23a004434	Update the example template	2023-12-18 17:47:35 -08:00
Water	674be9a09a	Add HQQ quant loader (#4888). Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>	2023-12-18 21:23:16 -03:00
oobabooga	f1f2c4c3f4	Add --num_experts_per_token parameter (ExLlamav2) (#4955)	2023-12-17 12:08:33 -03:00
oobabooga	3bbf6c601d	AutoGPTQ: Add --disable_exllamav2 flag (Mixtral CPU offloading needs this)	2023-12-15 06:46:13 -08:00
oobabooga	1c531a3713	Minor cleanup	2023-12-12 13:25:21 -08:00


290 Commits