217 Commits

Author SHA1 Message Date
GralchemOz
4c74c7a116 Fix UnicodeDecodeError for BPE-based Models (especially GLM-4) (#6357) 2024-09-02 23:00:59 -03:00
oobabooga
9dcff21da9 Remove unnecessary shared.previous_model_name variable 2024-07-28 18:35:11 -07:00
oobabooga
577a8cd3ee Add TensorRT-LLM support (#5715) 2024-06-24 02:30:03 -03:00
Belladore
46174a2d33 Fix error when bos_token_id is None. (#6061) 2024-06-12 22:52:27 -03:00
Belladore
a363cdfca1 Fix missing bos token for some models (including Llama-3) (#6050) 2024-05-27 09:21:30 -03:00
Philipp Emanuel Weidmann
852c943769 DRY: A modern repetition penalty that reliably prevents looping (#5677) 2024-05-19 23:53:47 -03:00
oobabooga
9f77ed1b98 --idle-timeout flag to unload the model if unused for N minutes (#6026) 2024-05-19 23:29:39 -03:00
oobabooga
a4611232b7 Make --verbose output less spammy 2024-05-18 09:57:00 -07:00
oobabooga
70845c76fb Add back the max_updates_second parameter (#5937) 2024-04-26 10:14:51 -03:00
oobabooga
6761b5e7c6 Improved instruct style (with syntax highlighting & LaTeX rendering) (#5936) 2024-04-26 10:13:11 -03:00
wangshuai09
fd4e46bce2 Add Ascend NPU support (basic) (#5541) 2024-04-11 18:42:20 -03:00
oobabooga
d423021a48 Remove CTransformers support (#5807) 2024-04-04 20:23:58 -03:00
oobabooga
13fe38eb27 Remove specialized code for gpt-4chan 2024-04-04 16:11:47 -07:00
oobabooga
35da6b989d Organize the parameters tab (#5767) 2024-03-28 16:45:03 -03:00
oobabooga
2a92a842ce Bump gradio to 4.23 (#5758) 2024-03-26 16:32:20 -03:00
oobabooga
cf0697936a Optimize StreamingLLM by over 10x 2024-03-08 21:48:28 -08:00
oobabooga
afb51bd5d6 Add StreamingLLM for llamacpp & llamacpp_HF (2nd attempt) (#5669) 2024-03-09 00:25:33 -03:00
oobabooga
2174958362 Revert gradio to 3.50.2 (#5640) 2024-03-06 11:52:46 -03:00
oobabooga
63a1d4afc8 Bump gradio to 4.19 (#5522) 2024-03-05 07:32:28 -03:00
kalomaze
cfb25c9b3f Cubic sampling w/ curve param (#5551) 2024-03-03 13:22:21 -03:00
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
oobabooga
080f7132c0 Revert gradio to 3.50.2 (#5513) 2024-02-15 20:40:23 -03:00
oobabooga
7123ac3f77 Remove "Maximum UI updates/second" parameter (#5507) 2024-02-14 23:34:30 -03:00
oobabooga
494cc3c5b0 Handle empty sampler priority field, use default values 2024-02-06 07:05:32 -08:00
oobabooga
2a1063eff5 Revert "Remove non-HF ExLlamaV2 loader (#5431)" 2024-02-06 06:21:36 -08:00
This reverts commit cde000d478.
oobabooga
8c35fefb3b Add custom sampler order support (#5443) 2024-02-06 11:20:10 -03:00
oobabooga
7073665a10 Truncate long chat completions inputs (#5439) 2024-02-05 02:31:24 -03:00
oobabooga
cde000d478 Remove non-HF ExLlamaV2 loader (#5431) 2024-02-04 01:15:51 -03:00
kalomaze
b6077b02e4 Quadratic sampling (#5403) 2024-02-04 00:20:02 -03:00
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
lmg-anon
db1da9f98d Fix logprobs tokens in OpenAI API (#5339) 2024-01-22 08:07:42 -03:00
oobabooga
e055967974 Add prompt_lookup_num_tokens parameter (#5296) 2024-01-17 17:09:36 -03:00
oobabooga
29c2693ea0 dynatemp_low, dynatemp_high, dynatemp_exponent parameters (#5209) 2024-01-08 23:28:35 -03:00
oobabooga
0d07b3a6a1 Add dynamic_temperature_low parameter (#5198) 2024-01-07 17:03:47 -03:00
oobabooga
b8a0b3f925 Don't print torch tensors with --verbose 2024-01-07 10:35:55 -08:00
oobabooga
cf820c69c5 Print generation parameters with --verbose (HF only) 2024-01-07 10:06:23 -08:00
kalomaze
48327cc5c4 Dynamic Temperature HF loader support (#5174) 2024-01-07 10:36:26 -03:00
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
oobabooga
2734ce3e4c Remove RWKV loader (#5130) 2023-12-31 02:01:40 -03:00
oobabooga
0e54a09bcb Remove exllamav1 loaders (#5128) 2023-12-31 01:57:06 -03:00
oobabooga
8c60495878 UI: add "Maximum UI updates/second" parameter 2023-12-24 09:17:40 -08:00
zhangningboo
1b8b61b928 Fix output_ids decoding for Qwen/Qwen-7B-Chat (#5045) 2023-12-22 23:11:02 -03:00
oobabooga
83cf1a6b67 Fix Yi space issue (closes #4996) 2023-12-19 07:54:19 -08:00
oobabooga
12690d3ffc Better HF grammar implementation (#4953) 2023-12-17 02:01:23 -03:00
oobabooga
8513028968 Fix lag in the chat tab during streaming 2023-12-12 13:01:25 -08:00
oobabooga
39d2fe1ed9 Jinja templates for Instruct and Chat (#4874) 2023-12-12 17:23:14 -03:00
oobabooga
181743fd97 Fix missing spaces tokenizer issue (closes #4834) 2023-12-08 05:16:46 -08:00
Yiximail
1c74b3ab45 Fix partial unicode characters issue (#4837) 2023-12-08 09:50:53 -03:00
oobabooga
6430acadde Minor bug fix after https://github.com/oobabooga/text-generation-webui/pull/4814 2023-12-05 10:08:11 -08:00
oobabooga
0f828ea441 Do not limit API updates/second 2023-12-04 20:45:43 -08:00
oobabooga
9edb193def Optimize HF text generation (#4814) 2023-12-05 00:00:40 -03:00
tsukanov-as
9f7ae6bb2e fix detection of stopping strings when HTML escaping is used (#4728) 2023-11-27 15:42:08 -03:00
oobabooga
1b69694fe9 Add types to the encode/decode/token-count endpoints 2023-11-07 19:32:14 -08:00