kalomaze
b6077b02e4
Quadratic sampling ( #5403 )
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-02-04 00:20:02 -03:00
Badis Ghoubali
40c7977f9b
Add roleplay.gbnf grammar ( #5368 )
2024-01-28 21:41:28 -03:00
sam-ngu
c0bdcee646
added trust_remote_code to deepspeed init loaderClass ( #5237 )
2024-01-26 11:10:57 -03:00
oobabooga
87dc421ee8
Bump exllamav2 to 0.0.12 ( #5352 )
2024-01-22 22:40:12 -03:00
oobabooga
aad73667af
Lint
2024-01-22 03:25:55 -08:00
lmg-anon
db1da9f98d
Fix logprobs tokens in OpenAI API ( #5339 )
2024-01-22 08:07:42 -03:00
Forkoz
5c5ef4cef7
UI: change n_gpu_layers maximum to 256 for larger models. ( #5262 )
2024-01-17 17:13:16 -03:00
ilya sheprut
4d14eb8b82
LoRA: Fix error "Attempting to unscale FP16 gradients" when training ( #5268 )
2024-01-17 17:11:49 -03:00
oobabooga
e055967974
Add prompt_lookup_num_tokens parameter ( #5296 )
2024-01-17 17:09:36 -03:00
oobabooga
b3fc2cd887
UI: Do not save unchanged extension settings to settings.yaml
2024-01-10 03:48:30 -08:00
oobabooga
53dc1d8197
UI: Do not save unchanged settings to settings.yaml
2024-01-09 18:59:04 -08:00
oobabooga
89e7e107fc
Lint
2024-01-09 16:27:50 -08:00
mamei16
bec4e0a1ce
Fix update event in refresh buttons ( #5197 )
2024-01-09 14:49:37 -03:00
oobabooga
4333d82b9d
Minor bug fix
2024-01-09 06:55:18 -08:00
oobabooga
953343cced
Improve the file saving/deletion menus
2024-01-09 06:33:47 -08:00
oobabooga
123f27a3c5
Load the nearest character after deleting a character
Instead of the first.
2024-01-09 06:24:27 -08:00
oobabooga
b908ed318d
Revert "Rename past chats -> chat history"
This reverts commit aac93a1fd6.
2024-01-09 05:26:07 -08:00
oobabooga
4ca82a4df9
Save light/dark theme on "Save UI defaults to settings.yaml"
2024-01-09 04:20:10 -08:00
oobabooga
7af50ede94
Reorder some buttons
2024-01-09 04:11:50 -08:00
oobabooga
a9f49a7574
Confirm the chat history rename with enter
2024-01-09 04:00:53 -08:00
oobabooga
7bdd2118a2
Change some log messages when deleting files
2024-01-09 03:32:01 -08:00
oobabooga
aac93a1fd6
Rename past chats -> chat history
2024-01-09 03:14:30 -08:00
oobabooga
615fa11af8
Move new chat button, improve history deletion handling
2024-01-08 21:22:37 -08:00
oobabooga
4f7e1eeafd
Past chat histories in a side bar on desktop ( #5098 )
Lots of room for improvement, but that's a start.
2024-01-09 01:57:29 -03:00
oobabooga
372ef5e2d8
Fix dynatemp parameters always visible
2024-01-08 19:42:31 -08:00
oobabooga
29c2693ea0
dynatemp_low, dynatemp_high, dynatemp_exponent parameters ( #5209 )
2024-01-08 23:28:35 -03:00
oobabooga
c4e005efec
Fix dropdown menus sometimes failing to refresh
2024-01-08 17:49:54 -08:00
oobabooga
9cd2106303
Revert "Add dynamic temperature to the random preset button"
This reverts commit 4365fb890f.
2024-01-08 16:46:24 -08:00
oobabooga
4365fb890f
Add dynamic temperature to the random preset button
2024-01-07 13:08:15 -08:00
oobabooga
0d07b3a6a1
Add dynamic_temperature_low parameter ( #5198 )
2024-01-07 17:03:47 -03:00
oobabooga
b8a0b3f925
Don't print torch tensors with --verbose
2024-01-07 10:35:55 -08:00
oobabooga
cf820c69c5
Print generation parameters with --verbose (HF only)
2024-01-07 10:06:23 -08:00
oobabooga
c4c7fc4ab3
Lint
2024-01-07 09:36:56 -08:00
kalomaze
48327cc5c4
Dynamic Temperature HF loader support ( #5174 )
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-01-07 10:36:26 -03:00
oobabooga
248742df1c
Save extension fields to settings.yaml on "Save UI defaults"
2024-01-04 20:33:42 -08:00
oobabooga
c9d814592e
Increase maximum temperature value to 5
2024-01-04 17:28:15 -08:00
oobabooga
e4d724eb3f
Fix cache_folder bug introduced in 37eff915d6
2024-01-04 07:49:40 -08:00
Alberto Cano
37eff915d6
Use --disk-cache-dir for all caches
2024-01-04 00:27:26 -03:00
Lounger
7965f6045e
Fix loading latest history for file names with dots ( #5162 )
2024-01-03 22:39:41 -03:00
AstrisCantCode
b80e6365d0
Fix various bugs for LoRA training ( #5161 )
2024-01-03 20:42:20 -03:00
oobabooga
7cce88c403
Remove an unnecessary exception
2024-01-02 07:20:59 -08:00
oobabooga
94afa0f9cf
Minor style changes
2024-01-01 16:00:22 -08:00
oobabooga
cbf6f9e695
Update some UI messages
2023-12-30 21:31:17 -08:00
oobabooga
2aad91f3c9
Remove deprecated command-line flags ( #5131 )
2023-12-31 02:07:48 -03:00
oobabooga
2734ce3e4c
Remove RWKV loader ( #5130 )
2023-12-31 02:01:40 -03:00
oobabooga
0e54a09bcb
Remove exllamav1 loaders ( #5128 )
2023-12-31 01:57:06 -03:00
oobabooga
8e397915c9
Remove --sdp-attention, --xformers flags ( #5126 )
2023-12-31 01:36:51 -03:00
B611
b7dd1f9542
Specify utf-8 encoding for model metadata file open ( #5125 )
2023-12-31 01:34:32 -03:00
oobabooga
c06f630bcc
Increase max_updates_second maximum value
2023-12-24 13:29:47 -08:00
oobabooga
8c60495878
UI: add "Maximum UI updates/second" parameter
2023-12-24 09:17:40 -08:00
zhangningboo
1b8b61b928
Fix output_ids decoding for Qwen/Qwen-7B-Chat ( #5045 )
2023-12-22 23:11:02 -03:00
Yiximail
afc91edcb2
Reset the model_name after unloading the model ( #5051 )
2023-12-22 22:18:24 -03:00
oobabooga
2706149c65
Organize the CMD arguments by group ( #5027 )
2023-12-21 00:33:55 -03:00
oobabooga
c727a70572
Remove redundancy from modules/loaders.py
2023-12-20 19:18:07 -08:00
luna
6efbe3009f
let exllama v1 models load safetensor loras ( #4854 )
2023-12-20 13:29:19 -03:00
oobabooga
bcba200790
Fix EOS being ignored in ExLlamav2 after previous commit
2023-12-20 07:54:06 -08:00
oobabooga
f0f6d9bdf9
Add HQQ back & update version
This reverts commit 2289e9031e.
2023-12-20 07:46:09 -08:00
oobabooga
b15f510154
Optimize ExLlamav2 (non-HF) loader
2023-12-20 07:31:42 -08:00
oobabooga
fadb295d4d
Lint
2023-12-19 21:36:57 -08:00
oobabooga
fb8ee9f7ff
Add a specific error if HQQ is missing
2023-12-19 21:32:58 -08:00
oobabooga
9992f7d8c0
Improve several log messages
2023-12-19 20:54:32 -08:00
oobabooga
23818dc098
Better logger
Credits: vladmandic/automatic
2023-12-19 20:38:33 -08:00
oobabooga
95600073bc
Add an informative error when extension requirements are missing
2023-12-19 20:20:45 -08:00
oobabooga
d8279dc710
Replace character name placeholders in chat context ( closes #5007 )
2023-12-19 17:31:46 -08:00
oobabooga
e83e6cedbe
Organize the model menu
2023-12-19 13:18:26 -08:00
oobabooga
f4ae0075e8
Fix conversion from old template format to jinja2
2023-12-19 13:16:52 -08:00
oobabooga
de138b8ba6
Add llama-cpp-python wheels with tensor cores support ( #5003 )
2023-12-19 17:30:53 -03:00
oobabooga
0a299d5959
Bump llama-cpp-python to 0.2.24 ( #5001 )
2023-12-19 15:22:21 -03:00
oobabooga
83cf1a6b67
Fix Yi space issue ( closes #4996 )
2023-12-19 07:54:19 -08:00
oobabooga
9847809a7a
Add a warning about ppl evaluation without --no_use_fast
2023-12-18 18:09:24 -08:00
oobabooga
f6d701624c
UI: mention that QuIP# does not work on Windows
2023-12-18 18:05:02 -08:00
oobabooga
a23a004434
Update the example template
2023-12-18 17:47:35 -08:00
Water
674be9a09a
Add HQQ quant loader ( #4888 )
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-12-18 21:23:16 -03:00
oobabooga
1f9e25e76a
UI: update "Saved instruction templates" dropdown after loading template
2023-12-17 21:19:06 -08:00
oobabooga
da1c8d77ea
Merge remote-tracking branch 'refs/remotes/origin/dev' into dev
2023-12-17 21:05:10 -08:00
oobabooga
cac89df97b
Instruction templates: better handle unwanted bos tokens
2023-12-17 21:04:30 -08:00
oobabooga
f0d6ead877
llama.cpp: read instruction template from GGUF metadata ( #4975 )
2023-12-18 01:51:58 -03:00
oobabooga
f1f2c4c3f4
Add --num_experts_per_token parameter (ExLlamav2) ( #4955 )
2023-12-17 12:08:33 -03:00
oobabooga
12690d3ffc
Better HF grammar implementation ( #4953 )
2023-12-17 02:01:23 -03:00
oobabooga
f8079d067d
UI: save the sent chat message on "no model is loaded" error
2023-12-16 10:52:41 -08:00
oobabooga
3bbf6c601d
AutoGPTQ: Add --disable_exllamav2 flag (Mixtral CPU offloading needs this)
2023-12-15 06:46:13 -08:00
oobabooga
2cb5b68ad9
Bug fix: when generation fails, save the sent message ( #4915 )
2023-12-15 01:01:45 -03:00
Kim Jaewon
e53f99faa0
[OpenAI Extension] Add 'max_logits' parameter in logits endpoint ( #4916 )
2023-12-15 00:22:43 -03:00
Lounger
5754f0c357
Fix deleting chat logs ( #4914 )
2023-12-13 21:54:43 -03:00
Bartowski
f51156705d
Allow symlinked folder within root directory ( #4863 )
2023-12-13 18:08:21 -03:00
Ixion
3f3960dbfb
Fixed invalid Jinja2 syntax in instruction templates ( #4911 )
2023-12-13 15:46:23 -03:00
oobabooga
fcf5512364
Jinja templates: fix a potential small bug
2023-12-13 10:19:39 -08:00
oobabooga
7f1a6a70e3
Update the llamacpp_HF comment
2023-12-12 21:04:20 -08:00
oobabooga
1c531a3713
Minor cleanup
2023-12-12 13:25:21 -08:00
oobabooga
8513028968
Fix lag in the chat tab during streaming
2023-12-12 13:01:25 -08:00
oobabooga
39d2fe1ed9
Jinja templates for Instruct and Chat ( #4874 )
2023-12-12 17:23:14 -03:00
oobabooga
aab0dd962d
Revert "Update callbacks.py to show tracebacks on ValueError ( #4892 )"
This reverts commit 993ca51a65.
2023-12-12 11:47:11 -08:00
Nehereus
993ca51a65
Update callbacks.py to show tracebacks on ValueError ( #4892 )
2023-12-12 02:29:27 -03:00
Morgan Schweers
602b8c6210
Make new browser reloads recognize current model. ( #4865 )
2023-12-11 02:51:01 -03:00
oobabooga
8c8825b777
Add QuIP# to README
2023-12-08 08:40:42 -08:00
oobabooga
2a335b8aa7
Cleanup: set shared.model_name only once
2023-12-08 06:35:23 -08:00
oobabooga
62d59a516f
Add trust_remote_code to all HF loaders
2023-12-08 06:29:26 -08:00
oobabooga
181743fd97
Fix missing spaces tokenizer issue ( closes #4834 )
2023-12-08 05:16:46 -08:00
Yiximail
1c74b3ab45
Fix partial unicode characters issue ( #4837 )
2023-12-08 09:50:53 -03:00
oobabooga
2c5a1e67f9
Parameters: change max_new_tokens & repetition_penalty_range defaults ( #4842 )
2023-12-07 20:04:52 -03:00
oobabooga
98361af4d5
Add QuIP# support ( #4803 )
It has to be installed manually for now.
2023-12-06 00:01:01 -03:00
oobabooga
6430acadde
Minor bug fix after https://github.com/oobabooga/text-generation-webui/pull/4814
2023-12-05 10:08:11 -08:00
oobabooga
0f828ea441
Do not limit API updates/second
2023-12-04 20:45:43 -08:00
oobabooga
9edb193def
Optimize HF text generation ( #4814 )
2023-12-05 00:00:40 -03:00
俞航
ac9f154bcc
Bump exllamav2 from 0.0.8 to 0.0.10 & Fix code change ( #4782 )
2023-12-04 21:15:05 -03:00
oobabooga
131a5212ce
UI: update context upper limit to 200000
2023-12-04 15:48:34 -08:00
oobabooga
be88b072e9
Update --loader flag description
2023-12-04 15:41:25 -08:00
oobabooga
7fc9033b2e
Recommend ExLlama_HF and ExLlamav2_HF
2023-12-04 15:28:46 -08:00
Lounger
7c0a17962d
Gallery improvements ( #4789 )
2023-12-03 22:45:50 -03:00
oobabooga
77d6ccf12b
Add a LOADER debug message while loading models
2023-11-30 12:00:32 -08:00
oobabooga
092a2c3516
Fix a bug in llama.cpp get_logits() function
2023-11-30 11:21:40 -08:00
oobabooga
2698d7c9fd
Fix llama.cpp model unloading
2023-11-29 15:19:48 -08:00
oobabooga
9940ed9c77
Sort the loaders
2023-11-29 15:13:03 -08:00
oobabooga
a7670c31ca
Sort
2023-11-28 18:43:33 -08:00
oobabooga
6e51bae2e0
Sort the loaders menu
2023-11-28 18:41:11 -08:00
oobabooga
68059d7c23
llama.cpp: minor log change & lint
2023-11-27 10:44:55 -08:00
tsukanov-as
9f7ae6bb2e
fix detection of stopping strings when HTML escaping is used ( #4728 )
2023-11-27 15:42:08 -03:00
oobabooga
0589ff5b12
Bump llama-cpp-python to 0.2.19 & add min_p and typical_p parameters to llama.cpp loader ( #4701 )
2023-11-21 20:59:39 -03:00
oobabooga
2769a1fa25
Hide deprecated args from Session tab
2023-11-21 15:15:16 -08:00
oobabooga
a2e6d00128
Use convert_ids_to_tokens instead of decode in logits endpoint
This preserves the llama tokenizer spaces.
2023-11-19 09:22:08 -08:00
oobabooga
9da7bb203d
Minor LoRA bug fix
2023-11-19 07:59:29 -08:00
oobabooga
a6f1e1bcc5
Fix PEFT LoRA unloading
2023-11-19 07:55:25 -08:00
oobabooga
ab94f0d9bf
Minor style change
2023-11-18 21:11:04 -08:00
oobabooga
5fcee696ea
New feature: enlarge character pictures on click ( #4654 )
2023-11-19 02:05:17 -03:00
oobabooga
ef6feedeb2
Add --nowebui flag for pure API mode ( #4651 )
2023-11-18 23:38:39 -03:00
oobabooga
0fa1af296c
Add /v1/internal/logits endpoint ( #4650 )
2023-11-18 23:19:31 -03:00
oobabooga
8f4f4daf8b
Add --admin-key flag for API ( #4649 )
2023-11-18 22:33:27 -03:00
Jordan Tucker
baab894759
fix: use system message in chat-instruct mode ( #4648 )
2023-11-18 20:20:13 -03:00
oobabooga
47d9e2618b
Refresh the Preset menu after saving a preset
2023-11-18 14:03:42 -08:00
oobabooga
83b64e7fc1
New feature: "random preset" button ( #4647 )
2023-11-18 18:31:41 -03:00
oobabooga
e0ca49ed9c
Bump llama-cpp-python to 0.2.18 (2nd attempt) ( #4637 )
* Update requirements*.txt
* Add back seed
2023-11-18 00:31:27 -03:00
oobabooga
9d6f79db74
Revert "Bump llama-cpp-python to 0.2.18 ( #4611 )"
This reverts commit 923c8e25fb.
2023-11-17 05:14:25 -08:00
oobabooga
13dc3b61da
Update README
2023-11-16 19:57:55 -08:00
oobabooga
8b66d83aa9
Set use_fast=True by default, create --no_use_fast flag
This increases tokens/second for HF loaders.
2023-11-16 19:55:28 -08:00
oobabooga
6525707a7f
Fix "send instruction template to..." buttons ( closes #4625 )
2023-11-16 18:16:42 -08:00
oobabooga
510a01ef46
Lint
2023-11-16 18:03:06 -08:00
oobabooga
923c8e25fb
Bump llama-cpp-python to 0.2.18 ( #4611 )
2023-11-16 22:55:14 -03:00
oobabooga
58c6001be9
Add missing exllamav2 samplers
2023-11-16 07:09:40 -08:00
oobabooga
cd41f8912b
Warn users about n_ctx / max_seq_len
2023-11-15 18:56:42 -08:00
oobabooga
9be48e83a9
Start API when "api" checkbox is checked
2023-11-15 16:35:47 -08:00
oobabooga
a85ce5f055
Add more info messages for truncation / instruction template
2023-11-15 16:20:31 -08:00
oobabooga
883701bc40
Alternative solution to 025da386a0
Fixes an error.
2023-11-15 16:04:02 -08:00
oobabooga
8ac942813c
Revert "Fix CPU memory limit error (issue #3763 ) ( #4597 )"
This reverts commit 025da386a0.
2023-11-15 16:01:54 -08:00
oobabooga
e6f44d6d19
Print context length / instruction template to terminal when loading models
2023-11-15 16:00:51 -08:00
oobabooga
e05d8fd441
Style changes
2023-11-15 15:51:37 -08:00
Andy Bao
025da386a0
Fix CPU memory limit error (issue #3763 ) ( #4597 )
get_max_memory_dict() was not properly formatting shared.args.cpu_memory
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-11-15 20:27:20 -03:00
oobabooga
4aabff3728
Remove old API, launch OpenAI API with --api
2023-11-10 06:39:08 -08:00
oobabooga
2af7e382b1
Revert "Bump llama-cpp-python to 0.2.14"
This reverts commit 5c3eb22ce6.
The new version has issues:
https://github.com/oobabooga/text-generation-webui/issues/4540
https://github.com/abetlen/llama-cpp-python/issues/893
2023-11-09 10:02:13 -08:00
oobabooga
21ed9a260e
Document the new "Custom system message" field
2023-11-08 17:54:10 -08:00
oobabooga
2358706453
Add /v1/internal/model/load endpoint (tentative)
2023-11-07 20:58:06 -08:00
oobabooga
43c53a7820
Refactor the /v1/models endpoint
2023-11-07 19:59:27 -08:00
oobabooga
1b69694fe9
Add types to the encode/decode/token-count endpoints
2023-11-07 19:32:14 -08:00
oobabooga
6e2e0317af
Separate context and system message in instruction formats ( #4499 )
2023-11-07 20:02:58 -03:00
oobabooga
5c0559da69
Training: fix .txt files now showing in dropdowns
2023-11-07 14:41:11 -08:00
oobabooga
af3d25a503
Disable logits_all in llamacpp_HF (makes processing 3x faster)
2023-11-07 14:35:48 -08:00
oobabooga
5c3eb22ce6
Bump llama-cpp-python to 0.2.14
2023-11-07 14:20:43 -08:00
oobabooga
ec17a5d2b7
Make OpenAI API the default API ( #4430 )
2023-11-06 02:38:29 -03:00
feng lui
4766a57352
transformers: add use_flash_attention_2 option ( #4373 )
2023-11-04 13:59:33 -03:00
wouter van der plas
add359379e
fixed two links in the ui ( #4452 )
2023-11-04 13:41:42 -03:00
oobabooga
aa5d671579
Add temperature_last parameter ( #4472 )
2023-11-04 13:09:07 -03:00
oobabooga
1ab8700d94
Change frequency/presence penalty ranges
2023-11-03 17:38:19 -07:00
oobabooga
45fcb60e7a
Make truncation_length_max apply to max_seq_len/n_ctx
2023-11-03 11:29:31 -07:00
oobabooga
7f9c1cbb30
Change min_p default to 0.0
2023-11-03 08:25:22 -07:00
oobabooga
4537853e2c
Change min_p default to 1.0
2023-11-03 08:13:50 -07:00
kalomaze
367e5e6e43
Implement Min P as a sampler option in HF loaders ( #4449 )
2023-11-02 16:32:51 -03:00
oobabooga
fcb7017b7a
Remove a checkbox
2023-11-02 12:24:09 -07:00
Julien Chaumond
fdcaa955e3
transformers: Add a flag to force load from safetensors ( #4450 )
2023-11-02 16:20:54 -03:00
oobabooga
c0655475ae
Add cache_8bit option
2023-11-02 11:23:04 -07:00
oobabooga
42f816312d
Merge remote-tracking branch 'refs/remotes/origin/dev' into dev
2023-11-02 11:09:26 -07:00
oobabooga
77abd9b69b
Add no_flash_attn option
2023-11-02 11:08:53 -07:00
Julien Chaumond
a56ef2a942
make torch.load a bit safer ( #4448 )
2023-11-02 14:07:08 -03:00
Mehran Ziadloo
aaf726dbfb
Updating the shared settings object when loading a model ( #4425 )
2023-11-01 01:29:57 -03:00
oobabooga
9bd0724d85
Change frequency/presence penalty ranges
2023-10-31 20:57:56 -07:00
Meheret
0707ed7677
updated wiki link ( #4415 )
2023-10-31 19:09:05 -03:00
oobabooga
262f8ae5bb
Use default gr.Dataframe for evaluation table
2023-10-27 06:49:14 -07:00
oobabooga
839a87bac8
Fix is_ccl_available & is_xpu_available imports
2023-10-26 20:27:04 -07:00
Abhilash Majumder
778a010df8
Intel Gpu support initialization ( #4340 )
2023-10-26 23:39:51 -03:00
oobabooga
92b2f57095
Minor metadata bug fix (second attempt)
2023-10-26 18:57:32 -07:00
tdrussell
72f6fc6923
Rename additive_repetition_penalty to presence_penalty, add frequency_penalty ( #4376 )
2023-10-25 12:10:28 -03:00
oobabooga
ef1489cd4d
Remove unused parameter in AutoAWQ
2023-10-23 20:45:43 -07:00
oobabooga
1edf321362
Lint
2023-10-23 13:09:03 -07:00
oobabooga
280ae720d7
Organize
2023-10-23 13:07:17 -07:00
oobabooga
49e5eecce4
Merge remote-tracking branch 'refs/remotes/origin/main'
2023-10-23 12:54:05 -07:00
oobabooga
306d764ff6
Minor metadata bug fix
2023-10-23 12:46:24 -07:00
adrianfiedler
4bc411332f
Fix broken links ( #4367 )
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-10-23 14:09:57 -03:00
oobabooga
92691ee626
Disable trust_remote_code by default
2023-10-23 09:57:44 -07:00
tdrussell
4440f87722
Add additive_repetition_penalty sampler setting. ( #3627 )
2023-10-23 02:28:07 -03:00
oobabooga
df90d03e0b
Replace --mul_mat_q with --no_mul_mat_q
2023-10-22 12:23:03 -07:00
Googulator
d0c3b407b3
transformers loader: multi-LoRAs support ( #3120 )
2023-10-22 16:06:22 -03:00
omo
4405513ca5
Option to select/target additional linear modules/layers in LORA training ( #4178 )
2023-10-22 15:57:19 -03:00
oobabooga
2d1b3332e4
Ignore warnings on Colab
2023-10-21 21:45:25 -07:00
oobabooga
09f807af83
Use ExLlama_HF for GPTQ models by default
2023-10-21 20:45:38 -07:00
oobabooga
506d05aede
Organize command-line arguments
2023-10-21 18:52:59 -07:00
oobabooga
fbac6d21ca
Add missing exception
2023-10-20 23:53:24 -07:00
Brian Dashore
3345da2ea4
Add flash-attention 2 for windows ( #4235 )
2023-10-21 03:46:23 -03:00
Johan
1d5a015ce7
Enable special token support for exllamav2 ( #4314 )
2023-10-21 01:54:06 -03:00
turboderp
ae8cd449ae
ExLlamav2_HF: Convert logits to FP32 ( #4310 )
2023-10-18 23:16:05 -03:00
oobabooga
f17f7a6913
Increase the evaluation table height
2023-10-16 12:55:35 -07:00
oobabooga
8ea554bc19
Check for torch.xpu.is_available()
2023-10-16 12:53:40 -07:00
oobabooga
188d20e9e5
Reduce the evaluation table height
2023-10-16 10:53:42 -07:00
oobabooga
2d44adbb76
Clear the torch cache while evaluating
2023-10-16 10:52:50 -07:00
oobabooga
71cac7a1b2
Increase the height of the evaluation table
2023-10-15 21:56:40 -07:00
oobabooga
e14bde4946
Minor improvements to evaluation logs
2023-10-15 20:51:43 -07:00
oobabooga
b88b2b74a6
Experimental Intel Arc transformers support (untested)
2023-10-15 20:51:11 -07:00
Forkoz
8cce1f1126
Exllamav2 lora support ( #4229 )
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2023-10-14 16:12:41 -03:00
oobabooga
773c17faec
Fix a warning
2023-10-10 20:53:38 -07:00
oobabooga
f63361568c
Fix safetensors kwarg usage in AutoAWQ
2023-10-10 19:03:09 -07:00
oobabooga
39f16ff83d
Fix default/notebook tabs css
2023-10-10 18:45:12 -07:00
oobabooga
fae8062d39
Bump to latest gradio (3.47) ( #4258 )
2023-10-10 22:20:49 -03:00
oobabooga
9fab9a1ca6
Minor fix
2023-10-10 14:08:11 -07:00
oobabooga
a49cc69a4a
Ignore rope_freq_base if value is 10000
2023-10-10 13:57:40 -07:00
oobabooga
3a9d90c3a1
Download models with 4 threads by default
2023-10-10 13:52:10 -07:00
Forkoz
35695e18c7
Remove import. ( #4247 )
For real this time.
2023-10-09 18:06:11 -03:00
Forkoz
2e471071af
Update llama_attn_hijack.py ( #4231 )
2023-10-08 15:16:48 -03:00
Brian Dashore
98fa73a974
Text Generation: stop if EOS token is reached ( #4213 )
2023-10-07 19:46:42 -03:00
Brian Dashore
7743b5e9de
Llamacpp_HF: Fix CFG cache init ( #4219 )
Documentation says that model.context_params should be sent when
a new context is created. The current code uses model.params which
doesn't exist.
Signed-off-by: kingbri <bdashore3@proton.me>
2023-10-07 19:38:29 -03:00
turboderp
8a98646a21
Bump ExLlamaV2 to 0.0.5 ( #4186 )
2023-10-05 19:12:22 -03:00
oobabooga
7ffb424c7b
Add AutoAWQ to README
2023-10-05 09:22:37 -07:00
cal066
cc632c3f33
AutoAWQ: initial support ( #3999 )
2023-10-05 13:19:18 -03:00
tdrussell
cb26163a20
Fix off-by-one error in exllama_hf caching logic ( #4145 )
2023-10-05 12:20:56 -03:00
oobabooga
ae4ba3007f
Add grammar to transformers and _HF loaders ( #4091 )
2023-10-05 10:01:36 -03:00
oobabooga
b6fe6acf88
Add threads_batch parameter
2023-10-01 21:28:00 -07:00
jllllll
41a2de96e5
Bump llama-cpp-python to 0.2.11
2023-10-01 18:08:10 -05:00
oobabooga
f2d82f731a
Add recommended NTKv1 alpha values
2023-09-29 13:48:38 -07:00
oobabooga
abe99cddeb
Extend evaluation slider bounds
2023-09-29 13:06:26 -07:00
oobabooga
96da2e1c0d
Read more metadata (config.json & quantize_config.json)
2023-09-29 06:14:16 -07:00
oobabooga
56b5a4af74
exllamav2 typical_p
2023-09-28 20:10:12 -07:00
oobabooga
f8e9733412
Minor syntax change
2023-09-28 19:32:35 -07:00
oobabooga
f931184b53
Increase truncation limits to 32768
2023-09-28 19:28:22 -07:00
oobabooga
1dd13e4643
Read Transformers config.json metadata
2023-09-28 19:19:47 -07:00
StoyanStAtanasov
7e6ff8d1f0
Enable NUMA feature for llama_cpp_python ( #4040 )
2023-09-26 22:05:00 -03:00
oobabooga
87ea2d96fd
Add a note about RWKV loader
2023-09-26 17:43:39 -07:00
oobabooga
0c89180966
Another minor fix
2023-09-26 06:54:21 -07:00
oobabooga
365335e1ae
Minor fix
2023-09-26 06:47:19 -07:00
oobabooga
1ca54faaf0
Improve --multi-user mode
2023-09-26 06:42:33 -07:00
oobabooga
019371c0b6
Lint
2023-09-25 20:31:11 -07:00
oobabooga
814520fed1
Extension install improvements
2023-09-25 20:27:06 -07:00
oobabooga
7f1460af29
Change a warning
2023-09-25 20:22:27 -07:00
oobabooga
862b45b1c7
Extension install improvements
2023-09-25 19:48:30 -07:00
oobabooga
c8952cce55
Move documentation from UI to docs/
2023-09-25 12:28:28 -07:00
oobabooga
d0d221df49
Add --use_fast option ( closes #3741 )
2023-09-25 12:19:43 -07:00
oobabooga
b973b91d73
Automatically filter by loader ( closes #4072 )
2023-09-25 10:28:35 -07:00
oobabooga
63de9eb24f
Clean up the transformers loader
2023-09-24 20:26:26 -07:00
oobabooga
36c38d7561
Add disable_exllama to Transformers loader (for GPTQ LoRA training)
2023-09-24 20:03:11 -07:00
oobabooga
55a685d999
Minor fixes
2023-09-24 14:15:10 -07:00
oobabooga
08cf150c0c
Add a grammar editor to the UI ( #4061 )
2023-09-24 18:05:24 -03:00
oobabooga
eb0b7c1053
Fix a minor UI bug
2023-09-24 07:17:33 -07:00
oobabooga
3edac43426
Remove print statement
2023-09-24 07:13:00 -07:00
oobabooga
b227e65d86
Add grammar to llama.cpp loader ( closes #4019 )
2023-09-24 07:10:45 -07:00
oobabooga
2e7b6b0014
Create alternative requirements.txt with AMD and Metal wheels ( #4052 )
2023-09-24 09:58:29 -03:00