ddh0
65c4c15314
only support GGUFv3
2024-06-27 15:25:11 -05:00
ddh0
8a6784a7d3
Merge branch 'oobabooga:main' into main
2024-06-27 15:18:31 -05:00
oobabooga
8ec8bc0b85
UI: handle another edge case while streaming lists
2024-06-26 18:40:43 -07:00
oobabooga
0e138e4be1
Merge remote-tracking branch 'refs/remotes/origin/dev' into dev
2024-06-26 18:30:08 -07:00
mefich
a85749dcbe
Update models_settings.py: add default alpha_value, add proper compress_pos_emb for newer GGUFs (#6111)
2024-06-26 22:17:56 -03:00
oobabooga
5fe532a5ce
UI: remove DRY info text
It was visible for loaders without DRY.
2024-06-26 15:33:11 -07:00
oobabooga
b1187fc9a5
UI: prevent flickering while streaming lists / bullet points
2024-06-25 19:19:45 -07:00
oobabooga
3691451d00
Add back the "Rename chat" feature (#6161)
2024-06-25 22:28:58 -03:00
oobabooga
ac3f92d36a
UI: store chat history in the browser
2024-06-25 14:18:07 -07:00
oobabooga
46ca15cb79
Minor bug fixes after e7e1f5901e
2024-06-25 11:49:33 -07:00
oobabooga
83534798b2
UI: move "Character" dropdown to the main Chat tab
2024-06-25 11:25:57 -07:00
oobabooga
279cba607f
UI: don't show an animation when updating the "past chats" menu
2024-06-25 11:10:17 -07:00
oobabooga
3290edfad9
Bug fix: force chat history to be loaded on launch
2024-06-25 11:06:05 -07:00
oobabooga
e7e1f5901e
Prompts in the "past chats" menu (#6160)
2024-06-25 15:01:43 -03:00
oobabooga
a43c210617
Improved past chats menu (#6158)
2024-06-25 00:07:22 -03:00
oobabooga
96ba53d916
Handle another fix after 57119c1b30
2024-06-24 15:51:12 -07:00
oobabooga
577a8cd3ee
Add TensorRT-LLM support (#5715)
2024-06-24 02:30:03 -03:00
oobabooga
536f8d58d4
Do not expose alpha_value to llama.cpp & rope_freq_base to transformers
To avoid confusion
2024-06-23 22:09:24 -07:00
oobabooga
b48ab482f8
Remove obsolete "gptq_for_llama_info" message
2024-06-23 22:05:19 -07:00
oobabooga
5e8dc56f8a
Fix after previous commit
2024-06-23 21:58:28 -07:00
Louis Del Valle
57119c1b30
Update block_requests.py to resolve unexpected type error (500 error) (#5976)
2024-06-24 01:56:51 -03:00
CharlesCNorton
5993904acf
Fix several typos in the codebase (#6151)
2024-06-22 21:40:25 -03:00
GodEmperor785
2c5a9eb597
Change limits of RoPE scaling sliders in UI (#6142)
2024-06-19 21:42:17 -03:00
Guanghua Lu
229d89ccfb
Make logs more readable, no more \u7f16\u7801 (#6127)
2024-06-15 23:00:13 -03:00
Forkoz
1576227f16
Fix GGUFs with no BOS token present, mainly qwen2 models. (#6119)
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
2024-06-14 13:51:01 -03:00
oobabooga
10601850d9
Fix after previous commit
2024-06-13 19:54:12 -07:00
oobabooga
0f3a423de1
Alternative solution to "get next logits" deadlock (#6106)
2024-06-13 19:34:16 -07:00
oobabooga
386500aa37
Avoid unnecessary calls UI -> backend, to make it faster
2024-06-12 20:52:42 -07:00
Forkoz
1d79aa67cf
Fix flash-attn UI parameter to actually store true. (#6076)
2024-06-13 00:34:54 -03:00
Belladore
3abafee696
DRY sampler improvements (#6053)
2024-06-12 23:39:11 -03:00
oobabooga
a36fa73071
Lint
2024-06-12 19:00:21 -07:00
oobabooga
2d196ed2fe
Remove obsolete pre_layer parameter
2024-06-12 18:56:44 -07:00
Belladore
46174a2d33
Fix error when bos_token_id is None. (#6061)
2024-06-12 22:52:27 -03:00
ddh0
2ac2ed9d5d
improve GGUF metadata handling
- Support GGUFv2 (previously only v3 was supported)
- Support execution of big-endian GGUF files on big-endian hosts
- Fail if magic bytes b'GGUF' are wrong
2024-06-01 22:39:04 -05:00
Belladore
a363cdfca1
Fix missing bos token for some models (including Llama-3) (#6050)
2024-05-27 09:21:30 -03:00
oobabooga
8df68b05e9
Remove MinPLogitsWarper (it's now a transformers built-in)
2024-05-27 05:03:30 -07:00
oobabooga
4f1e96b9e3
Downloader: Add --model-dir argument, respect --model-dir in the UI
2024-05-23 20:42:46 -07:00
oobabooga
ad54d524f7
Revert "Fix stopping strings for llama-3 and phi (#6043)"
This reverts commit 5499bc9bc8.
2024-05-22 17:18:08 -07:00
oobabooga
5499bc9bc8
Fix stopping strings for llama-3 and phi (#6043)
2024-05-22 13:53:59 -03:00
oobabooga
9e189947d1
Minor fix after bd7cc4234d (thanks @belladoreai)
2024-05-21 10:37:30 -07:00
oobabooga
ae86292159
Fix getting Phi-3-small-128k-instruct logits
2024-05-21 10:35:00 -07:00
oobabooga
bd7cc4234d
Backend cleanup (#6025)
2024-05-21 13:32:02 -03:00
Philipp Emanuel Weidmann
852c943769
DRY: A modern repetition penalty that reliably prevents looping (#5677)
2024-05-19 23:53:47 -03:00
oobabooga
9f77ed1b98
--idle-timeout flag to unload the model if unused for N minutes (#6026)
2024-05-19 23:29:39 -03:00
altoiddealer
818b4e0354
Let grammar escape backslashes (#5865)
2024-05-19 20:26:09 -03:00
Tisjwlf
907702c204
Fix gguf multipart file loading (#5857)
2024-05-19 20:22:09 -03:00
A0nameless0man
5cb59707f3
fix: grammar does not support utf-8 (#5900)
2024-05-19 20:10:39 -03:00
Samuel Wein
b63dc4e325
UI: Warn user if they are trying to load a model from no path (#6006)
2024-05-19 20:05:17 -03:00
chr
6b546a2c8b
llama.cpp: increase the max threads from 32 to 256 (#5889)
2024-05-19 20:02:19 -03:00
oobabooga
a38a37b3b3
llama.cpp: default n_gpu_layers to the maximum value for the model automatically
2024-05-19 10:57:42 -07:00