oobabooga
|
9f77ed1b98
|
--idle-timeout flag to unload the model if unused for N minutes (#6026)
|
2024-05-19 23:29:39 -03:00 |
|
wangshuai09
|
fd4e46bce2
|
Add Ascend NPU support (basic) (#5541)
|
2024-04-11 18:42:20 -03:00 |
|
oobabooga
|
2a1063eff5
|
Revert "Remove non-HF ExLlamaV2 loader (#5431)"
This reverts commit cde000d478.
|
2024-02-06 06:21:36 -08:00 |
|
oobabooga
|
cde000d478
|
Remove non-HF ExLlamaV2 loader (#5431)
|
2024-02-04 01:15:51 -03:00 |
|
kalomaze
|
48327cc5c4
|
Dynamic Temperature HF loader support (#5174)
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
|
2024-01-07 10:36:26 -03:00 |
|
oobabooga
|
0e54a09bcb
|
Remove exllamav1 loaders (#5128)
|
2023-12-31 01:57:06 -03:00 |
|
Kim Jaewon
|
e53f99faa0
|
[OpenAI Extension] Add 'max_logits' parameter in logits endpoint (#4916)
|
2023-12-15 00:22:43 -03:00 |
|
oobabooga
|
a2e6d00128
|
Use convert_ids_to_tokens instead of decode in logits endpoint
This preserves the llama tokenizer spaces.
|
2023-11-19 09:22:08 -08:00 |
|
oobabooga
|
0fa1af296c
|
Add /v1/internal/logits endpoint (#4650)
|
2023-11-18 23:19:31 -03:00 |
|
Abhilash Majumder
|
778a010df8
|
Intel Gpu support initialization (#4340)
|
2023-10-26 23:39:51 -03:00 |
|
oobabooga
|
b062d50c45
|
Remove exllama import that causes problems
|
2023-09-17 18:00:32 -07:00 |
|
oobabooga
|
ad8ac545a5
|
Tokenization improvements
|
2023-09-17 07:02:00 -07:00 |
|
saltacc
|
cd08eb0753
|
token probs for non HF loaders (#3957)
|
2023-09-17 10:42:32 -03:00 |
|
oobabooga
|
c0b119c3a3
|
Improve logit viewer format
|
2023-08-22 20:35:12 -07:00 |
|
oobabooga
|
8545052c9d
|
Add the option to use samplers in the logit viewer
|
2023-08-22 20:18:16 -07:00 |
|
oobabooga
|
120fb86c6a
|
Add a simple logit viewer (#3636)
|
2023-08-20 20:49:21 -03:00 |
|