oobabooga | b0e8cb8c88 | Various fixes in chat mode | 2023-03-12 02:31:45 -03:00
oobabooga | 0bd5430988 | Use 'with' statement to better handle streaming memory | 2023-03-12 02:04:28 -03:00
oobabooga | 37f0166b2d | Fix memory leak in new streaming (second attempt) | 2023-03-11 23:14:49 -03:00
oobabooga | 92fe947721 | Merge branch 'main' into new-streaming | 2023-03-11 19:59:45 -03:00
oobabooga | 026d60bd34 | Remove default preset that didn't do anything | 2023-03-10 14:01:02 -03:00
oobabooga | e9dbdafb14 | Merge branch 'main' into pt-path-changes | 2023-03-10 11:03:42 -03:00
oobabooga | 706a03b2cb | Minor changes | 2023-03-10 11:02:25 -03:00
oobabooga | de7dd8b6aa | Add comments | 2023-03-10 10:54:08 -03:00
oobabooga | e461c0b7a0 | Move the import to the top | 2023-03-10 10:51:12 -03:00
deepdiffuser | 9fbd60bf22 | add no_split_module_classes to prevent tensor split error | 2023-03-10 05:30:47 -08:00
deepdiffuser | ab47044459 | add multi-gpu support for 4bit gptq LLaMA | 2023-03-10 04:52:45 -08:00
rohvani | 2ac2913747 | fix reference issue | 2023-03-09 20:13:23 -08:00
rohvani | 826e297b0e | add llama-65b-4bit support & multiple pt paths | 2023-03-09 18:31:32 -08:00
oobabooga | 9849aac0f1 | Don't show .pt models in the list | 2023-03-09 21:54:50 -03:00
oobabooga | 74102d5ee4 | Insert to the path instead of appending | 2023-03-09 20:51:22 -03:00
oobabooga | 2965aa1625 | Check if the .pt file exists | 2023-03-09 20:48:51 -03:00
oobabooga | 828a524f9a | Add LLaMA 4-bit support | 2023-03-09 15:50:26 -03:00
oobabooga | 59b5f7a4b7 | Improve usage of stopping_criteria | 2023-03-08 12:13:40 -03:00
oobabooga | add9330e5e | Bug fixes | 2023-03-08 11:26:29 -03:00
oobabooga | 33fb6aed74 | Minor bug fix | 2023-03-08 03:08:16 -03:00
oobabooga | ad2970374a | Readability improvements | 2023-03-08 03:00:06 -03:00
oobabooga | 72d539dbff | Better separate the FlexGen case | 2023-03-08 02:54:47 -03:00
oobabooga | 0e16c0bacb | Remove redeclaration of a function | 2023-03-08 02:50:49 -03:00
oobabooga | ab50f80542 | New text streaming method (much faster) | 2023-03-08 02:46:35 -03:00
oobabooga | 8e89bc596b | Fix encode() for RWKV | 2023-03-07 23:15:46 -03:00
oobabooga | 19a34941ed | Add proper streaming to RWKV | 2023-03-07 18:17:56 -03:00
oobabooga | 8660227e1b | Add top_k to RWKV | 2023-03-07 17:24:28 -03:00
oobabooga | 153dfeb4dd | Add --rwkv-cuda-on parameter, bump rwkv version | 2023-03-06 20:12:54 -03:00
oobabooga | 6904a507c6 | Change some parameters | 2023-03-06 16:29:43 -03:00
oobabooga | 20bd645f6a | Fix bug in multigpu setups (attempt 3) | 2023-03-06 15:58:18 -03:00
oobabooga | 09a7c36e1b | Minor improvement while running custom models | 2023-03-06 15:36:35 -03:00
oobabooga | 24c4c20391 | Fix bug in multigpu setups (attempt #2) | 2023-03-06 15:23:29 -03:00
oobabooga | d88b7836c6 | Fix bug in multigpu setups | 2023-03-06 14:58:30 -03:00
oobabooga | 5bed607b77 | Increase repetition frequency/penalty for RWKV | 2023-03-06 14:25:48 -03:00
oobabooga | bf56b6c1fb | Load settings.json without the need for --settings settings.json. This is for setting UI defaults | 2023-03-06 10:57:45 -03:00
oobabooga | e91f4bc25a | Add RWKV tokenizer | 2023-03-06 08:45:49 -03:00
oobabooga | c855b828fe | Better handle <USER> | 2023-03-05 17:01:47 -03:00
oobabooga | 2af66a4d4c | Fix <USER> in pygmalion replies | 2023-03-05 16:08:50 -03:00
oobabooga | a54b91af77 | Improve readability | 2023-03-05 10:21:15 -03:00
oobabooga | 8e706df20e | Fix a memory leak when text streaming is on | 2023-03-05 10:12:43 -03:00
oobabooga | c33715ad5b | Move towards HF LLaMA implementation | 2023-03-05 01:20:31 -03:00
oobabooga | bd8aac8fa4 | Add LLaMA 8-bit support | 2023-03-04 13:28:42 -03:00
oobabooga | c93f1fa99b | Count the tokens more conservatively | 2023-03-04 03:10:21 -03:00
oobabooga | ed8b35efd2 | Add --pin-weight parameter for FlexGen | 2023-03-04 01:04:02 -03:00
oobabooga | 05e703b4a4 | Print the performance information more reliably | 2023-03-03 21:24:32 -03:00
oobabooga | 5a79863df3 | Increase the sequence length, decrease batch size. I have no idea what I am doing | 2023-03-03 15:54:13 -03:00
oobabooga | a345a2acd2 | Add a tokenizer placeholder | 2023-03-03 15:16:55 -03:00
oobabooga | 5b354817f6 | Make chat minimally work with LLaMA | 2023-03-03 15:04:41 -03:00
oobabooga | ea5c5eb3da | Add LLaMA support | 2023-03-03 14:39:14 -03:00
oobabooga | 2bff646130 | Stop chat from flashing dark when processing | 2023-03-03 13:19:13 -03:00