Commit Graph

185 Commits

Author SHA1 Message Date
oobabooga
19a34941ed Add proper streaming to RWKV 2023-03-07 18:17:56 -03:00
oobabooga
8660227e1b Add top_k to RWKV 2023-03-07 17:24:28 -03:00
oobabooga
20bd645f6a Fix bug in multigpu setups (attempt 3) 2023-03-06 15:58:18 -03:00
oobabooga
09a7c36e1b Minor improvement while running custom models 2023-03-06 15:36:35 -03:00
oobabooga
24c4c20391 Fix bug in multigpu setups (attempt #2) 2023-03-06 15:23:29 -03:00
oobabooga
d88b7836c6 Fix bug in multigpu setups 2023-03-06 14:58:30 -03:00
oobabooga
e91f4bc25a Add RWKV tokenizer 2023-03-06 08:45:49 -03:00
oobabooga
a54b91af77 Improve readability 2023-03-05 10:21:15 -03:00
oobabooga
8e706df20e Fix a memory leak when text streaming is on 2023-03-05 10:12:43 -03:00
oobabooga
c33715ad5b Move towards HF LLaMA implementation 2023-03-05 01:20:31 -03:00
oobabooga
c93f1fa99b Count the tokens more conservatively 2023-03-04 03:10:21 -03:00
oobabooga
05e703b4a4 Print the performance information more reliably 2023-03-03 21:24:32 -03:00
oobabooga
a345a2acd2 Add a tokenizer placeholder 2023-03-03 15:16:55 -03:00
oobabooga
5b354817f6 Make chat minimally work with LLaMA 2023-03-03 15:04:41 -03:00
oobabooga
ea5c5eb3da Add LLaMA support 2023-03-03 14:39:14 -03:00
oobabooga
7bbe32f618 Don't return a value in an iterator function 2023-03-02 00:48:46 -03:00
oobabooga
ff9f649c0c Remove some unused imports 2023-03-02 00:36:20 -03:00
oobabooga
955cf431e8 Minor consistency fix 2023-03-01 19:11:26 -03:00
oobabooga
831ac7ed3f Add top_p 2023-03-01 16:45:48 -03:00
oobabooga
7c4d5ca8cc Improve the text generation call a bit 2023-03-01 16:40:25 -03:00
oobabooga
0f6708c471 Sort the imports 2023-03-01 12:18:17 -03:00
oobabooga
e735806c51 Add a generate() function for RWKV 2023-03-01 12:16:11 -03:00
oobabooga
f871971de1 Trying to get the chat to work 2023-02-28 00:25:30 -03:00
oobabooga
ebd698905c Add streaming to RWKV 2023-02-28 00:04:04 -03:00
oobabooga
70e522732c Move RWKV loader into a separate file 2023-02-27 23:50:16 -03:00
oobabooga
ebc64a408c RWKV support prototype 2023-02-27 23:03:35 -03:00
oobabooga
6e843a11d6 Fix FlexGen in chat mode 2023-02-26 00:36:04 -03:00
oobabooga
fa58fd5559 Proper way to free the cuda cache 2023-02-25 15:50:29 -03:00
oobabooga
700311ce40 Empty the cuda cache at model.generate() 2023-02-25 14:39:13 -03:00
oobabooga
78ad55641b Remove duplicate max_new_tokens parameter 2023-02-24 17:19:42 -03:00
oobabooga
65326b545a Move all gradio elements to shared (so that extensions can use them) 2023-02-24 16:46:50 -03:00
oobabooga
9ae063e42b Fix softprompts when deepspeed is active (#112) 2023-02-23 20:22:47 -03:00
oobabooga
7224343a70 Improve the imports 2023-02-23 14:41:42 -03:00
oobabooga
1dacd34165 Further refactor 2023-02-23 13:28:30 -03:00
oobabooga
ce7feb3641 Further refactor 2023-02-23 13:03:52 -03:00