Default Branch

30caac3a68 · llama : the WPM vocabs use the CLS token as BOS (#10930) · Updated 2024-12-24 08:44:20 +01:00

Branches

57349e1db3 · llama : allow overrides for tokenizer flags · Updated 2024-07-21 13:42:15 +02:00 · 957 behind, 1 ahead
1932a1b871 · gguf-py : do not use title case for naming convention · Updated 2024-07-20 22:55:06 +02:00 · 965 behind, 5 ahead
c8ee1bccdd · Fix Vulkan matmul tests compile errors · Updated 2024-07-20 08:01:18 +02:00 · 965 behind, 1 ahead
50d1a035f0 · convert_hf : fix Gemma v1 not setting BOS and EOS tokens · Updated 2024-07-20 04:46:35 +02:00 · 965 behind, 2 ahead
38061254b9 · gguf : handle null name during init · Updated 2024-07-19 12:45:00 +02:00 · 970 behind, 1 ahead
f6ea7a093c · llama : change fallback type IQ4_NL -> Q4_0 · Updated 2024-07-16 09:00:57 +02:00 · 986 behind, 1 ahead
b971122eb1 · convert_hf : fix memory leak in lazy MoE conversion · Updated 2024-07-16 03:11:44 +02:00 · 988 behind, 3 ahead
f89eaa921e · pydantic : fix Python 3.9 and 3.10 support · Updated 2024-07-14 03:52:45 +02:00 · 1002 behind, 2 ahead
59ce85318a · test-tokenizer-random : reduce potential confilcts with #8379 · Updated 2024-07-13 07:56:05 +02:00 · 1020 behind, 14 ahead
ba06b2deb7 · tokenize : add --no-parse-special option · Updated 2024-07-11 00:06:25 +02:00 · 1020 behind, 1 ahead
117f7adbd9 · ggml : remove K_QUANTS_PER_ITERATION (#8306) · Updated 2024-07-10 14:23:12 +02:00 · 1083 behind, 7 ahead
aaf7bc89e4 · Merge branch 'master' into compilade/gguf-py-fix-old-numpy · Updated 2024-07-09 06:10:06 +02:00 · 1038 behind, 2 ahead
86ccd30983 · ci : only show warnings and errors in python type-check · Updated 2024-07-07 20:10:42 +02:00 · 1053 behind, 10 ahead
a44f22e7d3 · py : use cpu-only torch in requirements.txt · Updated 2024-07-06 17:18:03 +02:00 · 1063 behind, 1 ahead
f55b647300 · llama : minor indentation during tensor loading · Updated 2024-07-04 18:34:04 +02:00 · 1085 behind, 16 ahead
dcab343f2f · use 1 seq for kl_divergence · Updated 2024-07-03 16:22:58 +02:00 · 1100 behind, 2 ahead
703764a382 · convert : use non-fast T5 tokenizer · Updated 2024-07-02 18:29:26 +02:00 · 1142 behind, 10 ahead
d4a1923d4e · minor : remove parentheses · Updated 2024-07-01 13:45:55 +02:00 · 1120 behind, 2 ahead
51f0bd50a1 · Remove custom pre attention scaling and use computed value instead. · Updated 2024-06-30 05:02:50 +02:00 · 1123 behind, 10 ahead
712e4d9450 · Generate full token count during warm up · Updated 2024-06-28 14:29:00 +02:00 · 1126 behind, 1 ahead
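The behind/ahead counts shown for each branch can be recomputed in a local clone with `git rev-list`. A minimal sketch, assuming the default branch is `master` and using `compilade/gguf-py-fix-old-numpy` (the one branch name visible in the listing above) as the example branch:

```shell
# Count commits that are only on master (behind) vs. only on the branch (ahead).
# The three-dot syntax compares against the merge base of the two refs.
git rev-list --left-right --count master...compilade/gguf-py-fix-old-numpy
# prints two tab-separated numbers: <behind> <ahead>
```

The left count is how far the branch has fallen behind the default branch; the right count is how many commits it carries that have not been merged back.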