Updated Tensor Encoding Schemes (markdown)

Brian 2024-07-31 22:38:28 +10:00
parent c5ee522ea7
commit 584c3525b5

@ -127,7 +127,5 @@ The 12 bytes in Q4_K `.scales` are packed a bit like this, where the uppercased
11: hhhhHHHH
```
- Note that this packs 6-bit scales and mins, with some values split across multiple bytes. This use of byte offsets and bitwise operations is likely done to be friendlier to parallel processing.
+ Note that this packs 6-bit scales and mins, with some values split across multiple bytes. This use of byte offsets and bitwise operations is likely done to be friendlier to SIMD processing. As [compilade](https://github.com/compilade) noted, he believes that indexing is only done at the byte level, so packing and unpacking the 6-bit values in this block requires bitwise operations. In his anecdotal experience, he also noticed while writing the vec_dot of Q1_3 that shuffles are surprisingly as fast as additions in SIMD.
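
To make the bitwise packing concrete, here is a minimal Python sketch of how eight 6-bit scales and eight 6-bit mins could be packed into, and unpacked from, the 12 `.scales` bytes. It assumes that uppercase letters in the layout denote scales and lowercase letters denote mins, and that the layout matches the one handled by ggml's `get_scale_min_k4` helper. The function names `pack_q4_k_scales` and `unpack_q4_k_scales` are hypothetical and chosen only for illustration; this is not the actual ggml implementation.

```python
def pack_q4_k_scales(scales, mins):
    """Pack eight 6-bit scales and eight 6-bit mins into 12 bytes.

    Assumed layout (illustrative, not taken from ggml verbatim):
      bytes 0..3  : low 6 bits = scale 0..3, top 2 bits = high bits of scale 4..7
      bytes 4..7  : low 6 bits = min 0..3,   top 2 bits = high bits of min 4..7
      bytes 8..11 : low nibble = low 4 bits of scale 4..7,
                    high nibble = low 4 bits of min 4..7
    """
    if len(scales) != 8 or len(mins) != 8:
        raise ValueError("expected 8 scales and 8 mins")
    if any(not 0 <= v < 64 for v in list(scales) + list(mins)):
        raise ValueError("values must fit in 6 bits")

    out = bytearray(12)
    for j in range(8):
        sc, mn = scales[j], mins[j]
        if j < 4:
            # First four values fit entirely in the low 6 bits of one byte.
            out[j] |= sc
            out[j + 4] |= mn
        else:
            # Last four values are split: low 4 bits share a byte,
            # high 2 bits go into the spare top bits of earlier bytes.
            out[j + 4] = (sc & 0x0F) | ((mn & 0x0F) << 4)
            out[j - 4] |= (sc >> 4) << 6
            out[j] |= (mn >> 4) << 6
    return bytes(out)


def unpack_q4_k_scales(packed):
    """Inverse of pack_q4_k_scales: returns (scales, mins) as lists of ints."""
    if len(packed) != 12:
        raise ValueError("expected exactly 12 bytes")

    scales, mins = [], []
    for j in range(8):
        if j < 4:
            sc = packed[j] & 0x3F
            mn = packed[j + 4] & 0x3F
        else:
            sc = (packed[j + 4] & 0x0F) | ((packed[j - 4] >> 6) << 4)
            mn = (packed[j + 4] >> 4) | ((packed[j] >> 6) << 4)
        scales.append(sc)
        mins.append(mn)
    return scales, mins


if __name__ == "__main__":
    example_scales = [0, 1, 63, 32, 17, 48, 5, 63]
    example_mins = [63, 0, 7, 33, 60, 15, 16, 42]
    packed = pack_q4_k_scales(example_scales, example_mins)
    print(packed.hex())
    print(unpack_q4_k_scales(packed))
```

Every access in this sketch indexes whole bytes and then uses masks and shifts to isolate the 6-bit fields, which is the byte-level indexing constraint described above.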