add dev notes

brian khuu 2024-06-11 23:04:09 +10:00
parent 2587d2934b
commit 57985f88ca
3 changed files with 81 additions and 33 deletions

@ -28,7 +28,7 @@ This is not definitive, but is helpful when reading sourcecode or console output
## Tensor Encoding Scheme Mapping
| Scheme | `ggml_ftype` C enumeration name | `ggml_type` C enum name | Bits/Weight | Data Type | Block Configuration | Quantized Weight Formula | Initial Commits Or Pull Request Sources (of `ggml_type`) |
| -------- | ------------------------------- | ----------------------- | ----------- | ----------------------------- | ------------------------------------ | ------------------------ | -------------------------------------------------------- |
| BF16 | GGML_FTYPE_MOSTLY_BF16 | GGML_TYPE_BF16 | 16 | bfloat16 (truncated 32-bit IEEE 754) | Homogeneous Array Of Floating Weights | - | [llama.cpp PR: Introduce bfloat16 support #6412](https://github.com/ggerganov/llama.cpp/pull/6412) |
| F16 | GGML_FTYPE_MOSTLY_F16 | GGML_TYPE_F16 | 16 | 16-bit IEEE 754 | Homogeneous Array Of Floating Weights | - | [llama.cpp CM: Initial Release](https://github.com/ggerganov/llama.cpp/commit/26c084662903ddaca19bef982831bfb0856e8257) |
| F32 | GGML_FTYPE_ALL_F32 | GGML_TYPE_F32 | 32 | 32-bit IEEE 754 | Homogeneous Array Of Floating Weights | - | [llama.cpp CM: Initial Release](https://github.com/ggerganov/llama.cpp/commit/26c084662903ddaca19bef982831bfb0856e8257) |

@ -19,9 +19,10 @@ Useful information for users that doesn't fit into Readme.
These are information useful for Maintainers and Developers which does not fit into code comments
* [[Tensor Encoding Schemes]]
* [[Terminology]]
* [[PR And Issue Tickets Maintenance]]
* [[Dev Notes]]
# Github Actions Main Branch Status

dev-notes.md Normal file

@ -0,0 +1,47 @@
# Dev Notes
These are general free-form notes with pointers to good starting points for understanding the llama.cpp codebase.
(`@<symbol>` is a vscode jump-to-symbol query, included for your convenience. [There is also a feature request for vscode to support jumping to a file and symbol together](https://github.com/microsoft/vscode/issues/214870))
## Where are the definitions?
[GGUF file structure spec (WARN: As of 2024-06-11 the llama.cpp implementation is the canonical source for now)](https://github.com/ggerganov/ggml/blob/master/docs/gguf.md#file-structure)
All of the GGUF structures can be found in `gguf.c` unless stated otherwise.
| GGUF Structure Of Interest | gguf.c reference | vscode search line |
|----------------------------|---------------------------|---------------------|
| Overall File Structure | `struct gguf_context` | `@gguf_context` |
| File Header Structure | `struct gguf_header` | `@gguf_header` |
| Key Value Structure | `struct gguf_kv` | `@gguf_kv` |
| Tensor Info Structure | `struct gguf_tensor_info` | `@gguf_tensor_info` |
### Elements of Interest (Think of this as an index lookup reference)
Please use this as an index, not as a canonical reference.
The purpose of this table is to let you quickly locate the major elements of
the GGUF file standard.
| GGUF Elements Of Interest | c name | c type | gguf.c reference | vscode search line |
|-------------------------------------------------------|-------------------------|---------------------------|---------------------------|---------------------|
| Magic | magic | `uint8_t[4]` | `struct gguf_header` | `@gguf_header` |
| Version | version | `uint32_t` | `struct gguf_header` | `@gguf_header` |
| Tensor Count | n_tensors | `uint64_t` | `struct gguf_header` | `@gguf_header` |
| Key Value Count | n_kv | `uint64_t` | `struct gguf_header` | `@gguf_header` |
| Key Value Array                                       | kv                      | `gguf_kv *`               | `struct gguf_context`     | `@gguf_context`     |
| Tensor Info Array                                     | infos                   | `gguf_tensor_info *`      | `struct gguf_context`     | `@gguf_context`     |
| Key Value Entry - Key | gguf_kv.key | `gguf_str` | `struct gguf_kv` | `@gguf_kv` |
| Key Value Entry - Type | gguf_kv.type | `gguf_type` | `struct gguf_kv` | `@gguf_kv` |
| Key Value Entry - Value                               | gguf_kv.value           | `gguf_value`              | `struct gguf_kv`          | `@gguf_kv`          |
| Tensor Info Entry - Name | gguf_tensor_info.name | `gguf_str` | `struct gguf_tensor_info` | `@gguf_tensor_info` |
| Tensor Info Entry - Tensor shape dimension count | gguf_tensor_info.n_dim | `uint32_t` | `struct gguf_tensor_info` | `@gguf_tensor_info` |
| Tensor Info Entry - Tensor shape sizing array | gguf_tensor_info.ne | `uint64_t[GGML_MAX_DIMS]` | `struct gguf_tensor_info` | `@gguf_tensor_info` |
| Tensor Info Entry - Tensor Encoding Scheme / Strategy | gguf_tensor_info.type | `ggml_type` | `struct gguf_tensor_info` | `@gguf_tensor_info` |
| Tensor Info Entry - Offset from start of 'data' | gguf_tensor_info.offset | `uint64_t` | `struct gguf_tensor_info` | `@gguf_tensor_info` |
| Alignment | alignment | `size_t` | `struct gguf_context` | `@gguf_context` |
| Offset Of 'Data' From Beginning Of File | offset | `size_t` | `struct gguf_context` | `@gguf_context` |
| Size Of 'Data' In Bytes | size | `size_t` | `struct gguf_context` | `@gguf_context` |