Mirror of https://github.com/ggerganov/llama.cpp.git (synced 2024-11-22 16:27:58 +01:00)

Commit 57985f88ca (parent 2587d2934b): add dev notes
@@ -28,7 +28,7 @@ This is not definitive, but is helpful when reading source code or console output

## Tensor Encoding Scheme Mapping

| Scheme | `ggml_ftype` C enumeration name | `ggml_type` C enum name | Bits/Weight | Data Type | Block Configuration | Quantized Weight Formula | Initial Commits Or Pull Request Sources (of `ggml_type`) |
|--------|---------------------------------|-------------------------|-------------|-----------|---------------------|--------------------------|----------------------------------------------------------|
| BF16 | GGML_FTYPE_MOSTLY_BF16 | GGML_TYPE_BF16 | 16 | bfloat16 (trunc 32b IEEE754) | Homogeneous Array Of Floating Weights | - | [llama.cpp PR: Introduce bfloat16 support #6412](https://github.com/ggerganov/llama.cpp/pull/6412) |
| F16 | GGML_FTYPE_MOSTLY_F16 | GGML_TYPE_F16 | 16 | 16-bit IEEE 754 | Homogeneous Array Of Floating Weights | - | [llama.cpp CM: Initial Release](https://github.com/ggerganov/llama.cpp/commit/26c084662903ddaca19bef982831bfb0856e8257) |
| F32 | GGML_FTYPE_ALL_F32 | GGML_TYPE_F32 | 32 | 32-bit IEEE 754 | Homogeneous Array Of Floating Weights | - | [llama.cpp CM: Initial Release](https://github.com/ggerganov/llama.cpp/commit/26c084662903ddaca19bef982831bfb0856e8257) |
@@ -19,9 +19,10 @@ Useful information for users that doesn't fit into Readme.

This is information useful for Maintainers and Developers which does not fit into code comments

* [[Tensor Encoding Schemes]]
* [[Terminology]]
* [[PR And Issue Tickets Maintenance]]
* [[Dev Notes]]

# Github Actions Main Branch Status
dev-notes.md (new file, 47 lines)

@@ -0,0 +1,47 @@

# Dev Notes

These are general free-form notes with pointers to good jumping-off points for understanding the llama.cpp codebase.

(`@<symbol>` is a VS Code jump-to-symbol query, included for your convenience. [There is also a feature request for VS Code to support jumping to a file and symbol together](https://github.com/microsoft/vscode/issues/214870))

## Where are the definitions?

[GGUF file structure spec (WARN: As of 2024-06-11 the llama.cpp implementation is the canonical source for now)](https://github.com/ggerganov/ggml/blob/master/docs/gguf.md#file-structure)

All of the GGUF structures can be found in `gguf.c` unless stated otherwise.

| GGUF Structure Of Interest | gguf.c reference | vscode search line |
|----------------------------|---------------------------|---------------------|
| Overall File Structure | `struct gguf_context` | `@gguf_context` |
| File Header Structure | `struct gguf_header` | `@gguf_header` |
| Key Value Structure | `struct gguf_kv` | `@gguf_kv` |
| Tensor Info Structure | `struct gguf_tensor_info` | `@gguf_tensor_info` |
### Element Of Interest (think of this as an index lookup reference)

Please use this as an index, not as a canonical reference. The purpose of this table is to let you quickly locate the major elements of the GGUF file standard.

| GGUF Elements Of Interest | c name | c type | gguf.c reference | vscode search line |
|-------------------------------------------------------|-------------------------|---------------------------|---------------------------|---------------------|
| Magic | magic | `uint8_t[4]` | `struct gguf_header` | `@gguf_header` |
| Version | version | `uint32_t` | `struct gguf_header` | `@gguf_header` |
| Tensor Count | n_tensors | `uint64_t` | `struct gguf_header` | `@gguf_header` |
| Key Value Count | n_kv | `uint64_t` | `struct gguf_header` | `@gguf_header` |
| Key Value Linked List | kv | `gguf_kv *` | `struct gguf_context` | `@gguf_context` |
| Tensor Info Linked List | infos | `gguf_tensor_info *` | `struct gguf_context` | `@gguf_context` |
| Key Value Entry - Key | gguf_kv.key | `gguf_str` | `struct gguf_kv` | `@gguf_kv` |
| Key Value Entry - Type | gguf_kv.type | `gguf_type` | `struct gguf_kv` | `@gguf_kv` |
| Key Value Entry - Value | gguf_kv.value | `gguf_value` | `struct gguf_kv` | `@gguf_kv` |
| Tensor Info Entry - Name | gguf_tensor_info.name | `gguf_str` | `struct gguf_tensor_info` | `@gguf_tensor_info` |
| Tensor Info Entry - Tensor shape dimension count | gguf_tensor_info.n_dim | `uint32_t` | `struct gguf_tensor_info` | `@gguf_tensor_info` |
| Tensor Info Entry - Tensor shape sizing array | gguf_tensor_info.ne | `uint64_t[GGML_MAX_DIMS]` | `struct gguf_tensor_info` | `@gguf_tensor_info` |
| Tensor Info Entry - Tensor Encoding Scheme / Strategy | gguf_tensor_info.type | `ggml_type` | `struct gguf_tensor_info` | `@gguf_tensor_info` |
| Tensor Info Entry - Offset from start of 'data' | gguf_tensor_info.offset | `uint64_t` | `struct gguf_tensor_info` | `@gguf_tensor_info` |
| Alignment | alignment | `size_t` | `struct gguf_context` | `@gguf_context` |
| Offset Of 'Data' From Beginning Of File | offset | `size_t` | `struct gguf_context` | `@gguf_context` |
| Size Of 'Data' In Bytes | size | `size_t` | `struct gguf_context` | `@gguf_context` |