finetune : print sample-start/include-sample-start (#5072)

This commit prints the values of `--sample-start` and
`--include-sample-start` from the main function in finetune.cpp.

The motivation is that these options are set by the user on the command
line, so it is easy to forget one of them. Printing their values up
front makes a mistake visible immediately. Otherwise it is possible to
go through the whole training process before realizing that the values
are not what one expected.

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
Daniel Bevenius 2024-01-22 12:11:01 +01:00 committed by GitHub
parent 66d575c45c
commit 152d9d05e0
GPG Key ID: B5690EEEBB952194

@@ -1800,6 +1800,8 @@ int main(int argc, char ** argv) {
     std::vector<size_t> train_samples_begin;
     std::vector<size_t> train_samples_size;
     printf("%s: tokenize training data from %s\n", __func__, params.common.fn_train_data);
+    printf("%s: sample-start: %s\n", __func__, params.common.sample_start.c_str());
+    printf("%s: include-sample-start: %s\n", __func__, params.common.include_sample_start ? "true" : "false");
     tokenize_file(lctx,
             params.common.fn_train_data,
             params.common.sample_start,