llama-run : fix context size (#11094)

Set `n_ctx` equal to `n_batch` in the `Opt` class, so the context size
is now a more reasonable 2048.

Signed-off-by: Eric Curtin <ecurtin@redhat.com>
Eric Curtin 2025-01-06 22:45:28 +00:00 committed by GitHub
parent ecebbd292d
commit dc7cef9f37


@@ -83,6 +83,7 @@ class Opt {
 }
 ctx_params.n_batch = context_size >= 0 ? context_size : context_size_default;
+ctx_params.n_ctx = ctx_params.n_batch;
 model_params.n_gpu_layers = ngl >= 0 ? ngl : ngl_default;
 temperature = temperature >= 0 ? temperature : temperature_default;
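
The effect of the added line can be sketched in isolation. The snippet below is a minimal illustration, not the actual `llama-run` code: `CtxParams` and `resolve_context` are hypothetical names, and the default of 2048 is assumed from the commit message rather than read from the source file. It shows that a negative `context_size` falls back to the default and that `n_ctx` always follows `n_batch`.

```cpp
#include <cstdint>

// Hypothetical stand-in for the two context-parameter fields the diff touches.
struct CtxParams {
    uint32_t n_ctx   = 0;
    uint32_t n_batch = 0;
};

// Assumed default, taken from the "2048" mentioned in the commit message.
constexpr int context_size_default = 2048;

// Mirrors the patched assignments: a negative context_size means "not set",
// so the default is used; n_ctx is then copied from n_batch.
inline CtxParams resolve_context(int context_size) {
    CtxParams p;
    p.n_batch = static_cast<uint32_t>(context_size >= 0 ? context_size
                                                        : context_size_default);
    p.n_ctx = p.n_batch;  // corresponds to the line added by this commit
    return p;
}
```

Under these assumptions, `resolve_context(-1)` yields 2048 for both fields, while an explicit value such as `resolve_context(4096)` is applied to both the batch size and the context size.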