text-generation-webui/System-requirements.md at 9c066601f52ae79fbacfe91c920e7914d1c456a1


oobabooga 80ef7c7bcb

2023-04-22 02:34:13 -03:00

These are the VRAM and RAM requirements (in MiB) to run some examples of models in 16-bit (default) precision:

This mode allows you to load models that would not normally fit into your GPU. It is enabled by default for 13b and 20b models in this web UI.

| model        | VRAM (GPU, MiB) | RAM (MiB) |
|:-------------|----------------:|----------:|
| opt-13b      |         12528.1 |   1152.39 |
| gpt-neox-20b |         20384   |    2291.7 |
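
As a rough sanity check on figures like these, the memory taken by the weights alone can be estimated as parameter count times bytes per parameter. The sketch below is only a back-of-the-envelope estimate: the function name `estimate_weight_mib` is made up for illustration, the parameter counts are the nominal 13 and 20 billion suggested by the model names (not exact counts), and activations, caches, and framework overhead are ignored.

```python
MIB = 1024 ** 2  # bytes per MiB


def estimate_weight_mib(n_params: float, bits_per_param: int) -> float:
    """Approximate memory needed for the weights alone, in MiB."""
    return n_params * bits_per_param / 8 / MIB


# Nominal parameter counts read off the model names (an assumption,
# not the exact counts of the real checkpoints).
models = {"opt-13b": 13e9, "gpt-neox-20b": 20e9}

for name, n_params in models.items():
    for bits in (8, 16, 32):
        mib = estimate_weight_mib(n_params, bits)
        print(f"{name}: ~{mib:,.0f} MiB at {bits}-bit")
```

Under these assumptions, the estimates at one byte per parameter (about 12,400 and 19,100 MiB) land closest to the VRAM column above, while the 16-bit estimates are twice as large.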

Running on the CPU is a lot slower, but does not require a GPU.

On my i5-12400F, 6B models take around 10-20 seconds to respond in chat mode, and around 5 minutes to generate a 200-token completion.
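
To put the completion figure in more familiar terms, it can be converted to tokens per second. This is plain arithmetic on the numbers quoted above; the helper name `tokens_per_second` is made up for illustration, and "around 5 minutes" is treated as exactly 300 seconds.

```python
def tokens_per_second(n_tokens: int, seconds: float) -> float:
    """Average generation throughput over a whole completion."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return n_tokens / seconds


# 200 tokens in roughly 5 minutes, as quoted above.
rate = tokens_per_second(200, 5 * 60)
print(f"~{rate:.2f} tokens/s, ~{1 / rate:.1f} s per token")
```

That works out to roughly 0.67 tokens per second, or about 1.5 seconds per token.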