From 736f61610b5ec6ac4a8a6232fbc964787de9817f Mon Sep 17 00:00:00 2001 From: oobabooga <112222186+oobabooga@users.noreply.github.com> Date: Sat, 4 Mar 2023 01:33:52 -0300 Subject: [PATCH] Update README --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index c2e6b928..f6c03915 100644 --- a/README.md +++ b/README.md @@ -145,6 +145,7 @@ Optionally, you can use the following command-line flags: | `--flexgen` | Enable the use of FlexGen offloading. | | `--percent PERCENT [PERCENT ...]` | FlexGen: allocation percentages. Must be 6 numbers separated by spaces (default: 0, 100, 100, 0, 100, 0). | | `--compress-weight` | FlexGen: Whether to compress weight (default: False).| +| `--pin-weight [PIN_WEIGHT]` | FlexGen: whether to pin weights (setting this to False reduces CPU memory by 20%). | | `--deepspeed` | Enable the use of DeepSpeed ZeRO-3 for inference via the Transformers integration. | | `--nvme-offload-dir NVME_OFFLOAD_DIR` | DeepSpeed: Directory to use for ZeRO-3 NVME offloading. | | `--local_rank LOCAL_RANK` | DeepSpeed: Optional argument for distributed setups. |