CLI: Fix unsafe arg access of unused args
As described in Issue 932, the legacy implementation of the arg parser results in subcommands requiring the existence of CLI args that they don't actually use.
This PR fixes that by doing a safe getattr check instead of a raw attribute access.
As a side effect, it also lets us remove the conditional suppression of args: instead, we can simply omit an arg from a subcommand's parser when it isn't needed.
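As an illustration, here is a minimal sketch of the pattern (the arg names below are hypothetical stand-ins, not torchchat's actual fields):

```python
# Minimal sketch of the fix; "dso_path" stands in for any arg a subcommand may not define.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--checkpoint-path", default=None)
# Args a subcommand doesn't use can simply be omitted from its parser,
# rather than added and then conditionally suppressed.
args = parser.parse_args([])

# Legacy: raw attribute access raises AttributeError when the subcommand's
# parser never defined the arg.
#   dso_path = args.dso_path
# Fix: safe getattr with a default, so an undefined arg just falls back to None.
dso_path = getattr(args, "dso_path", None)
```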
It also happens to fix a --help bug: https://github.com/pytorch/torchchat/issues/976
chat
python torchchat.py chat --help
usage: torchchat chat [-h] [--checkpoint-path CHECKPOINT_PATH] [--compile] [--compile-prefill] [--dtype {fp32,fp16,bf16,float,half,float32,float16,bfloat16,fast,fast16}] [--quantize QUANTIZE] [--device {fast,cpu,cuda,mps}]
[--dso-path DSO_PATH | --pte-path PTE_PATH] [--max-new-tokens MAX_NEW_TOKENS] [--top-k TOP_K] [--temperature TEMPERATURE] [--hf-token HF_TOKEN] [--model-directory MODEL_DIRECTORY] [-v] [--seed SEED]
[model]
options:
-h, --help show this help message and exit
-v, --verbose Verbose output
--seed SEED Initialize torch seed
Model Specification:
(REQUIRED) Specify the base model. Args are mutually exclusive.
model Model name for well-known models
--checkpoint-path CHECKPOINT_PATH
Use the specified model checkpoint path
Model Configuration:
Specify model configurations
--compile Whether to compile the model with torch.compile
--compile-prefill Whether to compile the prefill. Improves prefill perf, but has higher compile times.
--dtype {fp32,fp16,bf16,float,half,float32,float16,bfloat16,fast,fast16}
Override the dtype of the model (default is the checkpoint dtype). Options: bf16, fp16, fp32, fast16, fast
--quantize QUANTIZE Quantization options. pass in as '{"<mode>" : {"<argname1>" : <argval1>, "<argname2>" : <argval2>,...},}' modes are: embedding, linear:int8, linear:int4, linear:a8w4dq, precision.
--device {fast,cpu,cuda,mps}
Hardware device to use. Options: cpu, cuda, mps
Exported Model Path:
Specify the path of the exported model files to ingest
--dso-path DSO_PATH Use the specified AOT Inductor .dso model file
--pte-path PTE_PATH Use the specified ExecuTorch .pte model file
Generation:
Configs for generating output based on provided prompt
--max-new-tokens MAX_NEW_TOKENS
Maximum number of new tokens
--top-k TOP_K Top-k for sampling
--temperature TEMPERATURE
Temperature for sampling
Model Downloading:
Specify args for model downloading (if model is not downloaded)
--hf-token HF_TOKEN A HuggingFace API token to use when downloading model artifacts
--model-directory MODEL_DIRECTORY
The directory to store downloaded model artifacts. Default: /home/jackkhuu/.torchchat/model-cache
generate
python torchchat.py generate --help
usage: torchchat generate [-h] [--checkpoint-path CHECKPOINT_PATH] [--compile] [--compile-prefill] [--dtype {fp32,fp16,bf16,float,half,float32,float16,bfloat16,fast,fast16}] [--quantize QUANTIZE] [--device {fast,cpu,cuda,mps}]
[--dso-path DSO_PATH | --pte-path PTE_PATH] [--prompt PROMPT] [--num-samples NUM_SAMPLES] [--max-new-tokens MAX_NEW_TOKENS] [--top-k TOP_K] [--temperature TEMPERATURE] [--hf-token HF_TOKEN]
[--model-directory MODEL_DIRECTORY] [-v] [--seed SEED]
[model]
options:
-h, --help show this help message and exit
-v, --verbose Verbose output
--seed SEED Initialize torch seed
Model Specification:
(REQUIRED) Specify the base model. Args are mutually exclusive.
model Model name for well-known models
--checkpoint-path CHECKPOINT_PATH
Use the specified model checkpoint path
Model Configuration:
Specify model configurations
--compile Whether to compile the model with torch.compile
--compile-prefill Whether to compile the prefill. Improves prefill perf, but has higher compile times.
--dtype {fp32,fp16,bf16,float,half,float32,float16,bfloat16,fast,fast16}
Override the dtype of the model (default is the checkpoint dtype). Options: bf16, fp16, fp32, fast16, fast
--quantize QUANTIZE Quantization options. pass in as '{"<mode>" : {"<argname1>" : <argval1>, "<argname2>" : <argval2>,...},}' modes are: embedding, linear:int8, linear:int4, linear:a8w4dq, precision.
--device {fast,cpu,cuda,mps}
Hardware device to use. Options: cpu, cuda, mps
Exported Model Path:
Specify the path of the exported model files to ingest
--dso-path DSO_PATH Use the specified AOT Inductor .dso model file
--pte-path PTE_PATH Use the specified ExecuTorch .pte model file
Generation:
Configs for generating output based on provided prompt
--prompt PROMPT Input prompt for manual output generation
--num-samples NUM_SAMPLES
Number of samples
--max-new-tokens MAX_NEW_TOKENS
Maximum number of new tokens
--top-k TOP_K Top-k for sampling
--temperature TEMPERATURE
Temperature for sampling
Model Downloading:
Specify args for model downloading (if model is not downloaded)
--hf-token HF_TOKEN A HuggingFace API token to use when downloading model artifacts
--model-directory MODEL_DIRECTORY
The directory to store downloaded model artifacts. Default: /home/jackkhuu/.torchchat/model-cache
export
python torchchat.py export --help
usage: torchchat export [-h] [--checkpoint-path CHECKPOINT_PATH] [--dtype {fp32,fp16,bf16,float,half,float32,float16,bfloat16,fast,fast16}] [--quantize QUANTIZE] [--device {fast,cpu,cuda,mps}]
[--output-pte-path OUTPUT_PTE_PATH | --output-dso-path OUTPUT_DSO_PATH] [--hf-token HF_TOKEN] [--model-directory MODEL_DIRECTORY] [-v] [--seed SEED]
[model]
options:
-h, --help show this help message and exit
-v, --verbose Verbose output
--seed SEED Initialize torch seed
Model Specification:
(REQUIRED) Specify the base model. Args are mutually exclusive.
model Model name for well-known models
--checkpoint-path CHECKPOINT_PATH
Use the specified model checkpoint path
Model Configuration:
Specify model configurations
--dtype {fp32,fp16,bf16,float,half,float32,float16,bfloat16,fast,fast16}
Override the dtype of the model (default is the checkpoint dtype). Options: bf16, fp16, fp32, fast16, fast
--quantize QUANTIZE Quantization options. pass in as '{"<mode>" : {"<argname1>" : <argval1>, "<argname2>" : <argval2>,...},}' modes are: embedding, linear:int8, linear:int4, linear:a8w4dq, precision.
--device {fast,cpu,cuda,mps}
Hardware device to use. Options: cpu, cuda, mps
Export Output Path:
Specify the output path for the exported model files
--output-pte-path OUTPUT_PTE_PATH
Output to the specified ExecuTorch .pte model file
--output-dso-path OUTPUT_DSO_PATH
Output to the specified AOT Inductor .dso model file
Model Downloading:
Specify args for model downloading (if model is not downloaded)
--hf-token HF_TOKEN A HuggingFace API token to use when downloading model artifacts
--model-directory MODEL_DIRECTORY
The directory to store downloaded model artifacts. Default: /home/jackkhuu/.torchchat/model-cache
eval
python torchchat.py eval --help
usage: torchchat eval [-h] [--checkpoint-path CHECKPOINT_PATH] [--compile] [--compile-prefill] [--dtype {fp32,fp16,bf16,float,half,float32,float16,bfloat16,fast,fast16}] [--quantize QUANTIZE] [--device {fast,cpu,cuda,mps}]
[--dso-path DSO_PATH | --pte-path PTE_PATH] [--tasks TASKS [TASKS ...]] [--limit LIMIT] [--max-seq-length MAX_SEQ_LENGTH] [--hf-token HF_TOKEN] [--model-directory MODEL_DIRECTORY] [-v] [--seed SEED]
[model]
options:
-h, --help show this help message and exit
-v, --verbose Verbose output
--seed SEED Initialize torch seed
Model Specification:
(REQUIRED) Specify the base model. Args are mutually exclusive.
model Model name for well-known models
--checkpoint-path CHECKPOINT_PATH
Use the specified model checkpoint path
Model Configuration:
Specify model configurations
--compile Whether to compile the model with torch.compile
--compile-prefill Whether to compile the prefill. Improves prefill perf, but has higher compile times.
--dtype {fp32,fp16,bf16,float,half,float32,float16,bfloat16,fast,fast16}
Override the dtype of the model (default is the checkpoint dtype). Options: bf16, fp16, fp32, fast16, fast
--quantize QUANTIZE Quantization options. pass in as '{"<mode>" : {"<argname1>" : <argval1>, "<argname2>" : <argval2>,...},}' modes are: embedding, linear:int8, linear:int4, linear:a8w4dq, precision.
--device {fast,cpu,cuda,mps}
Hardware device to use. Options: cpu, cuda, mps
Exported Model Path:
Specify the path of the exported model files to ingest
--dso-path DSO_PATH Use the specified AOT Inductor .dso model file
--pte-path PTE_PATH Use the specified ExecuTorch .pte model file
Evaluation:
Configs for evaluating model performance
--tasks TASKS [TASKS ...]
List of lm-eluther tasks to evaluate. Usage: --tasks task1 task2
--limit LIMIT Number of samples to evaluate
--max-seq-length MAX_SEQ_LENGTH
Maximum length sequence to evaluate
Model Downloading:
Specify args for model downloading (if model is not downloaded)
--hf-token HF_TOKEN A HuggingFace API token to use when downloading model artifacts
--model-directory MODEL_DIRECTORY
The directory to store downloaded model artifacts. Default: /home/jackkhuu/.torchchat/model-cache