llm-foundry
Converted PrefixLM HF snapshot must enable cache for generation in config
Environment
llmfoundry:latest
To reproduce
Steps to reproduce the behavior:
- Train a Prefix-LM
- Convert it to Hugging Face format via llm-foundry/scripts/inference/convert_composer_to_hf.py
- Try to generate text with the HF snapshot (a minimal repro is sketched below)

-> When generating, the model throws an exception saying that use_cache must be enabled in the HF config.
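
A minimal sketch of the failing step (the snapshot path is hypothetical):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

path = "./hf_snapshot"  # hypothetical path to the converted snapshot
tokenizer = AutoTokenizer.from_pretrained(path)
# trust_remote_code may be needed for the custom model code shipped with the snapshot
model = AutoModelForCausalLM.from_pretrained(path, trust_remote_code=True)

inputs = tokenizer("Hello, world!", return_tensors="pt")
# With use_cache set to false in the snapshot's config.json, this raises
# the exception described above
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))
```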
Expected behavior
Model generates output text. Manually editing the HF config and enabling the cache did the trick for me.
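
For reference, the manual workaround amounts to flipping use_cache to true in the snapshot's config.json, e.g. (path hypothetical):

```python
import json
import pathlib

cfg_path = pathlib.Path("./hf_snapshot/config.json")  # hypothetical snapshot location
cfg = json.loads(cfg_path.read_text())
cfg["use_cache"] = True  # the manual edit described above
cfg_path.write_text(json.dumps(cfg, indent=2))
```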
use_cache is something that you can specify at model load time as a kwarg, or at generation time as a kwarg. We can probably make some adjustments to make this more automatic. Thanks!
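
For example, either of these works without editing the config (path hypothetical):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

path = "./hf_snapshot"  # hypothetical
tokenizer = AutoTokenizer.from_pretrained(path)

# Option 1: override use_cache as a kwarg at model load time
model = AutoModelForCausalLM.from_pretrained(
    path, use_cache=True, trust_remote_code=True
)

# Option 2: pass use_cache as a kwarg at generation time
inputs = tokenizer("Hello", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20, use_cache=True)
```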
Thanks for clarifying.
The key argument for setting it to True in the config by default is that generating with use_cache=False results in an exception anyway.
I guess the easiest way would be to adjust the conversion script accordingly.
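
A sketch of what that adjustment could look like as a post-conversion step (the function name and directory argument are hypothetical, not the script's actual variables):

```python
from transformers import AutoConfig

def enable_cache(snapshot_dir: str) -> None:
    # Hypothetical addition to convert_composer_to_hf.py: since Prefix-LM
    # generation fails with use_cache=False anyway, force-enable the KV
    # cache in the exported config before (or after) writing the snapshot.
    config = AutoConfig.from_pretrained(snapshot_dir, trust_remote_code=True)
    config.use_cache = True
    config.save_pretrained(snapshot_dir)  # rewrites config.json in place
```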