llm-foundry
Converted PrefixLM HF snapshot must enable cache for generation in config
Environment
llmfoundry:latest
To reproduce
Steps to reproduce the behavior:
- Train a Prefix-LM
- Convert it to Hugging Face format via llm-foundry/scripts/inference/convert_composer_to_hf.py
- Try to generate text with the HF snapshot (a minimal repro is sketched below)

-> When generating, the model throws an exception saying that use_cache must be enabled in the HF config.
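
A minimal sketch of the failing step (the snapshot path is hypothetical):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

path = "./hf_snapshot"  # hypothetical path to the converted snapshot
tokenizer = AutoTokenizer.from_pretrained(path)
# trust_remote_code may be needed for the custom model code shipped with the snapshot
model = AutoModelForCausalLM.from_pretrained(path, trust_remote_code=True)

inputs = tokenizer("Hello, world!", return_tensors="pt")
# With use_cache set to false in the snapshot's config.json, this raises
# the exception described above
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))
```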
Expected behavior
Model generates output text. Manually editing the HF config and enabling the cache did the trick for me.
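
For reference, the manual workaround amounts to flipping use_cache to true in the snapshot's config.json, e.g. (path hypothetical):

```python
import json
import pathlib

cfg_path = pathlib.Path("./hf_snapshot/config.json")  # hypothetical snapshot location
cfg = json.loads(cfg_path.read_text())
cfg["use_cache"] = True  # the manual edit described above
cfg_path.write_text(json.dumps(cfg, indent=2))
```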
use_cache is something that you can specify at model load time as a kwarg, or at generation time as a kwarg. We can probably make some adjustments to make this more automatic. Thanks!
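
For example, either of these works without editing the config (path hypothetical):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

path = "./hf_snapshot"  # hypothetical
tokenizer = AutoTokenizer.from_pretrained(path)

# Option 1: override use_cache as a kwarg at model load time
model = AutoModelForCausalLM.from_pretrained(
    path, use_cache=True, trust_remote_code=True
)

# Option 2: pass use_cache as a kwarg at generation time
inputs = tokenizer("Hello", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20, use_cache=True)
```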
Thanks for clarifying.
The key argument for setting it to True in the config by default is that generating with use_cache=False results in an exception anyway.
I guess the easiest way would be to adjust the conversion script accordingly.
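
A sketch of what that adjustment could look like as a post-conversion step (the function name and directory argument are hypothetical, not the script's actual variables):

```python
from transformers import AutoConfig

def enable_cache(snapshot_dir: str) -> None:
    # Hypothetical addition to convert_composer_to_hf.py: since Prefix-LM
    # generation fails with use_cache=False anyway, force-enable the KV
    # cache in the exported config before (or after) writing the snapshot.
    config = AutoConfig.from_pretrained(snapshot_dir, trust_remote_code=True)
    config.use_cache = True
    config.save_pretrained(snapshot_dir)  # rewrites config.json in place
```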