
Converted PrefixLM HF snapshot must enable cache for generation in config

Open timsteuer opened this issue 1 year ago • 2 comments

Environment

  • llmfoundry:latest

To reproduce

Steps to reproduce the behavior:

  1. Train a Prefix-LM model
  2. Convert it to Hugging Face format via llm-foundry/scripts/inference/convert_composer_to_hf.py
  3. Try to generate text with the HF snapshot

-> When generating, the model raises an exception stating that use_cache must be enabled in the HF config.

Expected behavior

The model generates output text. Manually editing the HF config and enabling the cache did the trick for me.
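The manual workaround described above can be sketched as a small script. This is a hedged sketch: it assumes the converted snapshot directory contains a standard config.json, and the directory path passed to the function is whatever the conversion script wrote out.

```python
import json
from pathlib import Path

def enable_cache(snapshot_dir: str) -> None:
    """Set use_cache=True in an exported HF snapshot's config.json."""
    config_path = Path(snapshot_dir) / "config.json"
    cfg = json.loads(config_path.read_text())
    cfg["use_cache"] = True
    config_path.write_text(json.dumps(cfg, indent=2))
```

After running this against the snapshot directory, generation should no longer trip over the disabled cache.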

timsteuer avatar Dec 06 '23 08:12 timsteuer

use_cache is something that you can specify at model load time as a kwarg, or at generation time as a kwarg. We can probably make some adjustments to make this more automatic. Thanks!

dakinggg avatar Dec 06 '23 21:12 dakinggg
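As an illustrative stand-in (not the real transformers code), the override behavior described above amounts to: a use_cache kwarg passed at generation time wins over whatever value is persisted in the model's config.

```python
# Illustrative stand-in for how a generation-time kwarg can override
# the use_cache value stored in the snapshot's config.json.
def resolve_use_cache(model_config: dict, **generate_kwargs) -> bool:
    # An explicit generate() kwarg takes precedence; otherwise fall
    # back to the value persisted in the model config (the converted
    # snapshot persists False, which triggers the exception).
    if "use_cache" in generate_kwargs:
        return generate_kwargs["use_cache"]
    return model_config.get("use_cache", True)
```

So even without editing config.json, passing use_cache=True to generate (or at model load time) sidesteps the exception.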

Thanks for clarifying.

The key argument for having it set to true in the config by default is that generating with use_cache=False results in an exception anyway.

I guess the easiest way would be to adjust the conversion script accordingly.

timsteuer avatar Dec 07 '23 08:12 timsteuer