AutoCompressors
The usage of past_key_values in AutoCompressorMixin
Hello authors. There is no doubt that AutoCompressors is excellent work.
I have carefully read through the core code in auto_compressor.py and noticed that the class AutoCompressorMixin contains quite a bit of code for processing past_key_values (including the softprompt embedded in it). However, when I checked the intermediate values of past_key_values in forward() during training, they always seemed to be None.
That confuses me, and my question is whether this processing of past_key_values is redundant, kept only to align with the standard CausalLM interface, or whether it serves another purpose. If not, I'm curious in what situation past_key_values != None during a model forward pass.
I re-read and tested the code again, and found that when using model.generate() for inference, use_cache=True is chosen by default. In this case, model.generate() repeatedly passes past_key_values back into model.forward() while generating the output sequence token by token.
So far, my understanding is that past_key_values only serves inference with use_cache, and is not used during training or when simply extracting the softprompt via model(input_ids, output_softprompt=True).softprompt.
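To illustrate what I observed, here is the minimal probe I used. This is only a sketch: the import path, class name, and checkpoint path are assumptions/placeholders on my part, and the monkey-patched forward is just for printing.

```python
# Probe when forward() actually receives past_key_values.
# NOTE: the import path, class name, and checkpoint path below are
# assumptions/placeholders; adjust them to the actual repository layout.
from transformers import AutoTokenizer
from auto_compressor import AutoCompressorModel  # assumed class name

model = AutoCompressorModel.from_pretrained("path/to/autocompressor-checkpoint")
tokenizer = AutoTokenizer.from_pretrained("path/to/autocompressor-checkpoint")

original_forward = model.forward

def probing_forward(*args, **kwargs):
    # Report whether the incoming past_key_values is populated.
    pkv = kwargs.get("past_key_values")
    print("past_key_values is None:", pkv is None)
    return original_forward(*args, **kwargs)

model.forward = probing_forward

inputs = tokenizer("AutoCompressors compress long contexts.", return_tensors="pt")

# Training-style forward: past_key_values is never passed, so this prints True.
model(input_ids=inputs.input_ids, labels=inputs.input_ids)

# Generation with the default use_cache=True: after the first decoding step,
# generate() feeds the cache back into forward(), so later steps print False.
model.generate(inputs.input_ids, max_new_tokens=5)
```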
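The softprompt-extraction path I mean is the one-shot call below (continuing with the model and tokenizer from the snippet above). There is no incremental decoding here, so past_key_values never comes into play; the softprompt keyword argument in the last line is my assumption about how the vectors are fed back in.

```python
# One-shot compression: a single forward call, no cache, no past_key_values.
context = tokenizer("Some long context that should be compressed.", return_tensors="pt")
summary_vectors = model(context.input_ids, output_softprompt=True).softprompt
print(summary_vectors.shape)  # e.g. (batch, num_summary_vectors, hidden_size)

# The summary vectors can then be fed back in as a softprompt for a later call
# (keyword name assumed from auto_compressor.py).
prompt = tokenizer("A question about the compressed context:", return_tensors="pt")
outputs = model(prompt.input_ids, softprompt=summary_vectors)
```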
However, I would still appreciate it if the authors could review my understanding, in case I am misreading the design of the code.
Thanks.