Add/156 dynamic cache resize

Open theprashasst opened this issue 9 months ago • 0 comments

Fixes #156. This PR addresses the issue with the resize_cache method in the ChatSampler class, ensuring that the cache tensor is properly resized when the cache length is updated. The changes include:

Addition of resize_cache Function:

A new function resize_cache is introduced to handle resizing of the cache with the help of resize_tensor. This function ensures that the cache tensor is resized correctly, either by padding with zeros or truncating as needed.

Modification of resize_cache Method:

The resize_cache method is added along with the resize_tensor function for resizing the cache tensors. The method now initializes a new cache with the updated size and copies the resized cache data into it. A new SamplingState is created with the updated cache, with the other fields copied from the existing last_state, and last_state is then set to this new SamplingState. The cached sampler is invalidated so that it is recomputed with the new cache size.
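For reviewers, the flow described above can be sketched roughly as follows. This is not the code from the PR: it uses NumPy in place of jax.numpy (np.pad and slicing behave the same way in both), and the cache layout (a dict of layers, each holding arrays shaped batch x cache_length x features), the ToySamplingState and ToyChatSampler classes, the axis argument, and the _sampler attribute are all hypothetical stand-ins chosen for illustration.

```python
import dataclasses

import numpy as np


def resize_tensor(tensor, new_len, axis=1):
    """Pad with zeros or truncate `tensor` along `axis` to length `new_len`."""
    old_len = tensor.shape[axis]
    if new_len == old_len:
        return tensor
    if new_len < old_len:
        # Truncate: keep only the first `new_len` entries along `axis`.
        index = [slice(None)] * tensor.ndim
        index[axis] = slice(0, new_len)
        return tensor[tuple(index)]
    # Grow: append zeros at the end of `axis` (np.pad defaults to zero padding).
    pad_width = [(0, 0)] * tensor.ndim
    pad_width[axis] = (0, new_len - old_len)
    return np.pad(tensor, pad_width)


@dataclasses.dataclass(frozen=True)
class ToySamplingState:
    """Hypothetical stand-in for SamplingState; the real fields may differ."""
    cache: dict
    step: int


class ToyChatSampler:
    """Hypothetical stand-in for ChatSampler, reduced to the resize logic."""

    def __init__(self, cache_length):
        self.cache_length = cache_length
        self.last_state = None
        self._sampler = None  # assumed placeholder for the cached sampler

    def resize_cache(self, new_length):
        self.cache_length = new_length
        if self.last_state is not None:
            # Build a new cache whose tensors are resized to `new_length`.
            new_cache = {
                layer: {
                    name: resize_tensor(arr, new_length)
                    for name, arr in entries.items()
                }
                for layer, entries in self.last_state.cache.items()
            }
            # New state: updated cache, every other field copied unchanged.
            self.last_state = dataclasses.replace(
                self.last_state, cache=new_cache
            )
        # Invalidate the cached sampler so it is rebuilt for the new size.
        self._sampler = None
```

Returning a new state through dataclasses.replace, instead of mutating the old one, matches the description's wording that a new SamplingState is created and then assigned to last_state.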

theprashasst, Mar 16 '25 12:03