Add/156 dynamic cache resize
Fixes #156.

This PR addresses the issue with the resize_cache method in the ChatSampler class, ensuring that the cache tensor is properly resized when the cache length is updated. The changes include:
Addition of resize_cache Function:
A new function resize_cache is introduced to handle resizing of the cache with the help of resize_tensor. This function ensures that each cache tensor is resized correctly, either by padding with zeros or by truncating, as needed.
Modification of resize_cache Method:
The resize_cache method is added alongside the resize_tensor function, which resizes the cache tensors. The method now initializes a new cache with the updated size and copies the resized cache data into it. A new SamplingState is then created with the updated cache and the remaining fields copied from the existing last_state, and last_state is replaced with this new SamplingState. Finally, the cached sampler is invalidated so that it is recomputed with the new cache size.
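
For reviewers, the pad-or-truncate behaviour described above can be sketched roughly as follows. This is an illustrative sketch only, not the code in this PR: it uses NumPy instead of the project's array library, the SamplingState here is a hypothetical two-field stand-in, and the length_axis parameter and the dict-of-arrays cache layout are assumptions.

```python
import dataclasses

import numpy as np


def resize_tensor(tensor: np.ndarray, new_shape: tuple[int, ...]) -> np.ndarray:
    """Zero-pads or truncates `tensor` along each axis to match `new_shape`."""
    if tensor.ndim != len(new_shape):
        raise ValueError("new_shape must have the same rank as tensor")
    resized = np.zeros(new_shape, dtype=tensor.dtype)
    # Copy only the overlapping region: extra space stays zero (padding),
    # and data beyond the new bounds is dropped (truncation).
    overlap = tuple(
        slice(0, min(old, new)) for old, new in zip(tensor.shape, new_shape)
    )
    resized[overlap] = tensor[overlap]
    return resized


@dataclasses.dataclass(frozen=True)
class SamplingState:
    """Hypothetical stand-in; the real SamplingState has more fields."""

    cache: dict[str, np.ndarray]
    step: int


def resize_cache(
    state: SamplingState, new_length: int, length_axis: int = 1
) -> SamplingState:
    """Returns a new state whose cache tensors have `new_length` entries.

    `length_axis` (assumed here to be axis 1) is the cache-length dimension.
    """
    new_cache = {}
    for name, tensor in state.cache.items():
        shape = list(tensor.shape)
        shape[length_axis] = new_length
        new_cache[name] = resize_tensor(tensor, tuple(shape))
    # All other fields are copied unchanged from the existing state.
    return dataclasses.replace(state, cache=new_cache)


state = SamplingState(cache={"k": np.ones((1, 4, 2))}, step=3)
grown = resize_cache(state, 6)   # pads the length axis from 4 to 6
shrunk = resize_cache(state, 2)  # truncates the length axis from 4 to 2
```

The sketch returns a new state rather than mutating the old one, mirroring the description of replacing last_state. Invalidating the cached sampler is not modelled here.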