
Add getDirty to TensorCache avoiding clearing tensors when not needed

Open edwardcapriolo opened this issue 3 months ago • 1 comments

The global static TensorCache has a get method, which attempts to return a tensor of the same type and shape from the cache. Tensors are always cleared when they are released to the cache. However, there are many cases, maybe most of them, where we do not need to clear the tensor, as the next user will typically overwrite it entirely:

{noformat}
Response{responseText='The best thing to do is to look for the plant that best suits your needs. Avocados are a type of fruit that are grown in the Americas, specifically in Mexico, Central America, and South America. They are known for their creamy, buttery texture and rich, nutty flavor.', responseTextWithSpecialTokens='The best thing to do is to look for the plant that best suits your needs. Avocados are a type of fruit that are grown in the Americas, specifically in Mexico, Central America, and South America. They are known for their creamy, buttery texture and rich, nutty flavor.', finishReason=STOP_TOKEN, promptTokens=65, generatedTokens=64, promptTimeMs=10018, generateTimeMs=10779}
tensorcache.dirtyget 130
tensorcache.get 111609
tensorcache.get.hit 111582
tensorcache.getdirty.hit 126
{noformat}
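To make the idea concrete, here is a minimal sketch of a cache that offers both a clearing get and a non-clearing getDirty. It is not Jlama's TensorCache: the class name SimpleBufferCache, the use of plain float[] buffers keyed by length (standing in for tensors keyed by type and shape), and the choice to move the clear from release time to get time are all assumptions made for illustration. It is also single-threaded, unlike a real global cache.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified stand-in for a tensor cache; not Jlama's actual TensorCache. */
final class SimpleBufferCache {
    // Released buffers, pooled by length (a stand-in for "type and shape").
    private final Map<Integer, Deque<float[]>> pool = new HashMap<>();

    /** Returns a zero-filled buffer, reusing a pooled one when available. */
    float[] get(int length) {
        float[] buf = getDirty(length);
        Arrays.fill(buf, 0f); // pay for the clear only when the caller needs zeros
        return buf;
    }

    /** Returns a buffer with unspecified contents; the caller must overwrite every element. */
    float[] getDirty(int length) {
        Deque<float[]> free = pool.get(length);
        float[] buf = (free == null) ? null : free.pollFirst();
        return (buf != null) ? buf : new float[length];
    }

    /** Returns a buffer to the pool without clearing it (assumption: clearing is deferred to get). */
    void release(float[] buf) {
        pool.computeIfAbsent(buf.length, k -> new ArrayDeque<>()).addFirst(buf);
    }
}
```

A caller that fully overwrites its buffer, such as the output of a matrix multiply, could use getDirty and skip the fill, while a caller that accumulates into the buffer would still use get.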

I only put the method in place in a couple of places, as I don't have enough knowledge to put it in place everywhere:

https://github.com/edwardcapriolo/deliverance/pull/4

edwardcapriolo avatar Oct 19 '25 14:10 edwardcapriolo

@tjake I can't imagine the failed tests have anything to do with the PR.

edwardcapriolo avatar Oct 22 '25 11:10 edwardcapriolo