Chris

Results 22 comments of Chris

I think the tests might need some tweaking (otherwise I might have broken them with a few too many pushes :D) Will leave this as is for review, and will...

> Hi @FlimFlamm! Thanks for working on this.
>
> I had this partially implemented but never pushed it and I might have lost it because I cannot find it...

Pushed some additions and changes that seemed sensible or cleaner. Made a simple padding function for utils (it can do left and right padding), and set up the mask cache to...
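
For readers following along, here is a rough sketch of what a left/right padding helper could look like. It is not the code from the PR: the name `pad_batch`, the `pad_id` default, and the boolean mask it returns are all assumptions made for illustration, and it works on plain Python lists rather than tensors.

```python
def pad_batch(seqs, pad_id=0, side="right"):
    """Pad token-id lists to a common length (hypothetical helper, not the PR's).

    Returns (padded, mask) where mask is True for real tokens and
    False for padding positions.
    """
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    max_len = max(len(s) for s in seqs)
    padded, mask = [], []
    for s in seqs:
        n_pad = max_len - len(s)
        pad = [pad_id] * n_pad
        pad_mask = [False] * n_pad
        real_mask = [True] * len(s)
        if side == "left":
            # Padding goes before the prompt, so the last position is always a real token.
            padded.append(pad + list(s))
            mask.append(pad_mask + real_mask)
        else:
            # Padding goes after the prompt.
            padded.append(list(s) + pad)
            mask.append(real_mask + pad_mask)
    return padded, mask


if __name__ == "__main__":
    print(pad_batch([[1, 2, 3], [4]], side="left"))
    print(pad_batch([[1, 2, 3], [4]], side="right"))
```

Left padding keeps the final position of every row aligned on a real token, which is usually what batched generation wants; right padding keeps the start of every row aligned instead.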

> Thanks for working on this @FlimFlamm, I was working on this functionality on my fork as well but the kv cache issue is a tricky one.
>
> I...

> I just did the following:
>
> ```shell
> python scripts/download.py --repo_id 'TinyLlama/TinyLlama-1.1B-Chat-v1.0' --from_safetensors 1
> python scripts/convert_hf_checkpoint.py --checkpoint_dir 'checkpoints/TinyLlama/TinyLlama-1.1B-Chat-v1.0'
> python generate/base.py --checkpoint_dir 'checkpoints/TinyLlama/TinyLlama-1.1B-Chat-v1.0'
> ```
>
> ...

Applying the [large-negative number fix](https://github.com/pytorch/pytorch/issues/103749) seems to have done the trick; left and right padding are now equivalent in terms of output for the TinyLlama chat model.
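
A small illustration of why a fix like this can matter, as I understand the idea. With left padding, some rows of the attention mask can end up fully masked. If masked scores are filled with `-inf`, the softmax over such a row turns into NaN, while a large finite negative fill keeps it well defined. The sketch below uses plain Python rather than PyTorch; `masked_softmax` and the `-1e30` constant are made up for the example and are not taken from the PR or the linked issue.

```python
import math

LARGE_NEGATIVE = -1e30  # assumed stand-in for "a large negative number"


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def masked_softmax(scores, keep, fill):
    """Replace scores where keep is False with `fill`, then apply softmax."""
    masked = [s if k else fill for s, k in zip(scores, keep)]
    return softmax(masked)


if __name__ == "__main__":
    scores = [0.5, 1.0, 2.0]
    all_masked = [False, False, False]  # e.g. a row that only sees padding

    # -inf fill: (-inf) - (-inf) is NaN, so the whole row becomes NaN.
    print(masked_softmax(scores, all_masked, float("-inf")))

    # Large finite fill: every entry is equal, so the row is uniform.
    print(masked_softmax(scores, all_masked, LARGE_NEGATIVE))

    # For a partially masked row both fills give the same result.
    print(masked_softmax([1.0, 2.0], [False, True], float("-inf")))
    print(masked_softmax([1.0, 2.0], [False, True], LARGE_NEGATIVE))
```

The fully masked rows belong to padding positions, so their values are discarded anyway; the point is only that they must not inject NaN into the rest of the computation.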

> Running benchmarks on TinyLlama using the original generate code vs your batch implementation yields (almost) identical scores now. Well done!
>
> **Update** Spoke too soon.. when batching 10...

@WilliamGazeley Thanks for the effort on this!

> What are you getting on your end, is doing 10 prompts in a batch the same as the same 10 prompts one...

> About correctness: did you get the metadata from the same source as the index? As far as I can tell, they're the correct ones. Following the [H-14 guide](https://github.com/rom1504/clip-retrieval/blob/main/docs/laion5B_h14_back.md) [This...