Chris
I think the tests might need some tweaking (otherwise I might have broken them with a few too many pushes :D) Will leave this as is for review, and will...
> Hi @FlimFlamm! Thanks for working on this. > > I had this partially implemented but never pushed it and I might have lost it because I cannot find it...
Pushed some additions and changes that seemed sensical or cleaner. Made a simple padding function for utils (can do left and right padding), and set up the mask cache to...
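For readers following along, here is a minimal sketch of what a left/right padding helper could look like. It is not the function from this PR: `pad_batch`, `pad_id`, and `side` are hypothetical names, and it works on plain Python lists rather than tensors to stay self-contained.

```python
def pad_batch(sequences, pad_id=0, side="right"):
    """Pad lists of token ids to a common length.

    Hypothetical helper for illustration only. Returns the padded
    sequences and a 0/1 mask where 1 marks a real token.
    """
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")

    max_len = max(len(seq) for seq in sequences)
    padded, mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        pad_tokens = [pad_id] * n_pad
        pad_mask = [0] * n_pad
        real_mask = [1] * len(seq)
        if side == "left":
            # Padding goes in front, so real tokens end at the same position.
            padded.append(pad_tokens + list(seq))
            mask.append(pad_mask + real_mask)
        else:
            # Padding goes at the end, so real tokens start at position 0.
            padded.append(list(seq) + pad_tokens)
            mask.append(real_mask + pad_mask)
    return padded, mask


left_padded, left_mask = pad_batch([[5, 6, 7], [8]], pad_id=0, side="left")
right_padded, right_mask = pad_batch([[5, 6, 7], [8]], pad_id=0, side="right")
```

With left padding the last position of every row is a real token, which is usually what you want when sampling the next token for a whole batch at once.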
> Thanks for working on this @FlimFlamm, I was working on this functionality on my fork as well but the kv cache issue is a tricky one. > > I...
> I just did the following:
>
> ```shell
> python scripts/download.py --repo_id 'TinyLlama/TinyLlama-1.1B-Chat-v1.0' --from_safetensors 1
> python scripts/convert_hf_checkpoint.py --checkpoint_dir 'checkpoints/TinyLlama/TinyLlama-1.1B-Chat-v1.0'
> python generate/base.py --checkpoint_dir 'checkpoints/TinyLlama/TinyLlama-1.1B-Chat-v1.0'
> ```
>
> ...
Applying the [large-negative number fix](https://github.com/pytorch/pytorch/issues/103749) seems to have done the trick; left and right padding now produce equivalent output for the TinyLlama chat model.
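A rough illustration of why swapping `-inf` for a large finite negative value in the mask can matter. This is a pure-Python sketch, not the code from this PR, and it assumes the problem shows up on rows where every position is masked (which can happen for padding positions); `softmax`, `apply_mask`, and `LARGE_NEG` are names made up for the example.

```python
import math

# Assumption: any large finite negative value works as the fill;
# the value actually used in the PR is not shown in this thread.
LARGE_NEG = -1e30


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def apply_mask(scores, keep, fill):
    """Replace masked-out scores (keep == False) with `fill`."""
    return [s if k else fill for s, k in zip(scores, keep)]


scores = [0.5, 1.5]

# Fully masked row with -inf: (-inf) - (-inf) is NaN, so every weight is NaN.
with_inf = softmax(apply_mask(scores, [False, False], float("-inf")))

# Fully masked row with a finite fill: the weights are finite (uniform here).
with_finite = softmax(apply_mask(scores, [False, False], LARGE_NEG))

# Partially masked row: the finite fill still drives the masked weight to 0.
partial = softmax(apply_mask(scores, [False, True], LARGE_NEG))
```

The fully masked row still carries no useful information either way; the point is only that finite weights do not spread NaNs into the rest of the batch.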
> Running benchmarks on TinyLlama using the original generate code vs your batch implementation yields (almost) identical scores now. Well done! > > **Update** Spoke too soon.. when batching 10...
@WilliamGazeley Thanks for the effort on this! > What are you getting on your end, is doing 10 prompts in a batch the same as the same 10 prompts one...
> About correctness: did you get the metadata from the same source as the index? As far as I can tell, they're the correct ones. Following the [H-14 guide](https://github.com/rom1504/clip-retrieval/blob/main/docs/laion5B_h14_back.md) [This...