Ma Xinyin
Hi @l-dawei, could you check whether [the new implementation](https://github.com/horseee/DeepCache#usage) works?
Hi @allowdoc, please follow the [requirements](https://github.com/horseee/DeepCache#requirements) to install diffusers and transformers. The scripts (stable_diffusion.py or stable_diffusion_xl.py) and the code snippets in [#usage](https://github.com/horseee/DeepCache#usage) will then download the Stable Diffusion models automatically.
Hi, thanks for your interest in our paper. Not currently: we tested it, and the code has some bugs. We are currently working on the support of LLM-Pruner...
Hi, the code supports GQA now. Here are the [command](https://github.com/horseee/LLM-Pruner?tab=readme-ov-file#llama-llama3llama31-pruning) and some of the [results](https://github.com/horseee/LLM-Pruner/tree/main/more_results).
> What are the memory requirements to run the code, I face OOM on 40GB A100, if I set my device to `cuda` instead of `cpu` , which is extremely...
Hi, We have not yet conducted testing on Qwen for the current codebase. We will take the time to try it.
Hi, yes. Cosine similarity is used in the paper.
> one more question here, is the similarity calculation based on flattened latent or other ways?

Yes. The features are flattened, and then the similarity is computed.
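
For readers who want to see what "flatten, then compare" looks like, here is a minimal sketch of cosine similarity over flattened features. It is not the code used in the paper: the helper names `flatten` and `cosine_similarity` and the `eps` guard are assumptions made for illustration, and plain Python lists stand in for feature tensors to keep the example dependency-free.

```python
import math


def flatten(x):
    """Recursively flatten a nested list/tuple of numbers into a flat list."""
    if isinstance(x, (list, tuple)):
        out = []
        for item in x:
            out.extend(flatten(item))
        return out
    return [float(x)]


def cosine_similarity(a, b, eps=1e-12):
    """Cosine similarity between two feature maps after flattening.

    `eps` (an assumption of this sketch) avoids division by zero
    when one of the inputs is all zeros.
    """
    fa, fb = flatten(a), flatten(b)
    if len(fa) != len(fb):
        raise ValueError("feature maps must have the same number of elements")
    dot = sum(p * q for p, q in zip(fa, fb))
    norm_a = math.sqrt(sum(p * p for p in fa))
    norm_b = math.sqrt(sum(q * q for q in fb))
    return dot / max(norm_a * norm_b, eps)


# Two toy 2x2 "feature maps" that differ only by a scale factor.
feat_t = [[1.0, 0.0], [0.0, 1.0]]
feat_t_next = [[2.0, 0.0], [0.0, 2.0]]
similarity = cosine_similarity(feat_t, feat_t_next)
```

Because cosine similarity ignores magnitude, the two toy maps above score close to 1 even though their values differ. With real tensors, the same computation would typically be done by reshaping each feature map to a vector before comparing.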