Ma Xinyin comments

Results: 58 comments by Ma Xinyin

It does not work when using the base model in safetensors format.

Hi @l-dawei, Could you check if [the new implementation](https://github.com/horseee/DeepCache#usage) works?

How to actually use it on a local PC

Hi @allowdoc, Please follow the [requirements](https://github.com/horseee/DeepCache#requirements) to install diffusers and transformers. The scripts (stable_diffusion.py or stable_diffusion_xl.py) and the code snippets in [#usage](https://github.com/horseee/DeepCache#usage) will then download the Stable Diffusion models automatically.

Adaptation of GQA

Hi, Thanks for your interest in our paper. Not currently: we tested it and found some bugs in the code. We are currently working on the support of LLM-Pruner...

Adaptation of GQA

Hi, The code supports GQA now. Here is the [command](https://github.com/horseee/LLM-Pruner?tab=readme-ov-file#llama-llama3llama31-pruning) and some of the [results](https://github.com/horseee/LLM-Pruner/tree/main/more_results)

Adaptation of GQA

> What are the memory requirements to run the code, I face OOM on 40GB A100, if I set my device to `cuda` instead of `cpu` , which is extremely...

I would like to ask if the current version is suitable for qwen.

Hi, We have not yet conducted testing on Qwen for the current codebase. We will take the time to try it.

Similarity Metric used in the paper

Hi, Yes. Cosine similarity is used in the paper.

Similarity Metric used in the paper

> one more question here, is the similarity calculation based on flattened latent or other ways?

Yes. The features are flattened and then the similarity is computed.
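The answer above (flatten the features, then compute cosine similarity) can be illustrated with a small sketch. This is not code from the DeepCache repository: the helper names `flatten` and `cosine_similarity`, the `eps` guard, and the use of nested Python lists as a stand-in for feature tensors are all assumptions made for illustration.

```python
import math


def flatten(features):
    """Recursively flatten nested lists of numbers into one flat list.

    Hypothetical helper: nested lists stand in for a feature tensor.
    """
    if isinstance(features, (int, float)):
        return [float(features)]
    flat = []
    for item in features:
        flat.extend(flatten(item))
    return flat


def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity between two feature maps after flattening both.

    `eps` (an assumption) avoids division by zero for all-zero features.
    """
    x, y = flatten(a), flatten(b)
    if len(x) != len(y):
        raise ValueError("feature maps must have the same number of elements")
    dot = sum(p * q for p, q in zip(x, y))
    norm_x = math.sqrt(sum(p * p for p in x))
    norm_y = math.sqrt(sum(q * q for q in y))
    return dot / max(norm_x * norm_y, eps)


# Two toy "feature maps" of shape (1, 2, 2); the second is a scaled copy
# of the first, so they point in the same direction once flattened.
feat_t = [[[1.0, 2.0], [3.0, 4.0]]]
feat_s = [[[2.0, 4.0], [6.0, 8.0]]]
similarity = cosine_similarity(feat_t, feat_s)
```

With real tensors the same idea is usually written as a reshape to one dimension followed by a library cosine-similarity call; the pure-Python version here only keeps the sketch dependency-free.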