sgsdxzy comments

Results 117 comments of


                                            sgsdxzy

cuDNN benchmark for minor speed boost?

Here's my results on 3080Ti, fig size 512x512, eular a 30 steps, batch size=8: | settings | it/s | | --- | --- | | default | 2.10 | |...

Errors when using the grammar feature (llamacpp_hf)

I met similar problems. I think this is probably caused by tokenizer config adding extra tokens and not handled correctly.

Automatic model parallel inference by deepspeed

@oobabooga Update: It seems I have to load the whole model for every process and let it chunk (before I split load the model to multiple GPUs so each process...

Automatic model parallel inference by deepspeed

@oobabooga Here are some mixed news, and still very interesting: First, I managed to get GPT-Neo and OPT to work. In fact the kernel support list includes most types textui...

Automatic model parallel inference by deepspeed

@huangjiaheng 你可以用中文，我看得懂 It seems your translation software is cutting off sentences. If you struggle with English you can use Chinese.

Automatic model parallel inference by deepspeed

Update: I can get split loading to work according to example https://github.com/huggingface/transformers-bloom-inference/blob/e970be1027afc43c147d06153635f4285c517081/bloom-inference-scripts/bloom-ds-inference.py but int8 and llama is still not working yet

sgsdxzy

cuDNN benchmark for minor speed boost?

Errors when using the grammar feature (llamacpp_hf)

Automatic model parallel inference by deepspeed

Automatic model parallel inference by deepspeed

Automatic model parallel inference by deepspeed

Automatic model parallel inference by deepspeed

Automatic model parallel inference by deepspeed

Add support for the latest GPTQ models with group-size

Unable to load lora for large models

Unable to load lora for large models