z_

Results 15 comments of z_

I took the following procedures to make the webui work offline. 1. download and put taming-transformers, stable-diffusion, k-diffusion, GFPGAN, CodeFormer, CLIP, BLIP github repos to ./repositories. 2. download openai/clip-vit-large-patch14 from...

Why is the amount of communication between nodes M/N? After all, each node needs to get the parameters on all other nodes, which looks like M * (N - 1)...

I tried to generate a universal checkpoint for bloomz-7b on 8 x a100 40G using the following method. 1. using deepspeed_to_deepspeed.py, convert the data parallel of the checkpoint to 8....

I solved this by adding model[0].optimizer.refresh_fp32_params() right before this line: https://github.com/bigscience-workshop/Megatron-DeepSpeed/blob/e52bdabbde3c6895aceb76c1bced295c2646121f/megatron/checkpointing.py#L285

I met the same problem with bloomz model https://huggingface.co/bigscience/bloomz-mt on ghcr.io/huggingface/text-generation-inference:0.6

请问为什么要把mask_feat_head替换成mask_head呢?

dataset是一个内部数据集。之后我看看能不能找一个开源的替代下。att.apply我这没问题,可以粘贴下具体错误。

> > To be compatible with OpenAI, you can also use https://api.deepseek.com/v1 as the base_url. But note that the v1 here has NO relationship with the model's version. > >...

![image](https://github.com/anc95/writely/assets/6761483/e012cc57-26c6-4557-abc8-212f5685ca59) v1/chat和chat的返回是一样的,似乎都是因为请求传的不是chat接口需要的messages,而是prompt。