disperaller
> How to manually deploy the model to the GPU? I think you need to modify the code to move everything to the GPU.
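A minimal sketch of what "move everything to the GPU" usually means in PyTorch (the model and tensor names here are illustrative, not from the repo):

```python
import torch

# Pick the GPU if one is available, otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Moving the model moves all of its parameters and buffers to that device.
model = torch.nn.Linear(4, 2).to(device)

# Inputs must be moved too; a CPU tensor fed to a GPU model raises an error.
inputs = torch.randn(3, 4).to(device)

outputs = model(inputs)  # the forward pass now runs on the chosen device
```

The common pitfall is moving only the model (or only the inputs); both must end up on the same device.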
If this is how LLaMA 3 was pretrained, then in the SFT process, should we include these special tokens (, , etc.), i.e. unmask them in the attention_mask?
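For what it's worth, the usual SFT convention (a sketch with made-up token ids, not the actual LLaMA 3 vocabulary) is to keep special tokens *visible* in `attention_mask` and instead control the loss through `labels`, where `-100` is ignored by the cross-entropy loss:

```python
# Illustrative token ids: these are placeholders, not real LLaMA 3 ids.
pad_id, bos_id, eot_id = 0, 1, 2
input_ids = [bos_id, 11, 12, 13, eot_id, pad_id, pad_id]

# Attend to every real token, special tokens included; mask only padding.
attention_mask = [0 if t == pad_id else 1 for t in input_ids]

# Supervise content tokens (and typically the end-of-turn token, so the
# model learns to stop); ignore BOS and padding with the -100 label.
labels = [t if t not in (pad_id, bos_id) else -100 for t in input_ids]
```

So the question of "unmasking" is usually not about `attention_mask` at all; special tokens normally stay attended-to, and only the loss mask differs between setups.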
This issue is addressed by https://github.com/thunlp/OpenNRE/issues/312; however, when I tried installing transformers==3.4.0, a problem with Rust came up saying no Rust compiler was found. So, after a little searching, the following...
> @Arian-Akbari could you post some example output? I'm not too familiar with modelfusion, but I'm taking a look at the repo. > > One thing that may also help is...
Indeed, I ran into the same issue when running `from elmoformanylangs import Embedder`. Could someone help with this?
> [qwen-v1.5-14b-hf/LongBench_vcsum, qwen-v1.5-14b-hf/LongBench_narrativeqa, qwen-v1.5-14b-hf/LongBench_multifieldqa_zh, qwen-v1.5-14b-hf/LongBench_lsht, qwen-v1.5-14b-hf/LongBench_dureader, qwen-v1.5-14b-hf/LongBench_passage_retrieval_zh] Given this task information, it seems that you use the partitioner to allocate 4 tasks across 4 GPUs, so if you want to use 4 GPUs only for one...
When using vLLM, it keeps reporting a timeout and I don't know why. The upper part is the model settings and the lower part is the error. What is going on?
Same here with ZeRO-2: the -1 gets passed on to the next computation step, which expects a float tensor, and that causes the error.
> Same here with ZeRO-2: the -1 gets passed on to the next computation step, which expects a float tensor, and that causes the error. Setting overlap_comm to False resolved the issue for me.
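For anyone else hitting this, `overlap_comm` is a standard DeepSpeed ZeRO option; in a ZeRO-2 JSON config the relevant fragment would look like this (other fields omitted):

```json
{
  "zero_optimization": {
    "stage": 2,
    "overlap_comm": false
  }
}
```

Disabling it trades some communication/computation overlap for stability, so expect a possible slowdown.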
> This is a weird issue. I'll try it myself very soon. Could you please share your training log? Is the loss for more than 1 epoch normal? Hi, I...