633WHU
> I would suggest using vertex embedding alone, or concatenating the two embeddings.

@KiddoZhu what is the difference between vertex embedding and context embedding? Is it about first order and...
> > One vector is called vertex representation, the other is called context representation. This is mainly borrowed from the [distributional hypothesis](https://en.wikipedia.org/wiki/Distributional_semantics#Distributional_hypothesis) in NLP.
> >
> > Empirically it works better...
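If it helps, here is a minimal sketch of the two options suggested above (using the vertex embeddings alone, or concatenating vertex and context embeddings). The `.npy` file names are only placeholders for however you export the trained matrices:

```python
import numpy as np

# Placeholder files: one row per node, exported from the trained model.
vertex_emb = np.load("vertex_embeddings.npy")    # shape (num_nodes, dim)
context_emb = np.load("context_embeddings.npy")  # shape (num_nodes, dim)

# Option 1: use the vertex embeddings alone.
features = vertex_emb

# Option 2: concatenate the two embeddings along the feature axis.
features = np.concatenate([vertex_emb, context_emb], axis=1)  # (num_nodes, 2 * dim)
```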
> I tried to replicate this experiment, and here are the results that I've observed:
>
> | nthreads | time |
> | --- | --- |
> | 1 | 15.56 |
> | 2 | 9.44 |
> | 3 | 6.43 |
> | 4 | ... |
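For reference, a small sketch that turns the timings above into speedup and parallel-efficiency numbers (the 4-thread value is omitted since it is cut off above):

```python
# Speedup and parallel efficiency computed from the reported timings.
timings = {1: 15.56, 2: 9.44, 3: 6.43}  # nthreads -> time
base = timings[1]
for nthreads, t in sorted(timings.items()):
    speedup = base / t
    efficiency = speedup / nthreads
    print(f"{nthreads} threads: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```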
After the conversion, in CSR, the node ids are from 1 to 10312? To get the original node ids, do we need to sort the original node ids? For example, if original node...
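In case it helps, here is a sketch of the mapping I am assuming (I am not sure this is what the converter actually does): contiguous CSR ids are assigned to the sorted unique original ids, so sorting recovers the correspondence. The example ids are made up:

```python
import numpy as np

# Made-up original node ids, e.g. as they appear in the input edge list.
original_ids = np.array([7, 3, 42, 3, 7, 9])

# Assumption: the converter assigns contiguous CSR ids to the *sorted*
# unique original ids. np.unique returns them sorted, so unique_ids[k]
# is the original id of CSR node k (add 1 if CSR ids start from 1).
unique_ids, csr_ids = np.unique(original_ids, return_inverse=True)

csr_node = 2
print("CSR node", csr_node, "-> original id", unique_ids[csr_node])
```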
@byshiue do we have a timeline for m2m100?
> this issue makes vllm impossible for production use

At present, we have found a workaround: set the swap space directly to 0. This way, we never use the CPU swap space and it no longer reports any...
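For anyone who wants to try the same workaround, a minimal sketch with the offline `LLM` API is below (the model name is just a placeholder); the server should expose the same setting via `--swap-space 0`:

```python
from vllm import LLM, SamplingParams

# swap_space=0 disables the CPU swap space (GiB per GPU), so the engine
# never tries to swap blocks to CPU memory -- the workaround described above.
llm = LLM(model="facebook/opt-125m", swap_space=0)

outputs = llm.generate(["Hello, my name is"], SamplingParams(max_tokens=32))
for out in outputs:
    print(out.outputs[0].text)
```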
I met the same problem, and I think it is caused by the normalization: after normalization, the name becomes different. I think there are some bugs in normalize.cpp; we can skip...
@guanqun-yang @young-geng did you use zero few-shot examples? I used the default of 0 few-shot, and the results are almost the same.