Yong Shan comments

Results: 19 comments of Yong Shan

[BUG] cpu_adam warning

Hi, I have verified that the system-installed CUDA version matches the CUDA version torch was built with; both are cu118. I haven't set `DS_SKIP_CUDA_CHECK`.

[BUG] cpu_adam warning

Sorry for the late reply. There are no other CUDA installations, and the function you referenced prints nothing. I still don't know what causes this warning. However, my code runs successfully.

Add attention mask support to blocksparse

@blefaudeux Hi, how can I implement blocksparse attention that supports an attention mask (i.e. shape SxS)? I want to implement sparse attention with a specific layout. However, current blocksparse attention only use...

Multiwoz evaluation

The same question. @TonyNemo

How to reproduce the results on multiwoz2.0 reported in your paper using the provided checkpoint?

@TonyNemo

The context used in evaluation

The same question. @TonyNemo

Question about the knowledge base?

Hi, I ran into the same problem. Did you find an answer? @Helicqin

Multi-gpu training example?

@bliu3650 Can you share the command you used?

Any plan for Lion optimizer support?

Does anyone want to implement Lion for apex?