LLaDA
Does LLaDA support batch inference now?
Should this “1” be batch_size?
Apologies, we do not provide batch-inference code at the moment. If you want to run batch inference, you will need to adapt the attention-mask design yourself.
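For anyone attempting this themselves, one possible starting point is sketched below: left-pad variable-length prompts to a common length, append `gen_length` mask tokens per sequence, and build a matching attention mask so padded positions are ignored. This is only an illustration of the general idea, not code from the LLaDA repo; `PAD_ID` is a placeholder, and `MASK_ID` should be taken from your actual tokenizer/config rather than hard-coded.

```python
import torch

# Hypothetical constants -- verify both against your LLaDA checkpoint's
# tokenizer/config before using. These are NOT guaranteed to be correct.
MASK_ID = 126336  # mask token id used in the public LLaDA repo (verify)
PAD_ID = 0        # placeholder pad token id

def build_batched_input(prompts, gen_length):
    """Left-pad a list of 1-D prompt tensors to a common length, append
    gen_length mask tokens to each, and return (input_ids, attention_mask)."""
    batch_size = len(prompts)
    max_prompt_len = max(p.shape[0] for p in prompts)
    total_len = max_prompt_len + gen_length

    # Start with everything set to the mask token; the generation region
    # (last gen_length positions) stays masked for the diffusion process.
    x = torch.full((batch_size, total_len), MASK_ID, dtype=torch.long)
    attn_mask = torch.ones((batch_size, total_len), dtype=torch.long)

    for i, p in enumerate(prompts):
        pad = max_prompt_len - p.shape[0]
        if pad > 0:
            x[i, :pad] = PAD_ID
            attn_mask[i, :pad] = 0  # padded positions should be ignored
        x[i, pad:max_prompt_len] = p
    return x, attn_mask
```

The batched `x` would then replace the single-sequence tensor currently built with a leading dimension of 1, and `attn_mask` would be passed through to the model's forward call so attention does not attend to padding.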
Quick follow-up on this: why should the 1 not be batch_size? If you have suggestions on how to update this, we'd be happy to debug the issue ourselves and propose a fix.