LLaDA
Some questions about data processing
Thank you for your excellent work.
I have some questions about data processing and look forward to your response.
- Do you use the same data processing scheme as SMDM, that is, packing and padding operations?
- With this scheme, are the packed samples visible to each other? The SMDM data processing does not appear to apply an attention mask.
Thank you for your attention!
- Yes, we use the same packing and padding scheme as SMDM.
- Yes, the packed samples are visible to each other.
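
For readers unfamiliar with the terms, here is a minimal sketch of what packing with padding and no per-sample attention mask generally looks like. It is an illustration only, not the LLaDA or SMDM code: `pack_sequences`, `EOS_ID`, `PAD_ID`, and the block size are hypothetical names and values chosen for the example.

```python
from typing import List

EOS_ID = 2  # hypothetical end-of-sequence token id
PAD_ID = 0  # hypothetical padding token id


def pack_sequences(samples: List[List[int]], block_size: int) -> List[List[int]]:
    """Concatenate tokenized samples and cut the stream into fixed-size blocks.

    Each sample is followed by EOS_ID. The final block is right-padded with
    PAD_ID. No per-sample attention mask is built, so tokens from different
    samples that land in the same block can attend to each other.
    """
    stream: List[int] = []
    for sample in samples:
        stream.extend(sample)
        stream.append(EOS_ID)

    # Slice the token stream into consecutive blocks of block_size tokens.
    blocks = [stream[i:i + block_size] for i in range(0, len(stream), block_size)]

    # Pad the last block so every block has the same length.
    if blocks and len(blocks[-1]) < block_size:
        blocks[-1] = blocks[-1] + [PAD_ID] * (block_size - len(blocks[-1]))
    return blocks


blocks = pack_sequences([[5, 6, 7], [8, 9], [10, 11, 12, 13]], block_size=5)
for block in blocks:
    print(block)
```

In this toy input the first block holds all of the first sample plus the start of the second, which is the cross-sample visibility the question asks about.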