ptp
About the obj tag and text prompt
Hello, thanks for sharing this great work!
According to Eq. (1), the object tag is produced by an argmax operation, but Sec. 3.1.2 of the paper says "we select one O at random for each time". This leaves me with a doubt: if the object tag is already determined by the argmax, how is the following situation handled? ("For a certain P, we may have various options for O because the block may contain multiple objects.")
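To make the question concrete, here is a minimal sketch of the two readings as I understand them. It is not taken from the released code: the scores, the function names, the top-k candidate set, and the prompt wording are all my own assumptions for illustration.

```python
import random

# Hypothetical similarity scores between one block and each tag in the vocabulary.
scores = {"dog": 0.62, "ball": 0.55, "grass": 0.31, "car": 0.02}


def tag_by_argmax(scores):
    """My reading of Eq. (1): keep only the single highest-scoring tag."""
    return max(scores, key=scores.get)


def tag_by_random_choice(scores, k=2, rng=random):
    """One possible reading of Sec. 3.1.2 (an assumption): sample one tag
    uniformly from the k highest-scoring candidates for the block."""
    candidates = sorted(scores, key=scores.get, reverse=True)[:k]
    return rng.choice(candidates)


def make_prompt(position, tag):
    # Assumed prompt wording, used here only to show where P and O end up.
    return f"The block {position} has a {tag}."


argmax_tag = tag_by_argmax(scores)
sampled_tag = tag_by_random_choice(scores, k=2, rng=random.Random(0))

print(make_prompt(4, argmax_tag))
print(make_prompt(4, sampled_tag))
```

With the argmax reading, each block has exactly one O, so I do not see where the random selection comes in. With the second reading there are several candidates per block, but then Eq. (1) would need a top-k or a threshold rather than an argmax. Which one matches the actual implementation?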
Looking forward to your reply! Thanks 😁!