STEVE-1 Some questions of interest regarding the details of Prior training.

Open Zhoues opened this issue 1 year ago • 0 comments

From Appendix D.2 of the paper (Prior Training), I understand how STEVE-1 collected text-video pairs for training the prior. I am particularly interested in two points 😄:

1. How can I obtain the 2000 hand-labeled text examples and the 10000 augmented text examples? I want to have the STEVE-1 agent perform some tasks that it was trained on but that are not among those 11 tasks.
2. How can I use MineCLIP to retrieve videos? Is there a script for this? I am also curious how the offset operation mentioned in the paper is implemented.

Looking forward to your reply ❤️ @Shalev-Lifshitz

Dec 22 '23 08:12 Zhoues
