Question: Task ordering strategy during training stages 2 and 3

Open shin-wn opened this issue 1 year ago • 1 comment

Thank you for the great work on this project.

I noticed that tasks are trained in a fixed order rather than being shuffled:

In stage 2, tasks follow the sequence t2m -> m2t -> predict for one epoch
In stage 3, tasks appear to be processed in the order defined in the JSON file

Since motion is treated as discrete data similar to text, using the same loss function across tasks should be possible. This makes me wonder about the following:

Was there a specific reason for not shuffling tasks during training?
Have you found better results with this fixed-order approach compared to random task selection?
Did you experiment with randomly selecting tasks in pretraining (stage 2) and instruction tuning (stage 3)?
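
To make the question concrete, here is a minimal sketch of the two schedules being compared. It is not the repository's actual training loop: the function names `fixed_order_schedule` and `shuffled_schedule` are hypothetical, and it assumes that "fixed order" means each task's batches are run to completion before the next task starts within an epoch.

```python
import random

# Stage 2 task names as listed above; everything else here is illustrative.
TASKS = ["t2m", "m2t", "predict"]


def fixed_order_schedule(batches_per_task, tasks=TASKS):
    """Yield (task, batch_index) pairs, finishing each task before the next."""
    for task in tasks:
        for i in range(batches_per_task):
            yield task, i


def shuffled_schedule(batches_per_task, tasks=TASKS, seed=0):
    """Yield the same (task, batch_index) pairs, randomly interleaved."""
    rng = random.Random(seed)
    steps = [(task, i) for task in tasks for i in range(batches_per_task)]
    rng.shuffle(steps)  # in-place shuffle of the per-epoch step list
    yield from steps


if __name__ == "__main__":
    print(list(fixed_order_schedule(2)))
    print(list(shuffled_schedule(2, seed=0)))
```

Both schedules visit exactly the same set of batches per epoch, so only the ordering differs. That is the variable the questions above are asking about.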

I'm curious to learn more about the design decisions behind this approach. Looking forward to hearing your insights!

Oct 22 '24 11:10 shin-wn

Hi, I am also curious about this strategy. Have you tried to train the model with shuffled tasks? Are there any interesting observations?

Mar 17 '25 13:03 Lyman-Smoker