MultiObjectiveOptimization

Implementation of the update for task-specific weights is inconsistent with the description in the paper

Open bsaint opened this issue 5 years ago • 3 comments

Hi, thanks for releasing this awesome code! I am currently working on reproducing the Cityscapes results from the paper. According to the MTL update equations in the paper, the weights of each task-specific subnetwork should be updated with the original learning rate, and the shared weights of the network are then updated using the MGDA algorithm. I could not find the corresponding implementation in the code, however. Instead, both the shared weights and the task-specific weights are updated in the same way, by multiplying the loss of each task by a weight factor determined by MGDA. Am I missing something here, or is this an implementation trick?

bsaint, Jun 18 '19 02:06

Hi @ozansener, I'm also trying to reproduce and use the method, and the above confuses me a bit as well. Line 2 of Algorithm 2 shows that the task-specific parameters are updated without any scaling factor. Line 4 would then be replaced with the solver using your approximation, with the alphas calculated from the gradients of Lt with respect to Z. Then, in line 5, only the shared parameters are updated with the alpha-weighted sum of losses.

However, your implementation uses only one optimizer on both shared and task specific parameters: https://github.com/intel-isl/MultiObjectiveOptimization/blob/1c6d0d503ccf33cc83d5b6c356ca2fc2bf255606/multi_task/train_multi_task.py#L57-L65

and updates all of them with alpha-weighted gradients: https://github.com/intel-isl/MultiObjectiveOptimization/blob/1c6d0d503ccf33cc83d5b6c356ca2fc2bf255606/multi_task/train_multi_task.py#L174-L185

These two approaches don't seem to be equivalent. Is this an unintended change in the implementation, or is there something @bsaint and I are missing?
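To make the difference concrete, here is a minimal sketch of the two update rules for two tasks under plain SGD. It is not the repository's code. The function names, the toy gradients, and the use of the closed-form two-task min-norm weight (instead of the Frank-Wolfe solver, and computed from shared-parameter gradients rather than gradients with respect to Z) are all assumptions made for illustration.

```python
def min_norm_alpha(g1, g2):
    """Weight a in [0, 1] minimizing ||a*g1 + (1-a)*g2||^2 (two-task closed form)."""
    diff_sq = sum((a - b) ** 2 for a, b in zip(g1, g2))
    if diff_sq == 0.0:
        return 0.5  # identical gradients: every weight gives the same direction
    alpha = sum((b - a) * b for a, b in zip(g1, g2)) / diff_sq
    return min(1.0, max(0.0, alpha))


def _sgd(params, grads, lr, scale=1.0):
    """One plain SGD step, with an optional scaling of the gradient."""
    return [p - lr * scale * g for p, g in zip(params, grads)]


def _shared_step(shared, g_shared, weights, lr):
    """Shared parameters follow the alpha-weighted sum of task gradients."""
    combined = [weights[0] * a + weights[1] * b
                for a, b in zip(g_shared[0], g_shared[1])]
    return _sgd(shared, combined, lr)


def step_as_described_in_paper(shared, specific, g_shared, g_specific, lr):
    """Task-specific parameters use the unscaled gradient (hypothetical helper)."""
    alpha = min_norm_alpha(g_shared[0], g_shared[1])
    weights = [alpha, 1.0 - alpha]
    new_specific = [_sgd(p, g, lr) for p, g in zip(specific, g_specific)]
    return _shared_step(shared, g_shared, weights, lr), new_specific


def step_with_single_weighted_loss(shared, specific, g_shared, g_specific, lr):
    """All parameters follow the gradient of sum_t alpha_t * L_t (hypothetical helper)."""
    alpha = min_norm_alpha(g_shared[0], g_shared[1])
    weights = [alpha, 1.0 - alpha]
    new_specific = [_sgd(p, g, lr, scale=w)
                    for p, g, w in zip(specific, g_specific, weights)]
    return _shared_step(shared, g_shared, weights, lr), new_specific


# Toy values, chosen only for illustration.
shared = [0.0, 0.0]
specific = [[1.0], [1.0]]            # one parameter per task head
g_shared = [[1.0, 0.0], [0.0, 1.0]]  # per-task gradients w.r.t. shared parameters
g_specific = [[2.0], [4.0]]          # per-task gradients w.r.t. task-specific parameters

paper_shared, paper_specific = step_as_described_in_paper(
    shared, specific, g_shared, g_specific, lr=0.1)
code_shared, code_specific = step_with_single_weighted_loss(
    shared, specific, g_shared, g_specific, lr=0.1)

print("shared (paper / code):  ", paper_shared, code_shared)
print("specific (paper / code):", paper_specific, code_specific)
```

In this sketch the shared parameters end up identical under both rules, while each task-specific step in the second variant is shrunk by that task's alpha. With plain SGD this amounts to a per-task learning rate of alpha times the original one.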

milos-popovic, Jul 23 '19 20:07

@bsaint and @milos-popovic Thanks for raising the issue. You are right: there is a discrepancy between the paper and the code. We used this code to get all the results, so please use the codebase. I will run some experiments and update the paper if necessary.

ozansener, Jul 23 '19 22:07

Hi @bsaint, @milos-popovic, have you figured out the discrepancy and reproduced the results on MultiMNIST, CelebA, or Cityscapes? Thank you.

liyangliu, Apr 14 '20 08:04