Distilling-Object-Detectors
about imitation loss weight
Thanks for your code! When I use the imitation loss on my dataset, I'm confused about how to determine the imitation loss weight. Without the imitation loss, my total loss is about 1e-2; with the default imitation loss weight (0.01), my imitation loss is about 1.3. How can I balance the other losses against the imitation loss?
Hi, could you rephrase your question?
That is, at the start of training my loss (classification loss + localization loss) is about 1e-2, while the imitation loss is around 300. Should I give the imitation loss a very small weight (e.g. 1e-4) so that it ends up at about the same level as the other loss? Or what ratio between the two losses works best?
For custom data, I suggest you first keep the two losses at a similar level, then tune the imitation loss weight from there.
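A minimal sketch of that suggestion, using the numbers from this thread (variable names are illustrative, not the repository's actual code): fix the imitation weight from the initial loss magnitudes so the two terms start at a similar level, then tune the weight up or down.

```python
# Sketch only: pick the imitation weight so the weighted imitation loss starts
# at roughly the same level as the detection loss, then tune from there.

# Measured at the start of training on the custom dataset (from this thread):
init_det_loss = 1e-2          # classification + localization loss
init_imitation_loss = 300.0   # unweighted imitation loss

# Weight that puts the two terms at roughly the same level initially (~3e-5 here).
imitation_weight = init_det_loss / init_imitation_loss

def total_loss(cls_loss, box_loss, imitation_loss, weight=imitation_weight):
    # Detection loss plus weighted feature-imitation loss.
    return cls_loss + box_loss + weight * imitation_loss
```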
Thanks! I tried it and found it works better when the imitation loss is about 6~10 times the other loss. Also, is the kernel size of the adaptation layer important? I noticed you use a 3x3 kernel with padding=1; what about using a 1x1 kernel?
We did not examine the choice of adaptation-layer kernel size; you can try tuning it on your data.
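For reference, a rough sketch of the two adaptation-layer variants being discussed (channel sizes are placeholders, not the repository's actual configuration). Both keep the spatial resolution unchanged, so the imitation loss can still be computed element-wise against the teacher feature map; only the receptive field and parameter count differ.

```python
import torch
import torch.nn as nn

student_channels, teacher_channels = 256, 512  # example values only

# 3x3 kernel with padding=1, as used in the released code.
adapt_3x3 = nn.Conv2d(student_channels, teacher_channels,
                      kernel_size=3, stride=1, padding=1)

# 1x1 kernel: a pure channel projection, fewer parameters, no spatial mixing.
adapt_1x1 = nn.Conv2d(student_channels, teacher_channels,
                      kernel_size=1, stride=1, padding=0)

feat = torch.randn(1, student_channels, 38, 50)
assert adapt_3x3(feat).shape == adapt_1x1(feat).shape  # same output size
```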
Thanks! The 1x1 kernel size works better for my data.