
How to stack embeddings and pass the gradients?

Open arita37 opened this issue 3 years ago • 8 comments

I have two neural nets, N1 and N2, and I want to stack their output embedding layers.

How can I do this?

arita37 avatar Feb 23 '22 03:02 arita37

Suppose the outputs of N1 and N2 on a sample x are N1(x) and N2(x), respectively. What you want is to concatenate these outputs (i.e., [N1(x), N2(x)]) and pass the result to downstream layers, right?

xuyxu avatar Feb 23 '22 07:02 xuyxu

Exactly: either concat or mean.

We need the gradients to flow through, for end-to-end training.
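For example, both options are differentiable in plain PyTorch, so the gradients reach the base nets automatically. A minimal sketch (the toy nets and dimensions are placeholders, not part of Ensemble-PyTorch):

```python
import torch
import torch.nn as nn

# Two placeholder base nets; any nn.Module returning (batch, d) works here.
N1 = nn.Sequential(nn.Linear(10, 8), nn.ReLU())
N2 = nn.Sequential(nn.Linear(10, 8), nn.ReLU())

x = torch.randn(4, 10)
e1, e2 = N1(x), N2(x)                        # each (4, 8)

merged_cat = torch.cat([e1, e2], dim=1)      # concat -> (4, 16)
merged_mean = torch.stack([e1, e2]).mean(0)  # mean -> (4, 8); needs equal dims

# Both ops are differentiable, so backprop reaches N1 and N2 end to end.
merged_cat.sum().backward()
```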

arita37 avatar Feb 23 '22 10:02 arita37

Hi, if you are going to take the mean of outputs from all base estimators, the fusion ensemble is exactly what you want.
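For reference, a minimal usage sketch, assuming the torchensemble API at the time (BaseNet and the loader are toy placeholders, just to make it runnable):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torchensemble import FusionClassifier

# Toy base network and data, only to make the sketch self-contained.
class BaseNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(10, 2)

    def forward(self, x):
        return self.fc(x)

train_loader = DataLoader(
    TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,))),
    batch_size=16,
)

# Fusion ensemble: base estimators are trained jointly, end to end.
model = FusionClassifier(estimator=BaseNet, n_estimators=2, cuda=False)
model.set_optimizer("Adam", lr=1e-3)
model.fit(train_loader, epochs=1)
```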

As to concatenation, it is somewhat odd, since all base estimators in the ensemble are doing the same thing, which makes concatenating their outputs rather redundant. Is there any paper or technical report demonstrating the effectiveness of concatenating the outputs of base estimators?

xuyxu avatar Feb 23 '22 14:02 xuyxu

N1, N2, …, Nx are different NN models.

We aggregate their embedding outputs through concatenation:

BigX = [X1, …, Xn], and feed it into another NN (i.e., a merging network).

This is used extensively (e.g., in Siamese networks…).
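Something along these lines, a sketch of the pattern described above (all module names and dimensions are illustrative, not an existing Ensemble-PyTorch API):

```python
import torch
import torch.nn as nn

class StackedEnsemble(nn.Module):
    """Concatenate embeddings from heterogeneous base nets, then merge."""

    def __init__(self, base_nets, merge_in_dim, n_classes):
        super().__init__()
        self.base_nets = nn.ModuleList(base_nets)
        self.merge = nn.Sequential(
            nn.Linear(merge_in_dim, 64), nn.ReLU(), nn.Linear(64, n_classes)
        )

    def forward(self, x):
        # BigX = [X1, ..., Xn]: concatenate the embedding of every base net.
        big_x = torch.cat([net(x) for net in self.base_nets], dim=1)
        return self.merge(big_x)

# Two different architectures -> a heterogeneous ensemble.
n1 = nn.Sequential(nn.Linear(10, 8), nn.ReLU())
n2 = nn.Sequential(nn.Linear(10, 16), nn.Tanh())
model = StackedEnsemble([n1, n2], merge_in_dim=8 + 16, n_classes=2)

out = model(torch.randn(4, 10))  # gradients flow end to end through n1 and n2
```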

arita37 avatar Feb 24 '22 00:02 arita37

We are NOT dealing with the output!

The output is of little use for end-to-end training…

We are dealing with the last embedding.
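For instance, the last embedding can be taken either by dropping the head or with a forward hook; both keep the tensor attached to the autograd graph. A sketch with placeholder shapes:

```python
import torch
import torch.nn as nn

# Placeholder net: backbone -> embedding layer -> classification head.
net = nn.Sequential(
    nn.Linear(10, 32), nn.ReLU(),   # backbone
    nn.Linear(32, 16), nn.ReLU(),   # last embedding activation (index 3)
    nn.Linear(16, 2),               # classification head (to be ignored)
)

# Option 1: drop the head and reuse everything before it.
embedder = nn.Sequential(*list(net.children())[:-1])
emb = embedder(torch.randn(4, 10))  # (4, 16), still differentiable

# Option 2: forward hook on the embedding layer of the intact net.
captured = {}
net[3].register_forward_hook(lambda mod, inp, out: captured.update(emb=out))
_ = net(torch.randn(4, 10))
print(captured["emb"].shape)        # (4, 16), attached to the autograd graph
```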

arita37 avatar Feb 24 '22 00:02 arita37

Thanks for your kind explanation. Heterogeneous ensembles are not supported yet, since we have not come up with a succinct way of setting different optimizers for different base estimators 😢.
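One possible workaround, not the library's API: a single optimizer with one parameter group per base estimator, so each estimator can still get its own hyperparameters. A sketch with placeholder nets:

```python
import torch
import torch.nn as nn

# Placeholder base estimators of different architectures.
n1 = nn.Linear(10, 8)
n2 = nn.Sequential(nn.Linear(10, 8), nn.Tanh())

# One Adam instance, but a separate parameter group per estimator,
# each with its own learning rate.
optimizer = torch.optim.Adam([
    {"params": n1.parameters(), "lr": 1e-3},
    {"params": n2.parameters(), "lr": 1e-4},
])
```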

xuyxu avatar Feb 24 '22 01:02 xuyxu

Sure.

For a first version, maybe we can use the same optimizer and scheduler for the whole ensemble model.

The goal is to have a one-liner for easy end-to-end ensembling.
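A sketch of that first version: one optimizer and one scheduler over every parameter of the combined model (the toy model and loader below are placeholders; in practice this would be the merged ensemble from the earlier sketch):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder ensemble and data, only to make the sketch runnable.
model = nn.Sequential(nn.Linear(10, 8), nn.ReLU(), nn.Linear(8, 2))
train_loader = DataLoader(
    TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,))),
    batch_size=16,
)

# One optimizer/scheduler shared by the whole ensemble.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)

for epoch in range(2):
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = nn.functional.cross_entropy(model(x), y)
        loss.backward()              # gradients reach every sub-module
        optimizer.step()
    scheduler.step()
```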

arita37 avatar Feb 24 '22 02:02 arita37

Kind of busy these days; would appreciate a PR very much ;-)

xuyxu avatar Feb 27 '22 12:02 xuyxu