
Bug: activation function is not applied to the last layer

Open · gnodking7 opened this issue 2 years ago · 4 comments

Hello,

First of all, thank you for such a wonderful package, and for the recent update that allows a different activation function for each layer.

However, there seems to be a bug with the output layer. I first noticed the issue when I observed negative outputs despite setting the activation function of my output layer to 'sigmoid'.

Currently, the activation function is not applied to the output layer. For example, line 98 of deepxde.nn.tensorflow_compat_v1.fnn currently reads:

self.y = self._dense(y, self.layer_size[-1], use_bias=self.use_bias)

which should be changed to something like the following (at least when the user wants an activation other than linear for the output layer):

self.y = self._dense(
    y,
    self.layer_size[-1],
    activation=(
        self.activation[-1] if isinstance(self.activation, list) else self.activation
    ),
    use_bias=self.use_bias,
)
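A minimal sketch of how the symptom shows up at the user level (the layer sizes here are placeholders, and the one-activation-per-layer list follows the update mentioned above):

import deepxde as dde

# Placeholder sizes; the last activation entry is meant for the output layer.
net = dde.nn.FNN([1, 16, 1], ["tanh", "sigmoid"], "Glorot uniform")
# After wrapping this net in a dde.Model and training, predictions can still
# be negative, because the "sigmoid" entry is ignored by the forward pass
# quoted above.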

Thank you

gnodking7 · Jan 10 '23

You can use an output transform:

from deepxde import backend as bkd

def output_transform(x, y):
    # Assumes the output is 1-D.
    return bkd.sigmoid(y)

net.apply_output_transform(output_transform)
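In context, a fuller sketch (the net construction and layer sizes are illustrative; apply_output_transform is DeepXDE's hook for composing a transform onto the raw network output):

import deepxde as dde
from deepxde import backend as bkd

net = dde.nn.FNN([2, 32, 32, 1], "tanh", "Glorot uniform")

def output_transform(x, y):
    # Squash the raw linear output into (0, 1).
    return bkd.sigmoid(y)

net.apply_output_transform(output_transform)
# Then build dde.Model(data, net) and train as usual; the transform is part
# of the computational graph, so gradients flow through it.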

tsarikahin · Jan 12 '23

It is not a bug. The last layer is not supposed to use an activation. But you can follow @tsarikahin's suggestion.

lululxvi · Jan 18 '23

Dear @lululxvi,

Could you explain why the last layer should not use an activation? I am fairly new to DeepXDE (and to PINNs in general) and am working on solving an advection-diffusion equation in 2D. I had the same problem as this issue: I had set my last layer's activation to sigmoid to prevent unphysical negative results, but it was not working, and now I know why. Implementing the answer here solved my problem, but I would like to understand why this is the case, and whether there are other ways to control for negative values through the network's architecture or if the output transform is the best solution. Thank you.

MAKassien · Nov 13 '23

Output transform is the solution.
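A sketch of a variant of that advice for fields that should be nonnegative but not bounded above (bkd.exp being available in the unified backend is an assumption; the thread itself only demonstrates bkd.sigmoid):

import deepxde as dde
from deepxde import backend as bkd

net = dde.nn.FNN([2, 32, 32, 1], "tanh", "Glorot uniform")
# exp keeps the output strictly positive with no upper bound; swap in
# bkd.sigmoid (as above) if the solution should stay in (0, 1).
net.apply_output_transform(lambda x, y: bkd.exp(y))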

lululxvi · Nov 14 '23