
Error when choosing between sets of advanced activations layers


Hi, I am optimizing the number of neurons and the activation function of each layer of a two-hidden-layer network on the MNIST dataset, but when I try to choose from a set of advanced activation layers from Keras, I get the following error:

ValueError: Incompatible shapes for broadcasting: (?, 128) and (8,)

Here is my model's definition:

from keras.models import Sequential
from keras.layers import Dense, Activation, advanced_activations

model = Sequential()

model.add(Dense({{choice([8, 128])}}, input_dim=784))
model.add({{choice([advanced_activations.ThresholdedReLU(), advanced_activations.SReLU()])}})

model.add(Dense({{choice([8, 128])}}))
model.add({{choice([advanced_activations.ThresholdedReLU(), advanced_activations.SReLU()])}})

model.add(Dense(10))
model.add(Activation('softmax'))

I think it is an issue with how hyperas manages the space of parameters to tune.

ismaeIfm commented Aug 01 '16

Hi @ismaeIfm, that is a very interesting finding, thanks for that. In fact, I translated the example into pure keras + hyperopt and the problem persists. As hyperas just bridges the two, it's not directly a hyperas issue, but I still want to understand it better. It seems shape inference fails for some reason, but right now I'm not sure why.
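
A sketch of such a translation (reconstructed here, not the exact repro code; it assumes Keras 1.x module paths and hyperopt's hp.choice/fmin API, with training elided and a placeholder loss):

from hyperopt import hp, fmin, tpe
from keras.models import Sequential
from keras.layers import Dense, Activation, advanced_activations

# Both layer objects are instantiated here, once, when the space is
# built -- not freshly per trial.
space = {
    'units': hp.choice('units', [8, 128]),
    'act': hp.choice('act', [advanced_activations.ThresholdedReLU(),
                             advanced_activations.SReLU()]),
}

def objective(params):
    model = Sequential()
    model.add(Dense(params['units'], input_dim=784))
    # Fails once a trial pairs 128 units with a layer instance whose
    # weights were already built for 8 units.
    model.add(params['act'])
    model.add(Dense(10))
    model.add(Activation('softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam')
    return 0.0  # training/evaluation elided

fmin(objective, space, algo=tpe.suggest, max_evals=10)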

maxpumperla commented Aug 01 '16

@maxpumperla Maybe I'm wrong, but I noticed that layers are recycled, i.e., if one hyperas evaluation has initialized advanced layer i as SReLU as a result of choice(), and a later evaluation assigns SReLU to layer i again, the memory address of layer i (the SReLU instance) is the same in both evaluations.

So if I'm correct, I'm guessing that the error comes from recycling the layer, because the layer's input size differs between evaluations.
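
A minimal sketch to check this hypothesis (hypothetical objective/seen names; assumes hyperopt's hp.choice/fmin API): record the id() of the layer each trial receives, and trials drawing the same branch get the same object:

from hyperopt import hp, fmin, tpe
from keras.layers import advanced_activations

# Both instances are created once, when the space is defined.
space = hp.choice('act', [advanced_activations.ThresholdedReLU(),
                          advanced_activations.SReLU()])
seen = []

def objective(layer):
    # Record which concrete object this trial was handed.
    seen.append((type(layer).__name__, id(layer)))
    return 0.0

fmin(objective, space, algo=tpe.suggest, max_evals=5)
print(seen)  # repeated draws of the same class show an identical id()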

ismaeIfm commented Aug 01 '16

@maxpumperla @ismaeIfm I ran into this issue as well... I will look into it further to find the cause...

ghost commented Dec 05 '16

@ismaeIfm That makes perfect sense, but I currently have no idea how to circumvent it. I might have to dig deeper into hyperopt for that.

maxpumperla commented Dec 05 '16

Could you please make hyperas support advanced activation layers? They are very helpful for certain kinds of problems...

ghost commented Dec 25 '16

Hi, and thanks for this handy wrapper. You can actually get around the layer recycling (for me this solves the issue) by taking control of layer instantiation yourself. Instead of choosing between already-instantiated layers, define a function that switches between the different layers and let choice act on names only. For the example above this would mean:

from keras.layers import advanced_activations

def activation(name):
    # Build a fresh layer instance on every call, so nothing is
    # shared across evaluations.
    if name == 'ThresholdedReLU':
        return advanced_activations.ThresholdedReLU()
    if name == 'SReLU':
        return advanced_activations.SReLU()
    raise ValueError('Unknown activation: {}'.format(name))

The line(s)

model.add({{choice([advanced_activations.ThresholdedReLU(), advanced_activations.SReLU()])}})

would then be replaced by:

model.add(activation({{choice(['ThresholdedReLU', 'SReLU'])}}))
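
Putting it together, the full model from the original post would become (a sketch; each activation(...) call now constructs a fresh layer instance per evaluation, sized by the preceding Dense layer):

model = Sequential()
model.add(Dense({{choice([8, 128])}}, input_dim=784))
model.add(activation({{choice(['ThresholdedReLU', 'SReLU'])}}))
model.add(Dense({{choice([8, 128])}}))
model.add(activation({{choice(['ThresholdedReLU', 'SReLU'])}}))
model.add(Dense(10))
model.add(Activation('softmax'))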

chleibig commented Jan 25 '17

@chleibig cool, that's immensely helpful. Thanks! A lot of people seem to have problems with that.

@ErmiaAzarkhalili does that work for you, dude?

maxpumperla commented Jan 25 '17

Yes, thanks Max, it was awesome...

ghost commented Jan 25 '17