
Combination of subnetworks

Open JodyZXD opened this issue 6 years ago • 7 comments

In Google's paper, "each unit in layer k of the subnetwork may have connections to existing units in layer k-1 of AdaNet", but the GIF shows something different. How exactly do the subnetworks combine?

JodyZXD avatar Nov 10 '18 04:11 JodyZXD

@JodyZXD: Good observation. The GIF just demos a simple example. The AdaNet framework supports a superset of the subnetworks and connections that the paper defines. You can recreate the neural network from the paper by passing the hidden_layer outputs through the Subnetwork.persisted_tensors dict. This will make these tensors available across iterations.

For instance, you can do something like the following change to simple_dnn.py:

def build_subnetwork(self,
                      features,
                      logits_dimension,
                      training,
                      iteration_step,
                      summary,
                      previous_ensemble):
    """See `adanet.subnetwork.Builder`."""

    input_layer = tf.feature_column.input_layer(
        features=features, feature_columns=self._feature_columns)
    last_layer = input_layer
    persisted_tensors = {_NUM_LAYERS_KEY: tf.constant(self._num_layers)}
    for i in range(self._num_layers):
      last_layer = tf.layers.dense(
          last_layer,
          units=self._layer_size,
          activation=tf.nn.relu,
          kernel_initializer=tf.glorot_uniform_initializer(seed=self._seed))
      last_layer = tf.layers.dropout(
          last_layer, rate=self._dropout, seed=self._seed, training=training)
      hidden_layer_key = "hidden_layer_{}".format(i)
      if previous_ensemble:
        # Iteration t>0: concatenate the previous subnetwork's hidden layer
        # output at this depth with the current layer's output.
        last_subnetwork = previous_ensemble.weighted_subnetworks[-1].subnetwork
        last_layer = tf.concat(
            [last_subnetwork.persisted_tensors[hidden_layer_key], last_layer],
            axis=1)
      # Store hidden layer outputs for subsequent iterations.
      persisted_tensors[hidden_layer_key] = last_layer
    logits = tf.layers.dense(
        last_layer,
        units=logits_dimension,
        kernel_initializer=tf.glorot_uniform_initializer(seed=self._seed))

    # Approximate the Rademacher complexity of this subnetwork as the square-
    # root of its depth.
    complexity = tf.sqrt(tf.to_float(self._num_layers))

    with tf.name_scope(""):
      summary.scalar("complexity", complexity)
      summary.scalar("num_layers", self._num_layers)

    return adanet.Subnetwork(
        last_layer=last_layer,
        logits=logits,
        complexity=complexity,
        persisted_tensors=persisted_tensors)

cweill avatar Nov 10 '18 19:11 cweill

@cweill Thanks a lot, that's very instructive! I have another question: will you be providing a GPU guide any time soon?

JodyZXD avatar Nov 13 '18 03:11 JodyZXD

AdaNet works on GPU just like any other TensorFlow Estimator, so any guide on the web for GPU training with Estimators should get you started.

You can also try GPU on Colab by changing the runtime hardware to GPU:

https://colab.research.google.com/github/tensorflow/adanet/blob/master/adanet/examples/tutorials/customizing_adanet.ipynb
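
As a rough sketch of the wiring (the head and generator arguments below are just illustrative placeholders taken from the simple_dnn example, not anything GPU-specific), you pass a standard tf.estimator.RunConfig to adanet.Estimator:

# Rough sketch: adanet.Estimator is a tf.estimator.Estimator, so GPU usage is
# configured the same way. The head/generator arguments are placeholders;
# check simple_dnn.Generator's actual signature for your version.
import adanet
import tensorflow as tf

from adanet.examples import simple_dnn

# Let TensorFlow place ops on the GPU and grow memory usage as needed.
session_config = tf.ConfigProto(allow_soft_placement=True)
session_config.gpu_options.allow_growth = True

run_config = tf.estimator.RunConfig(
    model_dir="/tmp/adanet_gpu",
    session_config=session_config)

estimator = adanet.Estimator(
    head=tf.contrib.estimator.regression_head(),
    subnetwork_generator=simple_dnn.Generator(
        feature_columns=[tf.feature_column.numeric_column("x", shape=[2])],
        optimizer=tf.train.RMSPropOptimizer(learning_rate=0.01),
        seed=42),
    max_iteration_steps=100,
    config=run_config)

With a GPU visible and allow_growth set, training proceeds exactly as on CPU; no adanet-specific changes are needed.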

cweill avatar Nov 13 '18 04:11 cweill

@cweill Thanks for your great work! If I want to combine a simple CNN and a simple DNN together, how could I do that? For example, I want to get a network like "2_layer_dnn -> cnn -> cnn".

tobymu avatar Nov 21 '18 11:11 tobymu

@cweill Thanks for the insight on building networks! However, I've tested the build_subnetwork code you posted above (plugging it into the SimpleDNNBuilder) and it does not work. I'm not seeing how to sample the network structural space exactly as described in the paper. Any pointers or additional code would be most welcome!

martinobertoni avatar Jan 23 '19 16:01 martinobertoni

An alternative approach is the following builder:

    def build_subnetwork(self,
                         features,
                         logits_dimension,
                         training,
                         iteration_step,
                         summary,
                         previous_ensemble=None):
        """See `adanet.subnetwork.Builder`."""
        input_layer = tf.to_float(features['x'])
        kernel_initializer = tf.glorot_uniform_initializer(seed=self._seed)
        last_layer = input_layer
        for layer_size in self._layer_sizes:
            last_layer = tf.layers.dense(
                last_layer,
                units=layer_size,
                activation=self._activation,
                kernel_initializer=kernel_initializer)
        logits = tf.layers.dense(
            last_layer,
            units=logits_dimension,
            kernel_initializer=kernel_initializer)

        persisted_tensors = {
            "num_layers": tf.constant(self._num_layers),
            "layer_sizes": tf.constant(self._layer_sizes),
        }
        return adanet.Subnetwork(
            last_layer=last_layer,
            logits=logits,
            complexity=self._measure_complexity(),
            persisted_tensors=persisted_tensors)

and move the exploration logic into the generator:

    def generate_candidates(self, previous_ensemble, iteration_number,
                            previous_ensemble_reports, all_reports):
        """See `adanet.subnetwork.Generator`."""
        seed = self._seed
        if seed is not None:
            seed += iteration_number
        # start with single layer
        num_layers = 1
        layer_sizes = [self.layer_block_size]
        # otherwise, start from the architecture of the most recently
        # added subnetwork
        if previous_ensemble:
            last_subnetwork = previous_ensemble.weighted_subnetworks[
                -1].subnetwork
            persisted_tensors = last_subnetwork.persisted_tensors
            num_layers = tf.contrib.util.constant_value(
                persisted_tensors["num_layers"])
            layer_sizes = list(tf.contrib.util.constant_value(
                persisted_tensors["layer_sizes"]))
        # at each iteration, check whether extending any of the
        # existing layers helps
        candidates = list()
        for extend_layer in range(num_layers):
            new_sizes = layer_sizes[:]
            new_sizes[extend_layer] += self.layer_block_size
            candidates.append(
                self._dnn_builder_fn(
                    num_layers=num_layers,
                    layer_sizes=new_sizes,
                    seed=seed,
                    previous_ensemble=previous_ensemble))
        # also check if it's worth adding a new layer
        candidates.append(
            self._dnn_builder_fn(
                num_layers=num_layers + 1,
                layer_sizes=layer_sizes + [self.layer_block_size],
                seed=seed,
                previous_ensemble=previous_ensemble))
        # also keep the un-extended candidate
        candidates.append(
            self._dnn_builder_fn(
                num_layers=num_layers,
                layer_sizes=layer_sizes,
                seed=seed,
                previous_ensemble=previous_ensemble))
        return candidates
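
For context, here is roughly how I wire that generator up (names like _DNNBuilder and layer_block_size are my own, not adanet APIs): the builder arguments that stay fixed are bound once with functools.partial, and generate_candidates fills in the per-candidate architecture. Each candidate also needs a unique Builder.name within an iteration, which I derive from its layer sizes.

    # Rough sketch of the wiring assumed above. _DNNBuilder stands for the
    # builder whose build_subnetwork is shown earlier; it is not an adanet API.
    import functools

    import adanet


    class Generator(adanet.subnetwork.Generator):
        """Generates DNN candidates of growing width and depth."""

        def __init__(self, feature_columns, optimizer,
                     layer_block_size=32, seed=None):
            self._seed = seed
            self.layer_block_size = layer_block_size
            # Bind the arguments that stay fixed across candidates; each call
            # in generate_candidates supplies num_layers, layer_sizes, seed
            # and previous_ensemble.
            self._dnn_builder_fn = functools.partial(
                _DNNBuilder,
                feature_columns=feature_columns,
                optimizer=optimizer)

        def generate_candidates(self, previous_ensemble, iteration_number,
                                previous_ensemble_reports, all_reports):
            ...  # as in the snippet above

    # In _DNNBuilder, each candidate within an iteration reports a unique
    # name, e.g. derived from its layer sizes:
    #
    #     @property
    #     def name(self):
    #         """See `adanet.subnetwork.Builder`."""
    #         return "dnn_" + "_".join(str(size) for size in self._layer_sizes)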

@cweill what do you think?

martinobertoni avatar Jan 25 '19 16:01 martinobertoni

At iteration t, when the new subnetwork is concatenated with the previous network, are the previous network's weights frozen or not?

InkdyeHuang avatar Jun 19 '19 03:06 InkdyeHuang