adanet
Combination of subnetworks
In Google's paper, "each unit in layer k of the subnetwork may have connections to existing units in layer k-1 of AdaNet", but the GIF shows something different. How exactly do the subnetworks combine?
@JodyZXD: Good observation. The GIF just demonstrates a simple example. The AdaNet framework supports a superset of the subnetworks and connections that the paper defines. You can recreate the neural network from the paper by passing the hidden_layer outputs through the Subnetwork.persisted_tensors dict. This makes those tensors available across iterations.
For instance, you can do something like the following change to simple_dnn.py:
def build_subnetwork(self,
                     features,
                     logits_dimension,
                     training,
                     iteration_step,
                     summary,
                     previous_ensemble):
  """See `adanet.subnetwork.Builder`."""
  input_layer = tf.feature_column.input_layer(
      features=features, feature_columns=self._feature_columns)
  last_layer = input_layer
  persisted_tensors = {_NUM_LAYERS_KEY: tf.constant(self._num_layers)}
  for i in range(self._num_layers):
    last_layer = tf.layers.dense(
        last_layer,
        units=self._layer_size,
        activation=tf.nn.relu,
        kernel_initializer=tf.glorot_uniform_initializer(seed=self._seed))
    last_layer = tf.layers.dropout(
        last_layer, rate=self._dropout, seed=self._seed, training=training)
    hidden_layer_key = "hidden_layer_{}".format(i)
    if previous_ensemble:
      # Iteration t>0: concatenate the matching hidden layer output from the
      # previous iteration's subnetwork.
      last_subnetwork = previous_ensemble.weighted_subnetworks[-1].subnetwork
      last_layer = tf.concat(
          [last_subnetwork.persisted_tensors[hidden_layer_key], last_layer],
          axis=1)
    # Store hidden layer outputs for subsequent iterations.
    persisted_tensors[hidden_layer_key] = last_layer
  logits = tf.layers.dense(
      last_layer,
      units=logits_dimension,
      kernel_initializer=tf.glorot_uniform_initializer(seed=self._seed))
  # Approximate the Rademacher complexity of this subnetwork as the
  # square-root of its depth.
  complexity = tf.sqrt(tf.to_float(self._num_layers))
  with tf.name_scope(""):
    summary.scalar("complexity", complexity)
    summary.scalar("num_layers", self._num_layers)
  return adanet.Subnetwork(
      last_layer=last_layer,
      logits=logits,
      complexity=complexity,
      persisted_tensors=persisted_tensors)
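With this change, the dense layer at depth k in a new subnetwork takes as input both its own layer k-1 output and the hidden_layer output stored at depth k-1 by the previous iteration's subnetwork, which recreates the connections to existing units in layer k-1 that the paper describes.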
@cweill Thanks a lot, that's very instructive! I have another question: will you publish a GPU guide soon?
AdaNet works on GPU just like any other TensorFlow Estimator, so any web guide on GPU training with Estimators should get you started. You can also try a GPU on Colab by changing the runtime hardware to GPU:
https://colab.research.google.com/github/tensorflow/adanet/blob/master/adanet/examples/tutorials/customizing_adanet.ipynb
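For example, GPU-related options go through the standard Estimator RunConfig. The following is a minimal sketch, not from the thread above: the head, generator, and max_iteration_steps values are placeholders you would replace with your own.

import adanet
import tensorflow as tf

# Standard Estimator run configuration. Soft placement falls back to CPU for
# ops without a GPU kernel; allow_growth lets GPU memory grow as needed.
config = tf.estimator.RunConfig(
    session_config=tf.ConfigProto(
        allow_soft_placement=True,
        gpu_options=tf.GPUOptions(allow_growth=True)))

estimator = adanet.Estimator(
    head=head,                        # placeholder: your Estimator head
    subnetwork_generator=generator,   # placeholder: your adanet.subnetwork.Generator
    max_iteration_steps=1000,         # placeholder value
    config=config)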
@cweill Thanks for your great work! If I want to combine a simple CNN and a simple DNN together, how could I do that? For example, I want to get a network like "2_layer_dnn -> cnn -> cnn".
@cweill Thanks for the insight on building networks! However, I've tested the build_subnetwork code you posted above (plugging it into the SimpleDNNBuilder) and it does not work. I don't understand how exactly to sample the network structural space as described in the paper. Any indication or additional piece of code would be most welcome!
An alternative approach is the following builder:
def build_subnetwork(self,
                     features,
                     logits_dimension,
                     training,
                     iteration_step,
                     summary,
                     previous_ensemble=None):
  """See `adanet.subnetwork.Builder`."""
  input_layer = tf.to_float(features['x'])
  kernel_initializer = tf.glorot_uniform_initializer(seed=self._seed)
  last_layer = input_layer
  for layer_size in self._layer_sizes:
    last_layer = tf.layers.dense(
        last_layer,
        units=layer_size,
        activation=self._activation,
        kernel_initializer=kernel_initializer)
  logits = tf.layers.dense(
      last_layer,
      units=logits_dimension,
      kernel_initializer=kernel_initializer)
  persisted_tensors = {
      "num_layers": tf.constant(self._num_layers),
      "layer_sizes": tf.constant(self._layer_sizes),
  }
  return adanet.Subnetwork(
      last_layer=last_layer,
      logits=logits,
      complexity=self._measure_complexity(),
      persisted_tensors=persisted_tensors)
and move the exploration logic into the generator:
def generate_candidates(self, previous_ensemble, iteration_number,
                        previous_ensemble_reports, all_reports):
  """See `adanet.subnetwork.Generator`."""
  seed = self._seed
  if seed is not None:
    seed += iteration_number
  # Start with a single layer.
  num_layers = 1
  layer_sizes = [self.layer_block_size]
  # Take the depth and layer sizes reached in previous iterations as the
  # starting point.
  if previous_ensemble:
    last_subnetwork = previous_ensemble.weighted_subnetworks[-1].subnetwork
    persisted_tensors = last_subnetwork.persisted_tensors
    num_layers = tf.contrib.util.constant_value(
        persisted_tensors["num_layers"])
    layer_sizes = list(tf.contrib.util.constant_value(
        persisted_tensors["layer_sizes"]))
  # At each iteration, check whether extending any of the existing layers
  # helps.
  candidates = list()
  for extend_layer in range(num_layers):
    new_sizes = layer_sizes[:]
    new_sizes[extend_layer] += self.layer_block_size
    candidates.append(
        self._dnn_builder_fn(
            num_layers=num_layers,
            layer_sizes=new_sizes,
            seed=seed,
            previous_ensemble=previous_ensemble))
  # Also check whether it is worth adding a new layer.
  candidates.append(
      self._dnn_builder_fn(
          num_layers=num_layers + 1,
          layer_sizes=layer_sizes + [self.layer_block_size],
          seed=seed,
          previous_ensemble=previous_ensemble))
  # Also keep the un-extended candidate.
  candidates.append(
      self._dnn_builder_fn(
          num_layers=num_layers,
          layer_sizes=layer_sizes,
          seed=seed,
          previous_ensemble=previous_ensemble))
  return candidates
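For context, a minimal sketch of how such a generator could be wired up; the class name, the _DNNBuilder builder class, and the constructor arguments are illustrative assumptions, not part of the code above:

import functools

class SimpleDNNGenerator(adanet.subnetwork.Generator):
  """Generates candidates that grow the previous subnetwork wider or deeper."""

  def __init__(self, layer_block_size=64, seed=None):
    self._seed = seed
    self.layer_block_size = layer_block_size
    # Factory used by generate_candidates above. _DNNBuilder is assumed to be
    # a Builder class with the build_subnetwork method shown earlier, taking
    # num_layers, layer_sizes, seed, and previous_ensemble.
    self._dnn_builder_fn = functools.partial(
        _DNNBuilder, activation=tf.nn.relu)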
@cweill what do you think?
At iteration t, when the new subnetwork concatenates with the previous network's hidden layers, are the previous network's weights frozen or not?