
What's the difference between `adanet.Estimator` and `adanet.AutoEnsembleEstimator`

zhiqwang opened this issue on Oct 15, 2019 · 4 comments

Hi, adanet team, I'm confused about the difference between the adanet.Estimator and adanet.AutoEnsembleEstimator APIs. I noticed that adanet.AutoEnsembleEstimator was released in version 0.4, but the tutorials provided here all use adanet.Estimator. Is there any guidance on how to choose between these two APIs?

zhiqwang commented on Oct 15, 2019

In addition to that question, when I use adanet.AutoEnsembleEstimator to train a model from a candidate pool of four different DNN subnetworks, the resulting network always contains only one subnetwork. I don't know whether I have done something wrong with AutoEnsembleEstimator or whether this is just how it works. From my understanding, adanet.AutoEnsembleEstimator can ensemble the subnetworks automatically. Does that mean I can only ensemble subnetworks with adanet.Estimator, the way the tutorial does?

Here is my code and its output.

# Lint as: python3
import numpy as np
import tensorflow as tf
from datetime import datetime

from absl import app
import adanet

def main(args):
  (x_train, y_train), (x_test, y_test) = (
      tf.keras.datasets.boston_housing.load_data())


  def input_fn(partition):
    """Returns an input_fn feeding the chosen partition as a single batch."""

    def _input_fn():
      feat_tensor_dict = {}
      if partition == 'train':
        x = x_train.copy()
        y = y_train.copy()
      else:
        x = x_test.copy()
        y = y_test.copy()
      for i in range(0, np.size(x, 1)):
        feat_nam = ('feat' + str(i))
        feat_tensor_dict[feat_nam] = tf.convert_to_tensor(
            x[:, i], dtype=tf.float32)
      label_tensor = tf.convert_to_tensor(y, dtype=tf.float32)
      return (feat_tensor_dict, label_tensor)

    return _input_fn


  feat_nam_lst = ['feat' + str(i) for i in range(0, np.size(x_train, 1))]

  feature_columns = []
  for item in feat_nam_lst:
    feature_columns.append(tf.feature_column.numeric_column(item))

  head = tf.estimator.RegressionHead(1)

  lr_estimator = tf.estimator.LinearEstimator(
      head=head, feature_columns=feature_columns)

  dnn_estimator_1 = tf.estimator.DNNRegressor(
      feature_columns=feature_columns, hidden_units=[5])

  dnn_estimator_2 = tf.estimator.DNNRegressor(
      feature_columns=feature_columns, hidden_units=[5, 5])
  
  dnn_estimator_3 = tf.estimator.DNNRegressor(
      feature_columns=feature_columns, hidden_units=[100, 100])

  dnn_estimator_4 = tf.estimator.DNNRegressor(
      feature_columns=feature_columns, hidden_units=[50, 1500])

  folder_dir = "/Users/zhangjue/Desktop/autoensemble/"
  logdir_adanet = folder_dir + "adanet/" + datetime.now().strftime(
      "%Y%m%d-%H%M%S")

  config = tf.estimator.RunConfig(model_dir=logdir_adanet)
  estimator = adanet.AutoEnsembleEstimator(
      head=head,
      candidate_pool=lambda config: {
          'dnn1': dnn_estimator_1,
          'dnn2': dnn_estimator_2,
          'dnn3': dnn_estimator_3,
          'dnn4': dnn_estimator_4
      },
      max_iteration_steps=5000,
      config=config)

  train_spec = tf.estimator.TrainSpec(
      input_fn=input_fn(partition='train'), max_steps=5000)
  eval_spec = tf.estimator.EvalSpec(
      input_fn=input_fn(partition='test'))

  result, _ = tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
  print(result)


if __name__ == "__main__":
  app.run(main)


Here is the result. Note that the reported ensemble architecture contains only dnn4:

{'architecture/adanet/ensembles': b'\n/\n\x13architecture/adanetB\x0e\x08\x07\x12\x00B\x08| dnn4 |J\x08\n\x06\n\x04text', 'average_loss': 28.492062, 'best_ensemble_index_0': 3, 'iteration': 0, 'label/mean': 23.078432, 'loss': 28.492102, 'prediction/mean': 22.895338, 'global_step': 5000}

jeffltc commented on Oct 21, 2019

@jeffltc If you want to do multiple boosting rounds, make sure that max_steps > max_iteration_steps. For example, if max_steps == max_iteration_steps, you will only do one round (no boosting). If max_steps == 3 * max_iteration_steps, then it will boost for three rounds.
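
To make that arithmetic concrete, here is a quick sketch using the step counts from your script (plain Python, not an adanet API):

# Step budget vs. boosting rounds, with the values from the script above.
max_iteration_steps = 5000            # steps spent on each AdaNet iteration
max_steps = 3 * max_iteration_steps   # total TrainSpec budget: 15000 steps

# Each completed iteration is one boosting round in which the ensemble
# can grow by one more subnetwork.
num_boosting_rounds = max_steps // max_iteration_steps  # -> 3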

You should also create your estimators within the candidate_pool lambda, use an adanet.Evaluator for evaluating candidate performance, and ensemble_strategies for trying different ensemble techniques. For example:

# Lint as: python3
import numpy as np
import tensorflow as tf
from datetime import datetime

from absl import app
import adanet

def main(args):
  (x_train, y_train), (x_test, y_test) = (
      tf.keras.datasets.boston_housing.load_data())


  def input_fn(partition):
    """Returns an input_fn feeding the chosen partition as a single batch."""

    def _input_fn():
      feat_tensor_dict = {}
      if partition == 'train':
        x = x_train.copy()
        y = y_train.copy()
      else:
        x = x_test.copy()
        y = y_test.copy()
      for i in range(0, np.size(x, 1)):
        feat_nam = ('feat' + str(i))
        feat_tensor_dict[feat_nam] = tf.convert_to_tensor(
            x[:, i], dtype=tf.float32)
      label_tensor = tf.convert_to_tensor(y, dtype=tf.float32)
      return (feat_tensor_dict, label_tensor)

    return _input_fn


  feat_nam_lst = ['feat' + str(i) for i in range(0, np.size(x_train, 1))]

  feature_columns = []
  for item in feat_nam_lst:
    feature_columns.append(tf.feature_column.numeric_column(item))

  head = tf.estimator.RegressionHead(1)
  
  folder_dir = "/Users/zhangjue/Desktop/autoensemble/"
  logdir_adanet = folder_dir + "adanet/" + datetime.now().strftime(
      "%Y%m%d-%H%M%S")

  config = tf.estimator.RunConfig(model_dir=logdir_adanet)
  estimator = adanet.AutoEnsembleEstimator(
      head=head,
      ensemble_strategies=[
          adanet.ensemble.GrowStrategy(),
          adanet.ensemble.AllStrategy(),
      ],
      candidate_pool=lambda config: {
          "lr": tf.estimator.LinearEstimator(
              head=head, feature_columns=feature_columns, config=config),
          "dnn1": tf.estimator.DNNRegressor(
              feature_columns=feature_columns, hidden_units=[5],
              config=config),
          "dnn2": tf.estimator.DNNRegressor(
              feature_columns=feature_columns, hidden_units=[5, 5],
              config=config),
          "dnn3": tf.estimator.DNNRegressor(
              feature_columns=feature_columns, hidden_units=[100, 100],
              config=config),
          "dnn4": tf.estimator.DNNRegressor(
              feature_columns=feature_columns, hidden_units=[50, 1500],
              config=config),
      },
      max_iteration_steps=5000,
      evaluator=adanet.Evaluator(input_fn=input_fn(partition='test')),
      config=config)

  train_spec = tf.estimator.TrainSpec(
      input_fn=input_fn(partition='train'),
      # 3 boosting rounds at max_iteration_steps=5000 each.
      max_steps=5000 * 3)
  eval_spec = tf.estimator.EvalSpec(
      input_fn=input_fn(partition='test'))

  result, _ = tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
  print(result)


if __name__ == "__main__":
  app.run(main)

cweill commented on Oct 22, 2019

@cweill Thank you so much! Crystal clear! Could you also explain the difference between adanet.Estimator and adanet.AutoEnsembleEstimator? Sorry for asking an additional question and drifting from the original one.

jeffltc commented on Oct 22, 2019

@jeffltc I'm glad I could help!

@zhiqwang: AutoEnsembleEstimator and adanet.Estimator are very similar.

AutoEnsembleEstimator is a thin wrapper around adanet.Estimator that converts tf.estimator.Estimator instances into adanet.subnetwork.Builders for adanet.Estimator to train and combine into ensembles.

If you already have a tf.estimator.Estimator you want to ensemble, you should use AutoEnsembleEstimator. But if you want more control, or want to do something more sophisticated with the TensorFlow graph, you can use adanet.Estimator.
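
For context, here is roughly what adanet.Estimator expects you to write yourself: an adanet.subnetwork.Builder, sketched after the adanet tutorials. The class name, layer width, and optimizer below are illustrative placeholders, and the exact fields vary by adanet version:

# A minimal adanet.subnetwork.Builder sketch, modeled on the adanet
# tutorials. AutoEnsembleEstimator generates builders like this from
# plain tf.estimator.Estimators so you don't have to.
import adanet
import tensorflow as tf


class SimpleDNNBuilder(adanet.subnetwork.Builder):
  """Builds a one-hidden-layer DNN subnetwork (illustrative sizes)."""

  def build_subnetwork(self, features, logits_dimension, training,
                       iteration_step, summary, previous_ensemble=None):
    # Stack the scalar feature tensors into a [batch, num_features] matrix.
    inputs = tf.concat(
        [tf.reshape(v, [-1, 1]) for v in features.values()], axis=1)
    hidden = tf.keras.layers.Dense(64, activation='relu')(inputs)
    logits = tf.keras.layers.Dense(logits_dimension)(hidden)
    return adanet.Subnetwork(
        last_layer=hidden,
        logits=logits,
        complexity=tf.constant(1.0),  # feeds AdaNet's complexity penalty
        persisted_tensors={})

  def build_subnetwork_train_op(self, subnetwork, loss, var_list, labels,
                                iteration_step, summary,
                                previous_ensemble=None):
    # Defines how this subnetwork trains during its AdaNet iteration.
    optimizer = tf.compat.v1.train.AdamOptimizer(learning_rate=0.001)
    return optimizer.minimize(loss, var_list=var_list)

  @property
  def name(self):
    return 'simple_dnn'

A list of such builders is returned from an adanet.subnetwork.Generator and passed to adanet.Estimator as subnetwork_generator; that plumbing is exactly what AutoEnsembleEstimator handles for you.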

cweill commented on Oct 23, 2019