[Question] About NASNet-A used as a subnetwork in AdaNet
@cweill
Hi Weill,
I saw an article on the Google AI blog: https://ai.googleblog.com/2018/10/introducing-adanet-fast-and-flexible.html
The article mentions using NASNet-A as the subnetwork and reaching an error rate of 2.3% on CIFAR-10 after 8 AdaNet iterations, with fewer parameters at the same time. I would like to ask two questions.
1. Do you use the entire NASNet-A architecture (for example, N=6 and F=32) as the subnetwork, use just the normal cell as the subnetwork, or something else?
2. How are the subnetworks combined with each other?
Thanks!
@tianxh1212: With the following code and settings you should be able to get the same results: https://github.com/tensorflow/adanet/blob/master/adanet/examples/nasnet.py#L181
- We used N=6, F=32 in the config. A single subnetwork with those settings should have 3.3M parameters. Also, I think we disabled `use_aux_head`.
- We used the Estimator `force_grow=True` setting.
- We used `SCALAR` mixture weights, `use_bias=False`, `max_iteration_steps=1000000`, and all the other Estimator settings at their defaults. So we simply took the average of the subnetworks' outputs at each iteration. We also ran for 10 iterations. (A configuration sketch follows this list.)
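For concreteness, here is a minimal sketch of what those settings look like on an `adanet.Estimator`. This is an illustration under assumptions, not the exact research configuration: `nasnet_generator` is a placeholder for the NASNet-A (N=6, F=32) generator in the linked `nasnet.py`, and the head is a plausible choice for CIFAR-10.

```python
import adanet
import tensorflow as tf

# Placeholder for an adanet.subnetwork.Generator that builds NASNet-A
# (N=6, F=32) subnetworks; see the linked nasnet.py for the real one.
nasnet_generator = ...

estimator = adanet.Estimator(
    head=tf.contrib.estimator.multi_class_head(n_classes=10),  # CIFAR-10
    subnetwork_generator=nasnet_generator,
    max_iteration_steps=1000000,  # train steps per AdaNet iteration
    mixture_weight_type=adanet.MixtureWeightType.SCALAR,
    use_bias=False,
    force_grow=True)  # add a new subnetwork every iteration

# Running for 10 iterations means 10 * max_iteration_steps total train steps:
# estimator.train(input_fn=train_input_fn, max_steps=10 * 1000000)
```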
@cweill Hi Weill, I got an error like this: `ValueError: 'generator' yielded an element of shape (1,) where an element of shape () was expected`. Note that:
- The input data I used is the CIFAR-10 dataset, preprocessed and defined in an `input_fn` like the one in `adanet/adanet/examples/tutorials/customizing_adanet.ipynb`. For example: `dataset = tf.data.Dataset.from_generator(generator(x_train, y_train), (tf.float32, tf.int32), ((32, 32, 3), ()))`
- I am using TensorFlow 1.12 and Python 3.6.
Solution: I resolved the issue by setting:
- `dataset = tf.data.Dataset.from_generator(generator(x_train, y_train), (tf.float32, tf.int32), ((32, 32, 3), 1))`

(A minimal sketch of the shape mismatch is shown after this list.)
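For anyone hitting the same error, here is a minimal sketch of the mismatch, with hypothetical toy arrays standing in for the real preprocessed CIFAR-10 data:

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy arrays standing in for the preprocessed CIFAR-10 data.
x_train = np.zeros((8, 32, 32, 3), dtype=np.float32)
y_train = np.zeros((8, 1), dtype=np.int32)  # each label has shape (1,), not ()

def generator(images, labels):
  def _gen():
    for image, label in zip(images, labels):
      yield image, label  # the label is a shape-(1,) array here
  return _gen

# Declaring the label shape as () while the generator yields shape-(1,)
# labels raises the ValueError above. Declaring (1,) (the plain `1` in the
# fix is equivalent) makes the declared and actual shapes agree; squeezing
# the label to a scalar inside the generator would also work.
dataset = tf.data.Dataset.from_generator(
    generator(x_train, y_train),
    (tf.float32, tf.int32),
    ((32, 32, 3), (1,)))
```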
@SmallyolkLiu: Have a look at our research code that uses NASNet in AdaNet. It shows you how to get it working on Google Cloud ML Engine.
Hi @cweill. Thanks for all the great work. Did you apply any data augmentation? In your research code, I see that you apply basic augmentation (flip + crop) to the input images. Did you do the same for the results reported in the blog? Thanks!
@tl-yang: Please see the details in our recent paper: https://arxiv.org/abs/1903.06236