Aber Hu
Hi, thanks for your attention to our repo. Originally, for convenience, we defined the variable "log_alphas" as the log probability distribution over operations. After each architecture optimization step, this definition...
@touchdreamer The width search only occurs on `depth_conv`. The output of `depth_conv` is the input to `point_linear`, and the shape of convolutional weights in PyTorch is (C_out, C_in/groups, k_h, k_w)....
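To make the weight-slicing implication concrete, here is a minimal sketch (the channel counts and slicing code are illustrative, not the repo's actual implementation): because the depthwise weight has shape (C_mid, 1, k, k) and the pointwise weight has shape (C_out, C_mid, 1, 1), choosing a smaller width `c` means slicing dim 0 of `depth_conv` and dim 1 of `point_linear`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical channel counts for illustration only.
C_mid_max, C_out = 64, 32
depth_conv = nn.Conv2d(C_mid_max, C_mid_max, 3, padding=1, groups=C_mid_max)
point_linear = nn.Conv2d(C_mid_max, C_out, 1)

c = 48  # an active width choice, c <= C_mid_max
x = torch.randn(2, c, 8, 8)

# Depthwise weight shape is (C_mid_max, 1, 3, 3): slicing the first c filters
# keeps the convolution depthwise for c channels.
w_d = depth_conv.weight[:c]
b_d = depth_conv.bias[:c] if depth_conv.bias is not None else None
y = F.conv2d(x, w_d, b_d, padding=1, groups=c)

# Pointwise weight shape is (C_out, C_mid_max, 1, 1): slice its *input*-channel
# dimension (dim 1) to match the sliced depthwise output.
w_p = point_linear.weight[:, :c]
z = F.conv2d(y, w_p, point_linear.bias)
print(z.shape)  # torch.Size([2, 32, 8, 8])
```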
Hi, please refer to [make_lat_lut_example.py](https://github.com/AberHu/TF-NAS/blob/master/latency_pkl/make_lat_lut_example.py). It is an example script for building latency lookup tables. Due to the fine-grained width search, we enumerate all possible width choices, which is time...
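For readers who want the general idea without the full script, a minimal sketch of building such a lookup table follows (the layer config, key format, and timing parameters here are made up for illustration; see the linked script for the actual procedure):

```python
import pickle
import time

import torch
import torch.nn as nn


def measure_latency_ms(module, x, n_warmup=10, n_runs=50):
    # Average forward latency in milliseconds (CPU timing for simplicity).
    module.eval()
    with torch.no_grad():
        for _ in range(n_warmup):
            module(x)
        start = time.time()
        for _ in range(n_runs):
            module(x)
    return (time.time() - start) / n_runs * 1000.0


lut = {}
# Enumerate candidate widths for one hypothetical 3x3 conv stage; the real
# tables enumerate every fine-grained width choice of every block.
for c_mid in range(16, 65, 8):
    block = nn.Conv2d(3, c_mid, 3, padding=1)
    x = torch.randn(1, 3, 32, 32)
    lut[("conv3x3", 3, c_mid, 32)] = measure_latency_ms(block, x)

with open("lat_lut_example.pkl", "wb") as f:
    pickle.dump(lut, f)
```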
Sorry for the late reply. From my perspective, different KD losses are suitable for different tasks. For classification, the original KD (soft target) is fine, because it can be treated a...
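For reference, the soft-target KD loss mentioned above is commonly implemented as a temperature-softened KL divergence (this is a generic sketch of Hinton-style KD, not code from this repo; the temperature value is an arbitrary choice):

```python
import torch
import torch.nn.functional as F


def kd_loss(student_logits, teacher_logits, T=4.0):
    # Soft-target KD: KL divergence between temperature-softened distributions,
    # scaled by T^2 so gradient magnitudes stay comparable across temperatures.
    log_p = F.log_softmax(student_logits / T, dim=1)
    q = F.softmax(teacher_logits / T, dim=1)
    return F.kl_div(log_p, q, reduction="batchmean") * (T * T)


student = torch.randn(8, 100)
teacher = torch.randn(8, 100)
loss = kd_loss(student, teacher)
```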