Aber Hu
Hi, thanks for your attention to our repo. Originally, for convenience, we defined the variable "log_alphas" as the log probability distribution over operations. After each architecture optimization step, this definition...
@touchdreamer The width search only occurs on `depth_conv`. The output of `depth_conv` is the input to `point_linear`, and the shape of convolutional weights in PyTorch is (C_out, C_in/groups, k_h, k_w)....
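To make the weight-slicing implication concrete, here is a minimal sketch (the channel counts and slicing code are illustrative, not the repo's actual implementation): because the depthwise weight has shape (C_mid, 1, k, k) and the pointwise weight has shape (C_out, C_mid, 1, 1), choosing a smaller width `c` means slicing dim 0 of `depth_conv` and dim 1 of `point_linear`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical channel counts for illustration only.
C_mid_max, C_out = 64, 32
depth_conv = nn.Conv2d(C_mid_max, C_mid_max, 3, padding=1, groups=C_mid_max)
point_linear = nn.Conv2d(C_mid_max, C_out, 1)

c = 48  # an active width choice, c <= C_mid_max
x = torch.randn(2, c, 8, 8)

# Depthwise weight shape is (C_mid_max, 1, 3, 3): slicing the first c filters
# keeps the convolution depthwise for c channels.
w_d = depth_conv.weight[:c]
b_d = depth_conv.bias[:c] if depth_conv.bias is not None else None
y = F.conv2d(x, w_d, b_d, padding=1, groups=c)

# Pointwise weight shape is (C_out, C_mid_max, 1, 1): slice its *input*-channel
# dimension (dim 1) to match the sliced depthwise output.
w_p = point_linear.weight[:, :c]
z = F.conv2d(y, w_p, point_linear.bias)
print(z.shape)  # torch.Size([2, 32, 8, 8])
```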
Hi, please refer to [make_lat_lut_example.py](https://github.com/AberHu/TF-NAS/blob/master/latency_pkl/make_lat_lut_example.py). It is an example script for building latency lookup tables. Due to the fine-grained width search, we enumerate all possible width choices, which is time...
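For readers who want the general idea without the full script, a minimal sketch of building such a lookup table follows (the layer config, key format, and timing parameters here are made up for illustration; see the linked script for the actual procedure):

```python
import pickle
import time

import torch
import torch.nn as nn


def measure_latency_ms(module, x, n_warmup=10, n_runs=50):
    # Average forward latency in milliseconds (CPU timing for simplicity).
    module.eval()
    with torch.no_grad():
        for _ in range(n_warmup):
            module(x)
        start = time.time()
        for _ in range(n_runs):
            module(x)
    return (time.time() - start) / n_runs * 1000.0


lut = {}
# Enumerate candidate widths for one hypothetical 3x3 conv stage; the real
# tables enumerate every fine-grained width choice of every block.
for c_mid in range(16, 65, 8):
    block = nn.Conv2d(3, c_mid, 3, padding=1)
    x = torch.randn(1, 3, 32, 32)
    lut[("conv3x3", 3, c_mid, 32)] = measure_latency_ms(block, x)

with open("lat_lut_example.pkl", "wb") as f:
    pickle.dump(lut, f)
```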
Sorry for the late reply. From my perspective, different KD losses are suitable for different tasks. For classification, the original KD (soft target) is fine, because it can be treated a...
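For reference, the soft-target KD loss mentioned above is commonly implemented as a temperature-softened KL divergence (this is a generic sketch of Hinton-style KD, not code from this repo; the temperature value is an arbitrary choice):

```python
import torch
import torch.nn.functional as F


def kd_loss(student_logits, teacher_logits, T=4.0):
    # Soft-target KD: KL divergence between temperature-softened distributions,
    # scaled by T^2 so gradient magnitudes stay comparable across temperatures.
    log_p = F.log_softmax(student_logits / T, dim=1)
    q = F.softmax(teacher_logits / T, dim=1)
    return F.kl_div(log_p, q, reduction="batchmean") * (T * T)


student = torch.randn(8, 100)
teacher = torch.randn(8, 100)
loss = kd_loss(student, teacher)
```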