Jason Ansel
Yup, that worked. Thanks!
I'm also noticing very low CPU utilization, usually only 1 thread active. Did something change in how we set up threading?
``` ./torchbench.py --nothing -n100 -k alexnet ```
For those who haven't heard of [TorchDynamo/TorchInductor](https://github.com/pytorch/torchdynamo), it automatically fuses PyTorch programs and maps them to [Triton](https://github.com/openai/triton).
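For context, a rough sketch of how one might invoke it; the entry point differs by version (older standalone releases used `torchdynamo.optimize("inductor")`, while recent PyTorch exposes it as `torch.compile`):
```python
import torch

# A toy model; TorchDynamo captures its graph and TorchInductor lowers the
# fused operations to Triton kernels when running on GPU.
model = torch.nn.Sequential(
    torch.nn.Linear(64, 64),
    torch.nn.ReLU(),
)

compiled = torch.compile(model)  # uses the TorchInductor backend by default
out = compiled(torch.randn(8, 64))
```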
You need to measure the size of the program, something like: `Result(time=os.stat(output_dir).st_size)`. There are also more complex objectives; see the petabricks example for `ThresholdAccuracyMinimizeTime`.
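As a minimal sketch of that idea, here is a `MeasurementInterface` that reports binary size through `Result(time=...)`; the `test.c` source, `./tmp.bin` output path, and `opt_level` parameter are hypothetical placeholders:
```python
import os

import opentuner
from opentuner import (ConfigurationManipulator, IntegerParameter,
                       MeasurementInterface, Result)


class SizeTuner(MeasurementInterface):
  def manipulator(self):
    # Hypothetical search space: a single integer optimization level.
    m = ConfigurationManipulator()
    m.add_parameter(IntegerParameter('opt_level', 0, 3))
    return m

  def run(self, desired_result, input, limit):
    cfg = desired_result.configuration.data
    output_bin = './tmp.bin'  # hypothetical output path
    self.call_program('gcc -O{0} test.c -o {1}'.format(cfg['opt_level'],
                                                       output_bin))
    # Report the size of the generated program as the value to minimize.
    return Result(time=os.stat(output_bin).st_size)


if __name__ == '__main__':
  SizeTuner.main(opentuner.default_argparser().parse_args())
```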
Multiple objectives allow you to do things like "find the fastest program under 1MB in size". You can define a custom objective to optimize the criteria you care about.
Examples of custom objectives are here: https://github.com/jansel/opentuner/blob/master/opentuner/search/objective.py#L212
That paper uses http://ctuning.org/
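As a hedged sketch of plugging in one of those objectives (following the petabricks example), a tuner can override `objective()` on its `MeasurementInterface`; the 0.95 accuracy target is an assumption here, and run results would then need to fill in both fields, e.g. `Result(time=..., accuracy=...)`:
```python
from opentuner import MeasurementInterface
from opentuner.search.objective import ThresholdAccuracyMinimizeTime


class AccuracyAwareTuner(MeasurementInterface):
  def objective(self):
    # Only configurations reaching the accuracy target are acceptable;
    # among those, minimize time.
    return ThresholdAccuracyMinimizeTime(0.95)
```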
This almost looks like it didn't run for long enough, since `self.search_driver.best_result` is None. How many iterations is this?
It depends on the search algorithm; many require some warmup iterations. If you set iterations to 100 or 1000, does it fix the problem? You could make this error go...
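As a sketch, assuming OpenTuner's standard argparser flags (`--test-limit` to cap the number of tests, `--stop-after` to cap wall time in seconds), the iteration count can be raised like this:
```python
import opentuner

parser = opentuner.default_argparser()
# Equivalent to passing --test-limit=1000 on the command line.
args = parser.parse_args(['--test-limit', '1000'])
# SizeTuner.main(args)  # hypothetical MeasurementInterface subclass, as sketched above
```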
It is for debugging/analysis. It prints out the top 20 flags most commonly used in the final (best) configurations over past training runs. So if you train a single program...