open_clip
Add synthetic + real data benchmarking
Adding benchmarking functionality through `benchmark.py`. It may be called identically to `main.py`. It includes 2 new flags:
- `--synthetic-data`, a boolean flag that allows a user to forgo data loading
- `--bench-steps`, an integer representing the number of steps to simulate
The only limitation is that this does NOT support crossing an epoch boundary, so a real-data benchmark must cover strictly less than one full pass over the dataset.
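For reviewers who want to see the shape of the two flags, here is a minimal sketch of how they could be declared with `argparse`. The function name `build_bench_parser`, the default of 100 steps, and the help strings are assumptions for illustration and are not taken from `benchmark.py`.

```python
import argparse


def build_bench_parser():
    """Hypothetical parser holding only the two benchmarking flags."""
    parser = argparse.ArgumentParser(description="Benchmarking flags (sketch)")
    # Boolean flag: when present, skip real data loading entirely.
    parser.add_argument(
        "--synthetic-data",
        action="store_true",
        default=False,
        help="Use synthetic inputs instead of loading a dataset.",
    )
    # Integer flag: number of steps to simulate (default is an assumption).
    parser.add_argument(
        "--bench-steps",
        type=int,
        default=100,
        help="Number of steps to run for the benchmark.",
    )
    return parser


if __name__ == "__main__":
    args = build_bench_parser().parse_args()
    print(args.synthetic_data, args.bench_steps)
```

Note that `argparse` exposes the flags as `args.synthetic_data` and `args.bench_steps`, with dashes converted to underscores.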
Added a flag `--bench-warmup` to exclude the first few steps from timing so the system can reach a steady state. In addition, I realized that the cost of any epoch boundary crossing would likely be amortized and become negligible over time, so I have included support for benchmarks of any length.
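A rough sketch of the warmup idea, assuming a hypothetical `benchmark` helper that is not the code in this PR: the first `bench_warmup` steps run but are not recorded, and the batch source is cycled so a run can be longer than one pass over the data.

```python
import time
from itertools import cycle, islice


def benchmark(step_fn, batches, bench_steps, bench_warmup=0):
    """Return per-step timings for bench_steps steps, after bench_warmup
    untimed steps. All names here are illustrative assumptions."""
    timings = []
    total_steps = bench_warmup + bench_steps
    # cycle() restarts the batch sequence when it is exhausted, which stands
    # in for crossing an epoch boundary. It caches the items it has seen, so
    # a real data loader would more likely be re-created each epoch instead.
    for i, batch in enumerate(islice(cycle(batches), total_steps)):
        start = time.perf_counter()
        step_fn(batch)
        # On a GPU, the device would need to be synchronized before reading
        # the clock; that is omitted in this CPU-only sketch.
        elapsed = time.perf_counter() - start
        if i >= bench_warmup:
            timings.append(elapsed)
    return timings


if __name__ == "__main__":
    # Toy step function standing in for a training step.
    results = benchmark(lambda b: sum(b), [[1, 2], [3, 4]], bench_steps=5, bench_warmup=2)
    print(f"mean step time: {sum(results) / len(results):.6f}s")
```

Recording per-step timings rather than a single total makes it easy to check whether the warmup count is large enough for the timings to settle.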
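There is a lot of duplication with respect to `main.py`, and the duplicated code would need to be updated whenever `main.py` changes. It would be best not to duplicate. Can we either put this in `main.py`, or factor the shared code into functions and reuse them in both places?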