Work Estimate / Blockers for References (gradient accumulation and convergence understanding)
Reference owners, please update the yellow cells with a work estimate (in weeks/days) or the blocking issue that needs to be resolved: https://docs.google.com/spreadsheets/d/1W8L8SBIrgbJ_f_-2hUt8SqLNkzAvsKNkQ0A6pKWz9_8/edit#gid=0
SWG:
Will request that reference owners add status updates to the spreadsheet next week.
We had requested a status update in this spreadsheet: https://docs.google.com/spreadsheets/d/1W8L8SBIrgbJ_f_-2hUt8SqLNkzAvsKNkQ0A6pKWz9_8/edit#gid=0
We will touch base next week:
- Does it need gradient accumulation?
- Status on adding gradient accumulation?
- Convergence Curve - https://drive.google.com/drive/u/0/folders/1sDmlkLyehFcQWEEW8IhQUbLafaPhTE-9
Convergence Curves: Run 2x the number of runs required for submission, spread across batch sizes from the historical minimum submitted batch size to the historical maximum submitted batch size, stepping in powers of 2 starting at the minimum and going to the maximum.
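One possible reading of the convergence-curve plan, sketched in Python. This is an illustration, not an agreed procedure: it assumes "powers of 2" means doubling the batch size from the minimum until the maximum is reached, that the maximum is always included even if it is not an exact doubling, and that the runs are spread as evenly as possible. The function names and the example numbers (5 required runs, batch sizes 256 to 4096) are hypothetical.

```python
def batch_size_sweep(min_bs, max_bs):
    """Batch sizes doubling from min_bs, ending at max_bs (always included)."""
    if min_bs <= 0 or max_bs < min_bs:
        raise ValueError("expected 0 < min_bs <= max_bs")
    sizes = []
    bs = min_bs
    while bs < max_bs:
        sizes.append(bs)
        bs *= 2
    sizes.append(max_bs)  # assumption: the max submitted batch size is always run
    return sizes


def spread_runs(total_runs, sizes):
    """Distribute total_runs across sizes as evenly as possible.

    Any remainder goes to the smallest batch sizes first (an arbitrary choice).
    """
    base, extra = divmod(total_runs, len(sizes))
    return {bs: base + (1 if i < extra else 0) for i, bs in enumerate(sizes)}


# Hypothetical example values, not taken from any benchmark.
required_runs = 5
sizes = batch_size_sweep(256, 4096)
plan = spread_runs(2 * required_runs, sizes)  # 2x the required runs
for bs, n in plan.items():
    print(f"batch size {bs}: {n} runs")
```

If the historical maximum is not the minimum times a power of 2, the last step in this sketch is shorter than a doubling; owners may prefer to round instead.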
In addition to gradient accumulation and convergence curves, we also need to update logging to the latest v0.7 (or v1.0?) spec. I've added a column in the status spreadsheet for this.
From this week's meeting:
- Tracking spreadsheet: https://docs.google.com/spreadsheets/d/1W8L8SBIrgbJ_f_-2hUt8SqLNkzAvsKNkQ0A6pKWz9_8/edit?usp=sharing
- Reminded group that we plan to freeze on 1/22
- Review existing pull requests (https://github.com/mlcommons/training/pulls); all are to be resolved in the next 2 weeks
- If you lack permissions to contribute, please update
- We should make a label to indicate which PRs are going to impact the references
- Follow up over email on status of Minigo
- In-progress work for every reference.
- New action item for all reference owners to fix logging. Not needed by the freeze, but we want it shortly after. [AI JohnT] Email to be sent to owners. (sent on 1/7/21)
- Want convergence curves by freeze deadline. Let others know if you need help with this.