policies
Address issue #122 according to Inf WG discussion
This also adds a definition of a Reproducible software component so that RDI does not become a venue for submissions that are not reproducible and can never be reproducible.
Thank you @DilipSequeira for the change. I have a question, though: when Available software can be private (no public availability details are currently mandated), is it reasonable to require an RDI software binary to be downloadable from a public URL? I propose adding this condition to the Available category as well.
Also, if we require Reproducible software and Available hardware, we can enable audits for such systems, right?
Nvidia would like to see all software being downloadable from public URLs, but I don't think it's realistic to impose that on less forthcoming submitters. Changed it to "by anyone to whom the hardware is Available."
Thank you Dilip for the change. Maybe Nvidia can propose "Available to public (retail)" (both hardware and software) as a subcategory of "Available", since it is clearly different from "available to OEMs".
I agree it would be good to refine the notion of Availability, but not in this PR. Maybe you could start an issue/PR for that, as a basis for working group discussion?
Sure Dilip. I can do that.
WG: Clarify rules around binary components. Otherwise, looks good.
@johntran-nv @erichan1 Can you review this in Training WG?
@DilipSequeira Can we do a change like below?
RDI systems using only Available hardware may use a Reproducible software stack, accompanied by instructions which would allow a reasonable expert user to download and install it on a hardware system similar to the submission system.
RDI systems may not be submitted as Available components until the submission cycle after next or 221 days, whichever is longer. This restriction is not applicable to RDI systems using Available hardware and a Reproducible software stack.
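One possible reading of the proposed waiting rule can be sketched in Python. This is only an illustration, not rule text: the function name, the list of future deadlines, and the choices to count the 221 days from the RDI submission date and to treat "the submission cycle after next" as the second deadline after that date are all assumptions.

```python
from datetime import date, timedelta
from typing import Optional, Sequence

RDI_WAIT_DAYS = 221  # figure taken from the proposed wording


def restriction_lapses_on(
    rdi_submission: date,
    future_deadlines: Sequence[date],
    hardware_available: bool,
    software_reproducible: bool,
) -> Optional[date]:
    """Return the date on which the waiting restriction lapses.

    Returns None when the proposed exemption applies (Available hardware
    plus a Reproducible software stack). All names here are hypothetical.
    """
    if hardware_available and software_reproducible:
        return None  # exempt under the proposed wording

    # Assumption: "cycle after next" is the second deadline after submission.
    upcoming = sorted(d for d in future_deadlines if d > rdi_submission)
    if len(upcoming) < 2:
        raise ValueError("need at least two future submission deadlines")
    cycle_after_next = upcoming[1]

    # Assumption: the 221 days are counted from the RDI submission date.
    by_days = rdi_submission + timedelta(days=RDI_WAIT_DAYS)

    # "Whichever is longer" means the later of the two dates.
    return max(cycle_after_next, by_days)
```

With made-up deadlines on 1 May and 1 August 2023 and an RDI submission on 1 February 2023, the 221-day limit (10 September 2023) is the later of the two dates, so it governs.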
@arjunsuresh excellent suggestion. I've updated the PR accordingly.
@johntran-nv can training please re-review.
Thank you @DilipSequeira
A binary component is Reproducible if it is downloadable by anyone to whom the hardware is Available, from a URL which must remain valid until a release of the software is Available.
I think we need to define what is meant by "a release" here. I imagine a vendor might want to include some optimisations in an engineering build of their driver in the run-up to a submission deadline. Expecting that same engineering build to eventually become Available would be unreasonable, as it may not have gone through a rigorous testing process. Rather, a proper release incorporating the same changes should serve as well.
Let's take a concrete example. Suppose that a submission deadline happens to be on the 1st of February, and the announcement is planned for the 28th of February. Suppose a submitter uses an engineering build u.v.w to obtain RDI results by the deadline. Suppose they make a proper release u.v on the 27th of February. Questions:
- Does u.v.w need to be available at a URL disclosed to the Review Committee from the 1st to the 27th? (I assume, yes.)
- Can the URL be invalidated when u.v is released? (I assume, yes.)
- Would the result tables need to be updated to refer to u.v instead of u.v.w? (I assume, maybe.)
- Most importantly, should the submitter guarantee that the performance/accuracy with u.v is the same as with u.v.w? (I assume, yes, but how?)
Fair point. What's meant here is "an Available release containing the optimizations" - which for practical purposes means, an available release whose performance on the benchmark is at least as good as the submitted results.
So I would say the answers to your questions are:
- Yes, the URL must be disclosed.
- Yes, the URL can be invalidated on release of the software to customers.
- Ideally yes, because otherwise the submission becomes unreproducible once the URL is invalidated. I'm not sure we have a mechanism for this today.
- Not quite - performance and accuracy must be at least as good as the submitted results, rather than the same.
I don't think it's any part of our goal here to remove the due diligence burden on submitters to ensure that the submission is reproducible in regard to performance and accuracy - just to give submitters the option to submit bleeding edge optimizations somehow. The path that minimizes overall engineering work is to plan such that your software is at least in beta on submission day.
Edit - I'm inclined to suggest that the URL must stay up until the RDI submission is replaced with an Available submission in a subsequent round, with equal or better performance (as that phrase is used in the Preview rules.)
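The condition discussed in this reply (the download URL stays up until there is an Available release or submission whose results are at least as good as the RDI results) can be sketched as a simple check. This is a minimal sketch under stated assumptions: the `Measurement` type and the function name are hypothetical, and both metrics are assumed to be higher-is-better, which would not hold for a latency-style metric.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Measurement:
    """Hypothetical container for one benchmark result."""
    performance: float  # assumed higher-is-better, e.g. samples/second
    accuracy: float     # assumed higher-is-better


def url_may_be_retired(
    submitted: Measurement,
    available_release: Optional[Measurement],
) -> bool:
    """True if the download URL for the RDI binary could be taken down.

    Until an Available release exists, the URL has to stay valid. Once it
    exists, its results must be at least as good as the submitted ones
    (not necessarily identical).
    """
    if available_release is None:
        return False
    return (
        available_release.performance >= submitted.performance
        and available_release.accuracy >= submitted.accuracy
    )
```

In the u.v.w / u.v example, the URL for u.v.w could be retired only after u.v has been measured and matches or beats the submitted numbers.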
From a submitter point of view I think the proposal can be summarized as follows:
Suppose February 28 is the submission deadline and I have a new staging build on February 21 which I would like to use for submission (provided all runs go as expected). In this case I have the following options.
- Try to make a beta release out of the staging build within a week (before February 28). In this case I can submit my results in the Available category.
- Suppose my current build is not backward compatible or is very specific to a particular model. Then I have 2 options:
  - Make the software binary available to download and then submit under the RDI category. In this case, the 221-day rule does not apply.
  - Do not make the software binary available and submit under the RDI category. In this case, the 221-day rule applies.
The only issue I see here is with the last case. I'm not sure how MLCommons can ensure that the 221-day rule is followed, as the software is completely closed (though the hardware is Available).
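The options summarized in this comment can be written as a small decision function. It is a sketch only: it assumes the hardware is Available in every case, and the type, function, and parameter names are hypothetical.

```python
from typing import NamedTuple


class SubmissionOption(NamedTuple):
    category: str            # "Available" or "RDI"
    wait_rule_applies: bool  # whether the 221-day rule applies


def submitter_option(
    released_before_deadline: bool,
    binary_downloadable: bool,
) -> SubmissionOption:
    """Map the submitter's situation to a category (hardware assumed Available)."""
    if released_before_deadline:
        # A beta (or better) release exists by the deadline.
        return SubmissionOption("Available", False)
    if binary_downloadable:
        # Reproducible software stack on Available hardware: exempt.
        return SubmissionOption("RDI", False)
    # Closed software: RDI, and the waiting rule applies.
    return SubmissionOption("RDI", True)
```

The last branch is the case flagged above as hard to verify, since nothing outside the submitter can inspect the closed software.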
Training WG talked about it today. We generally think this is ok. Two questions.
- How would the reproducible tag actually show up in the table? Just "RDI-reproducible" vs. "RDI"?
- What does @TheKanter think? Since you think we shouldn't complicate the categories any more for results viewers.
Does this need to show up in the results table? It could just be a field in the JSON.
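As a rough illustration of the JSON-field idea, the flag could be carried in the system description rather than in the results table. The field name `software_reproducible` and the example keys and values below are hypothetical; nothing in this thread fixes the actual schema.

```python
import json


def annotate_reproducible(system_desc: dict, reproducible: bool) -> dict:
    """Return a copy of a system description with a reproducibility flag.

    "software_reproducible" is a hypothetical field name used for illustration.
    """
    annotated = dict(system_desc)  # shallow copy; the input is left untouched
    annotated["software_reproducible"] = reproducible
    return annotated


# Illustrative input only; keys and values are made up for this sketch.
example = annotate_reproducible(
    {"system_name": "example-system", "status": "rdi"},
    reproducible=True,
)
print(json.dumps(example, indent=2))
```

Results viewers would see no new category; tooling that cares about the flag could read it from the JSON.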
RDI is already explicitly understood, and was designed, to support results that are not reproducible (e.g., enabling internal prototypes). That was its original intent. E.g., people submitting FPGA prototypes of hardware, internal-only products (e.g., Tesla systems were mentioned).
I am strongly opposed to any additional fragmentation of results. The results presentation is problematic as is already. Results need to be simpler, not more complicated. This will make things worse.
I suggest we regard having a reproducible stack in RDI as a compliance issue (in the sense that you need to have complied if you're going to submit in Available the next round), and we don't typically capture those in the results table.
@TheKanter @erichan1 If all the concerns are addressed, can we please merge this PR? As the rule currently stands, it is difficult to make any submission on Available hardware with unreleased software, and we are having issues with this in TinyML submissions too.
I am a bit confused. If a submission is Available, then every component needs to be Available. How could you have an Available submission that has Available hardware but unavailable software?
(Sent by email in reply to Arjun Suresh's comment of Wed, May 31, 2023, 4:16 PM, quoted above.)
@TheKanter Exactly. So the proposal is to submit such results under the RDI category. Since the hardware is Available, such submissions are exempt from the 221-day rule (if the hardware is unavailable, an RDI submission cannot be submitted under the Available category for 221 days).
@TheKanter This has been discussed in both Training and Inference WGs. Do you have any further open questions?
This PR was discussed in the recent chairs sync and prompted the need to review Preview and RDI rules. DavidT will be starting a proposal based on the discussion.