Address issue #122 according to Inf WG discussion

Open DilipSequeira opened this issue 2 years ago • 23 comments

This also adds a definition of a Reproducible software component so that RDI does not become a venue for submissions that are not reproducible and can never be reproducible.

DilipSequeira avatar Sep 30 '22 20:09 DilipSequeira

Thank you @DilipSequeira for the change. I have a question though: given that Available software can be private (no public availability details are currently mandated), is it okay to require the RDI software binary to be downloadable from a public URL? I propose adding this condition to the Available category as well.

Also, if we enforce reproducible software and available hardware, we can enable audits for such systems, right?

arjunsuresh avatar Oct 01 '22 22:10 arjunsuresh

Nvidia would like to see all software being downloadable from public URLs, but I don't think it's realistic to impose that on less forthcoming submitters. Changed it to "by anyone to whom the hardware is Available."

DilipSequeira avatar Oct 03 '22 19:10 DilipSequeira

Thank you Dilip for the change. Maybe Nvidia can propose "Available to public (retail)" (both hardware and software) as a subcategory of "Available", as it is clearly different from "available to OEMs".

arjunsuresh avatar Oct 03 '22 19:10 arjunsuresh

I agree it would be good to refine the notion of Availability, but not in this PR. Maybe you could start an issue/PR for that, as a basis for working group discussion?

DilipSequeira avatar Oct 03 '22 19:10 DilipSequeira

Sure Dilip. I can do that.

arjunsuresh avatar Oct 03 '22 19:10 arjunsuresh

WG: Clarify rules around binary components. Otherwise, looks good.

tjablin avatar Oct 04 '22 16:10 tjablin

@johntran-nv @erichan1 Can you review this in Training WG?

rnaidu02 avatar Oct 18 '22 16:10 rnaidu02

@DilipSequeira Can we do a change like below?

RDI systems using only Available hardware may use a Reproducible software stack, accompanied by instructions which would allow a reasonable expert user to download and install it on a hardware system similar to the submission system.

RDI systems may not be submitted as Available components until the submission cycle after next or 221 days, whichever is longer. This restriction does not apply to RDI systems using Available hardware and a Reproducible software stack.
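
As a rough illustration of the timing rule proposed above, the following Python sketch computes the earliest date on which an RDI system could be resubmitted as Available. The function name, its parameters, and the idea of passing the date of the cycle after next as an input are assumptions made for illustration. Only the 221-day figure and the exemption come from the proposed text.

```python
from datetime import date, timedelta
from typing import Optional

RDI_WAIT_DAYS = 221  # figure taken from the proposed rule text


def earliest_available_date(
    rdi_submission: date,
    cycle_after_next: date,      # hypothetical input: date of the cycle after next
    hardware_available: bool,
    software_reproducible: bool,
) -> Optional[date]:
    """Return the earliest date for an Available resubmission.

    Returns None when the restriction does not apply, i.e. the RDI system
    uses Available hardware and a Reproducible software stack.
    """
    if hardware_available and software_reproducible:
        return None
    # "the submission cycle after next or 221 days, whichever is longer"
    return max(rdi_submission + timedelta(days=RDI_WAIT_DAYS), cycle_after_next)
```

Under this reading, the later of the two bounds always wins, so a long gap between cycles can push the date past the 221-day mark.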

arjunsuresh avatar Nov 18 '22 20:11 arjunsuresh

@arjunsuresh excellent suggestion. I've updated the PR accordingly.

@johntran-nv can training please re-review.

DilipSequeira avatar Nov 18 '22 21:11 DilipSequeira

Thank you @DilipSequeira

arjunsuresh avatar Nov 20 '22 20:11 arjunsuresh

A binary component is Reproducible if it is downloadable by anyone to whom the hardware is Available, from a URL which must remain valid until a release of the software is Available.

I think we need to define what is meant by "a release" here. I imagine a vendor might want to include some optimisations into an engineering build of their driver in the run-up to a submission deadline. Expecting that same engineering build to eventually become Available would be unreasonable - as it may not have gone through a rigorous testing process. Rather, a proper release incorporating the same changes should serve as well.

Let's take a concrete example. Suppose that a submission deadline happens to be on the 1st of February, and the announcement is planned for the 28th of February. Suppose a submitter uses an engineering build u.v.w to obtain RDI results by the deadline. Suppose they make a proper release u.v on the 27th of February. Questions:

  • Does u.v.w need to be available at a URL disclosed to the Review Committee from the 1st to the 27th? (I assume, yes.)
  • Can the URL be invalidated when u.v is released? (I assume, yes.)
  • Would the result tables need to be updated to refer to u.v instead of u.v.w? (I assume, maybe.)
  • Most importantly, should the submitter guarantee that the performance/accuracy with u.v is the same as with u.v.w? (I assume, yes, but how?)

psyhtest avatar Nov 29 '22 17:11 psyhtest

Fair point. What's meant here is "an Available release containing the optimizations" - which for practical purposes means, an available release whose performance on the benchmark is at least as good as the submitted results.

So I would say the answers to your questions are:

  • Yes, the URL must be disclosed.
  • Yes, the URL can be invalidated on release of the software to customers.
  • Ideally yes, because otherwise the submission becomes unreproducible once the URL is invalidated. I'm not sure we have a mechanism for this today.
  • Not quite - performance and accuracy must be at least as good as the submitted results, rather than the same.

I don't think it's any part of our goal here to remove the due diligence burden on submitters to ensure that the submission is reproducible in regard to performance and accuracy - just to give submitters the option to submit bleeding edge optimizations somehow. The path that minimizes overall engineering work is to plan such that your software is at least in beta on submission day.

Edit - I'm inclined to suggest that the URL must stay up until the RDI submission is replaced with an Available submission in a subsequent round, with equal or better performance (as that phrase is used in the Preview rules.)
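
A minimal sketch of that retention condition, assuming each result can be reduced to a single number and that the caller knows whether higher or lower is better for the metric. The function and parameter names are hypothetical and are not part of any MLCommons tooling.

```python
from typing import Optional


def url_may_be_retired(
    rdi_result: float,
    available_result: Optional[float],
    higher_is_better: bool = True,
) -> bool:
    """Decide whether the download URL for the RDI binary may be taken down.

    available_result is None while no Available submission has replaced
    the RDI one; in that case the URL must stay up.
    """
    if available_result is None:
        return False
    # "equal or better performance": the direction depends on the metric
    # (e.g. throughput is higher-is-better, latency is lower-is-better).
    if higher_is_better:
        return available_result >= rdi_result
    return available_result <= rdi_result
```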

DilipSequeira avatar Nov 29 '22 18:11 DilipSequeira

From a submitter point of view I think the proposal can be summarized as follows:

Suppose February 28 is the submission deadline and I have a new staging software build on February 21 that I would like to use for the submission (provided all runs go as expected). In this case I have the following options.

  1. Try to make a beta release out of the staging build within a week (before February 28). In this case I can submit my results in the Available category.
  2. Suppose my current build is not backward compatible or is very specific to a particular model. Then I have 2 options
    1. Make the software binary available to download and then do a submission under RDI category. In this case, no 221 days rule is applicable.
    2. Do not make the software binary available and do a submission under RDI category. In this case the 221 days rule is applicable.

The only issue I see here is for the last case. I'm not sure how MLCommons can ensure that the 221 days rule is followed as the software is completely closed (though hardware is available).

arjunsuresh avatar Nov 29 '22 23:11 arjunsuresh

Training WG talked about it today. We generally think this is ok. Two questions.

  1. How would the reproducible tag actually show up in the table? Just "RDI-reproducible" vs. "RDI"?
  2. What does @TheKanter think, given your view that we shouldn't complicate the categories any more for results viewers?

erichan1 avatar Dec 01 '22 16:12 erichan1

Does this need to show up in the results table? It could just be a field in the JSON.
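
For instance, a sketch of what such a field might look like, with a small check that reads it. The field names `sw_reproducible` and `sw_download_url`, the `"RDI"` status value, and the example URL are all hypothetical placeholders, not the actual submission schema.

```python
import json

# Hypothetical fragment of a system description file.
system_description = {
    "status": "RDI",                                    # illustrative value
    "sw_reproducible": True,                            # hypothetical field
    "sw_download_url": "https://example.com/sw-build",  # placeholder URL
}


def is_reproducible_rdi(desc: dict) -> bool:
    """True if the description marks an RDI system with a Reproducible stack."""
    return desc.get("status") == "RDI" and bool(desc.get("sw_reproducible", False))


print(json.dumps(system_description, indent=2))
```

Keeping the flag in the JSON would let a compliance check read it without adding a column to the results table.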

DilipSequeira avatar Dec 01 '22 17:12 DilipSequeira

RDI is already explicitly understood, and was designed, to support results that are not reproducible (e.g., enabling internal prototypes). That was its original intent. E.g., people submitting FPGA prototypes of hardware, internal-only products (e.g., Tesla systems were mentioned).

I am strongly opposed to any additional fragmentation of results. The results presentation is problematic as is already. Results need to be simpler, not more complicated. This will make things worse.

TheKanter avatar Dec 01 '22 17:12 TheKanter

I suggest we regard having a reproducible stack in RDI as a compliance issue (in the sense that you need to have complied if you're going to submit in Available the next round), and we don't typically capture those in the results table.

DilipSequeira avatar Dec 01 '22 18:12 DilipSequeira

@TheKanter @erichan1 If all the concerns are addressed, can we please merge this PR? As the rule currently stands, it is difficult to make any submission on available hardware with unreleased software, and we are having issues with it in TinyML submissions too.

arjunsuresh avatar May 31 '23 23:05 arjunsuresh

I am a bit confused. If a submission is available, then every component needs to be available. How could you have an available submission that has available hardware, but unavailable software?

TheKanter avatar May 31 '23 23:05 TheKanter

@TheKanter Exactly. So the proposal is to submit such results under the RDI category. Since the hardware is available, such submissions are exempt from the 221 days rule (if the hardware is unavailable, an RDI submission cannot be submitted under the Available category for 221 days).

arjunsuresh avatar May 31 '23 23:05 arjunsuresh

@TheKanter This has been discussed in both Training and Inference WGs. Do you have any further open questions?

nv-ananjappa avatar Jul 21 '23 18:07 nv-ananjappa

This PR was discussed in the recent chairs sync and prompted the need to review Preview and RDI rules. DavidT will be starting a proposal based on the discussion.

mrasquinha-g avatar Oct 17 '23 16:10 mrasquinha-g