webref icon indicating copy to clipboard operation
webref copied to clipboard

Look for duplicate exported dfns

Open tidoust opened this issue 4 years ago • 10 comments

Some specs define and export terms that are defined elsewhere, see e.g. discussion in https://github.com/w3c/respec/issues/3687

There may be valid cases where this is needed. It would be good to first look at exported dfns and list terms that are defined more than once.

tidoust avatar Jul 15 '21 09:07 tidoust

See also https://github.com/w3c/aria/pull/1510

marcoscaceres avatar Jul 15 '21 09:07 marcoscaceres

Here is a first dump of exported terms defined in more than one spec. This is the result of parsing the dfns extracts with the following rules:

  1. Only consider exported dfns (dfn.access === 'public')
  2. Only consider root dfns (dfn.for.length === 0)
  3. Consider that dfns in CSS modules override the same dfns in CSS 2 and SVG 2
  4. Consider that css-align dfns override those in css-flexbox
  5. Consider that css-position dfns override those in css-logical
  6. Ignore duplicate IDL terms, we know about them and fix them already through IDL patches (although note we don't yet fix them in dfns extracts)
  7. Ignore duplicate elements in HTML and SVG 2 (the a, link, style, title elements)

These rules take care of most but not all of the duplicate CSS properties identified in https://github.com/w3c/webref/issues/127#issuecomment-874135280.

Looking at the results, there are a number of cases where the duplicates are warranted (e.g. the notion of attribute in DOM is different from the notion of attribute in Web IDL). There are cases where specs need fixing (e.g. wai-aria should reference Infra for user agent, conformant server should not be exported, Trusted Types has a few weird constructs). And there are cases that are questionable (e.g. the notion of character? Why does Page Lifecycle re-define discard, is it an extended dfn?)

Duplicate dfns with same type (27 found)
Duplicate dfns with different types (54 found)

tidoust avatar Jul 18 '21 16:07 tidoust

in the list of same name, different types, there were 6 were the duplicate came from the same spec (Trusted Types), due to a bug in the markup - PR filed at https://github.com/w3c/webappsec-trusted-types/pull/348

re "character" in the first list, it's interesting that the visible term in Infra is actually "code point", with character only provided as an alternate linking text - maybe one that should be deprecated given the ambiguity of the term.

~~I'm surprised the 2nd list is so short; for instance, width is also used in HTML, SVG, as a mediastreamtrack constraint, etc.~~ I guess these are filtered out by the "root-only dfns" rule.

dontcallmedom avatar Aug 05 '21 08:08 dontcallmedom

Note code that produces the previous report is in a dedicated Reffy branch: https://github.com/w3c/reffy/blob/study-dfns/src/cli/study-dfns.js

tidoust avatar Aug 13 '21 12:08 tidoust

I leave a note here but please tell me if you prefer that I open a dedicated issue.

<rgb()> and <hsl()> are defined with a legacy syntax that @webref/css does not extract. The legacy syntax defines a comma separated list of arguments while the modern syntax defines a whitespace/slash separated list. The legacy syntax will never be removed, I presume, and I would bet many CSS authors use the legacy syntax instead of the new one.

<!-- Markup for the modern syntax -->
<pre class="prod highlight">
    <dfn class="dfn-paneled" data-dfn-type="function" data-export="" id="funcdef-hsl">
        <c- nf="">hsl</c-><c- p="">()</c->
    </dfn>
    <!-- ... -->
</pre>
<!-- Markup for the legacy syntax -->
<pre class="prod highlight">
    <c- nf="">hsl</c-><c- p="">()</c->
    <!-- ... -->
</pre>

I'm not sure what to suggest and where to suggest it. I believe that Bikeshed should allow defining a legacy syntax that @webref/css could assign to an additional legacy for an entry in valuespace, but to my knowledge, these are the only types that have two value definitions defined in the same spec.

cdoublev avatar Apr 19 '22 10:04 cdoublev

Thanks @cdoublev, I moved your comment to a dedicated issue #563.

tidoust avatar Apr 19 '22 16:04 tidoust

Refreshing the list of duplicate definitions and completing with the list of IDL terms that are defined more than once (before patches are applied):

Duplicate dfns with same type (64 found)
Duplicate dfns with different types (64 found)
Duplicate IDL dfns (11 found)

tidoust avatar Dec 16 '22 12:12 tidoust

Reflecting on the duplicate IDL dfns:

  • DOMParser: definition in Trusted Types appears in an IDL block flagged with an "exclude" class. Definitions should probably not be extracted at all.
  • MessageEventSource, PermissionState: handled in IDL patch. Could be hardcoded in Reffy.
  • MutationEvent: legacy, defined as removed in DOM, and deprecated in UI Events. Could be hardcoded in Reffy.
  • RTCStats: definition in WebRTC Stats is informative (but exported).
  • EvalError, RangeError, ReferenceError, SyntaxError, TypeError, URIError: Shouldn't Web IDL reference definitions in ECMAScript rather than define the terms? Or do we want to prefer Web IDL definitions because they are closer to relevant algorithms for Web specs?

tidoust avatar Dec 16 '22 13:12 tidoust

RTCStats: definition in WebRTC Stats is informative (but exported).

Marking the definition as non-exported should be an easily accepted patch there

EvalError, RangeError, ReferenceError, SyntaxError, TypeError, URIError: Shouldn't Web IDL reference definitions in ECMAScript rather than define the terms? Or do we want to prefer Web IDL definitions because they are closer to relevant algorithms for Web specs?

From the discussion in https://github.com/w3c/reffy/pull/732, I believe it was indeed preferred to keep WebIDL as an intermediary, although I'm not sure how deeply discussed this was.

dontcallmedom avatar Dec 16 '22 16:12 dontcallmedom

Duplicate dfns with same type

Some of these are different concepts altogether of terms that happen to be the same across specs (e.g. an attribute in DOM is quite different from an attribute in WebIDL); some of these are for concepts that are pretty close but not fully equivalent definitions (e.g. surrogate in Infra & I18N glossary); some of these are explicitly duplicate definitions that happen to be repeated for mostly logistical / editorial reasons (e.g. the overlap between rdf concepts & rdf semantics; the duplicates across CSS modules).

I think these 3 categories probably lead to different type of solutions:

  • for the 1st one, namespacing with a dfn-for the definitions would be an obvious candidate - not sure if that would work across the board though
  • for the 2nd one, trying to get convergence on a single definition in a single place would be the ideal approach; if not possible, trying to encourage using different terms would be an alternative
  • for the 3rd one, we would probably want a mechanism to describe this equivalence relationship - this would allow in ReSpec and Bikeshed to avoid the duplicate warning; in ReSpec, this would allow using either specs in the xref setting and still be able to use the relevant definition

dontcallmedom avatar Dec 16 '22 16:12 dontcallmedom