
Visible label is part of accessible name (2ee8b8): Expectation seems to have unintended consequences

Open kasperisager opened this issue 4 years ago • 12 comments

The expectation currently reads:

For each target element, all text nodes in the visible text content either match or are contained within the accessible name of this target element, except for characters in the text nodes used to express non-text content. Leading and trailing whitespace and difference in case sensitivity should be ignored.

According to this expectation, something like the following passes the rule:

<button aria-label="How are you">
  <span>you</span>
  <span>How</span>
  <span>are</span>
</button>

That seems a little odd 🤔 Shouldn't the rule instead be looking at the concatenation of the data of the relevant text nodes?
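
To make the difference concrete, here is a minimal sketch of the two readings. The function names are made up for illustration, and the text nodes are simplified to the three `<span>` contents (the whitespace-only nodes between the spans are left out):

```python
def passes_per_text_node(text_nodes, accessible_name):
    """Current wording: each text node on its own must be contained in the name."""
    name = accessible_name.strip().lower()
    return all(node.strip().lower() in name for node in text_nodes)


def passes_concatenated(text_nodes, accessible_name):
    """Alternative reading: the concatenated text node data must be contained in the name."""
    name = accessible_name.strip().lower()
    label = "".join(text_nodes).strip().lower()
    return label in name


# Text nodes of the button above, in document order (simplified).
nodes = ["you", "How", "are"]

print(passes_per_text_node(nodes, "How are you"))  # True: every word appears somewhere
print(passes_concatenated(nodes, "How are you"))   # False: "youhoware" is not in the name
```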

Summoning @WilcoFiers as you authored #1419.

kasperisager avatar Sep 23 '20 06:09 kasperisager

Alfa, aXe, and QualWeb all concatenate the text nodes, so I'm guessing we'll want to reflect that in the rule 🙈 That does bring up an interesting case though for code such as this:

<div role="button" aria-label="Hello world">
  <p>Hello</p><p>world</p>
</div>

That button visually renders as two separate words, Hello and world, but the concatenated text node data is Helloworld. We're currently seeing a handful of cases like this across customer sites in Siteimprove and I'm leaning towards considering them false positives. A more realistic case, which causes issues when minified, is this:

<a href="#" aria-label="Some article by John Doe">
  <h6>Some article</h6>
  <p>by John Doe</p>
</a>

When minified, the concatenated text will be Some articleby John Doe.
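
A rough sketch of why minification matters here, assuming the label is built by plain concatenation of text node data followed by whitespace collapsing (the helper name and the node lists are illustrative only):

```python
def concatenated_label(text_nodes):
    """Concatenate text node data, then collapse runs of whitespace."""
    return " ".join("".join(text_nodes).split())


name = "Some article by John Doe"

# Pretty-printed markup: whitespace-only text nodes sit between the elements.
formatted = ["\n  ", "Some article", "\n  ", "by John Doe", "\n"]
# Minified markup: the whitespace-only text nodes are gone.
minified = ["Some article", "by John Doe"]

print(concatenated_label(formatted))          # Some article by John Doe
print(concatenated_label(minified))           # Some articleby John Doe
print(concatenated_label(formatted) in name)  # True
print(concatenated_label(minified) in name)   # False
```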

kasperisager avatar Sep 23 '20 07:09 kasperisager

I tried to dig up the use case for this, but I can't find it. I remember why we did it though. If we're concatenating, we need to make the assumption that the text will be part of the same piece of text, and that it isn't rearranged with CSS in some way to appear in a different order.

WilcoFiers avatar Nov 20 '20 15:11 WilcoFiers

Example 2 of Failure technique F96 seems to close the case of adding text in the middle of a label to create an accessible name (which, as far as I understand, is the shopping cart example @WilcoFiers mentioned during the call):

A download link reads "Download specification" but there is invisible link text so that the accessible name of that link is "Download gizmo specification". While the visible label text is contained in the accessible name, there is no string match which may prevent the link from being activated by speech input. <a href="#">Download <span class="accessibly-hidden">gizmo</span> specification</a>

Jym77 avatar Nov 25 '21 10:11 Jym77

I have some ideas on this. Maybe I could volunteer. It seems to me that this rule needs a normalization algorithm, and to run it on both the label and the name, then do a substring check. Something like this:

To normalize a label or name:

  • Concatenate all text nodes
    • (Do this only for the label, not the name.)
    • For each HTML element start/stop, insert a space.
  • Replace each non-text character (eg. punctuation, emoji) with a space
    • Judgement of "non-text" can't be fully automated eg. "X" for "close", "+" for "zoom in"
  • Insert a space before and after each digit
  • Replace each run of multiple spaces with a single space
    • i.e. do a regex replacement like s/ +/ /g

Then do the check "is the normalized label a substring of the normalized name"?
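
A minimal sketch of the steps above, using only the standard library. Several things here are assumptions: "non-text character" is approximated as anything that is neither a word character nor whitespace, leading and trailing spaces are stripped before the substring check, and the class and function names are invented for illustration.

```python
import re
from html.parser import HTMLParser


class LabelText(HTMLParser):
    """Collects text nodes, inserting a space at every element start/stop."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        self.parts.append(" ")

    def handle_endtag(self, tag):
        self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def label_text(html):
    parser = LabelText()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


def normalize(text):
    text = re.sub(r"[^\w\s]", " ", text)   # punctuation, emoji, etc. become spaces
    text = re.sub(r"(\d)", r" \1 ", text)  # a space before and after each digit
    return re.sub(r" +", " ", text).strip()


def passes(html, accessible_name):
    return normalize(label_text(html)) in normalize(accessible_name)


minified = '<a href="#" aria-label="Some article by John Doe"><h6>Some article</h6><p>by John Doe</p></a>'
print(passes(minified, "Some article by John Doe"))  # True
```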

I'll have to check how the above algorithm behaves on all the cases in https://github.com/act-rules/act-rules.github.io/issues/1615. I think that this algorithm will do okay and err on the side of 'no false positives'. There are cases there which will fail the rule according to this algorithm, and which speech-to-text accepts. That might be unavoidable.

dan-tripp-siteimprove avatar Mar 23 '23 14:03 dan-tripp-siteimprove

I think that this should globally work.

  • For each HTML element start/stop, insert a space.

This might actually depend on its display or something like that. <span>He</span><span>llo</span> should not have extra spaces. Which ends up being a tricky problem, also for accessible name computation: https://github.com/w3c/accname/issues/15 🙈

Jym77 avatar Mar 23 '23 14:03 Jym77

That's an interesting discussion over there at the W3C. Based on that, how about something like this:

Let 'label' be the inner text of the target element as per the innerText algorithm. ('label' is a string.) Let 'name' be the accessible name. ('name' is also a string.)

To normalize a string:

  • Replace each non-text character (eg. punctuation, emoji) with a space
    • Judgement of "non-text" can't be fully automated eg. "X" for "close", "+" for "zoom in"
  • Insert a space before and after each digit
  • Replace each run of whitespace (of all kinds) with a single space character
    • i.e. do a regex replacement like s/\s+/ /g

Then do the check "is the normalized 'label' a substring of the normalized 'name'"?

dan-tripp-siteimprove avatar Mar 27 '23 22:03 dan-tripp-siteimprove

Here's an updated idea for an algorithm (to deal with this case: <a href="#" aria-label="Discover Italy">Discover it</a>):

Let 'label' be the inner text of the target element as per the innerText algorithm. ('label' is a string.) Let 'name' be the accessible name. ('name' is also a string.)

Algorithm to tokenize a string:

  • Replace each non-text character (eg. punctuation, emoji) with a space
    • Judgement of "non-text" can't be fully automated eg. "X" for "close", "+" for "zoom in"
  • Insert a space before and after each digit
  • Split the string into a list of strings, using a whitespace regex as the separator.

Then do the check: is the tokenized 'label' a sublist of the tokenized 'name'?
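
Roughly, the difference between the substring check and the sublist check could look like the sketch below. The names `tokenize` and `is_sublist` are made up, "non-text character" is approximated as anything that is neither a word character nor whitespace, and both strings are lower-cased at the call site because the rule ignores case.

```python
import re


def tokenize(text):
    text = re.sub(r"[^\w\s]", " ", text)   # punctuation, emoji, etc. become spaces
    text = re.sub(r"(\d)", r" \1 ", text)  # a space before and after each digit
    return text.split()                    # split on runs of whitespace


def is_sublist(needle, haystack):
    """True if needle occurs in haystack as consecutive elements."""
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))


label = "Discover it".lower()
name = "Discover Italy".lower()

# Normalized substring check: passes, because "it" is a prefix of "italy".
print(" ".join(tokenize(label)) in " ".join(tokenize(name)))  # True

# Token sublist check: fails, because "it" and "italy" are different tokens.
print(is_sublist(tokenize(label), tokenize(name)))            # False
```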

dan-tripp-siteimprove avatar Apr 12 '23 21:04 dan-tripp-siteimprove

Updating my idea for an algorithm again. And adding some test cases, from this issue and others.

Test cases:

  • <a href="#" aria-label="Discover Italy">Discover it</a>
    • Desired behaviour: fail this rule
  • <a href="#" aria-label="non-standard">nonstandard</a>
  • <div role="button" aria-label="Hello world"><p>Hello</p><p>world</p></div>
    • Desired behaviour: pass this rule
  • <a href="#" aria-label="Some article by John Doe"><h6>Some article</h6><p>by John Doe</p></a>
    • Desired behaviour: pass this rule
  • <a aria-label="Download specification" href="#">Download <span class="accessibly-hidden">gizmo</span> specification</a>
    • Desired behaviour: fail this rule
    • The "accessibly-hidden" class should:
  • <a aria-label="Download specification" href="#">Download <span style="display: none">gizmo</span> specification</a>
    • Desired behaviour: pass this rule
  • <button aria-label="anything">X</button>
    • Desired behaviour: pass this rule
  • <a aria-label="Call 1 2 3. 4 5 6. 7 8 9 0." href="tel:1234567890">123.456.7890</a>
    • Desired behaviour: pass this rule
  • <a aria-label="just ice" href="#">justice</a>
    • Desired behaviour: fail this rule
  • <a aria-label="justice" href="#">just ice</a>
    • Desired behaviour: fail this rule
  • <a aria-label="WAVE" href="#">W A V E</a>
    • Desired behaviour: fail this rule
  • <button aria-label="Next Page in the list">Next Page</button>
    • Desired behaviour: pass this rule
  • <a aria-label="fibonacci: 0 1 1 2 3 5 8 13 21 34">fibonacci: 0112358132134</a>
    • Desired behaviour: fail this rule
  • <a href="#2021" aria-label="20 21">2021</a>
    • Desired behaviour: fail this rule
  • <a href="#2021" aria-label="twenty twenty-one">two thousand twenty-one</a>
    • Desired behaviour: fail this rule
  • <a aria-label="1a" href="#">1</a>
    • Desired behaviour: pass this rule
  • <a aria-label="compose email" href="#">Compose &nbsp;&nbsp;<br> email</a>
    • Desired behaviour: pass this rule
  • <a aria-label="two zero two three" href="#">2 0 2 3</a>
    • Desired behaviour: fail this rule

Algorithm:

Let 'label' be the inner text of the target element as per the innerText algorithm. ('label' is a string.) Let 'name' be the accessible name. ('name' is also a string.)

Algorithm to tokenize a string:

  • For each character that either a) represents non-text content, or b) isn't a letter or a digit: replace that character with a space character.
    • For a) Judgement of "non-text" probably can't be fully automated. eg. "X" for "close" probably can be, but presumably there are more cases than this.
    • For b) Use the unicode classes Letter, Mark, and "Number, Decimal Digit [Nd]".
  • Insert a space character before and after each digit
    • As per the unicode class "Number, Decimal Digit [Nd]".
  • Split the string into a list of strings, using a whitespace regex as the separator.
    • This 'split' operation should:
      • Effectively remove leading and trailing whitespace as a pre-processing step.
      • If the string was all whitespace before this operation: result in an empty list.

Then do the check: is the tokenized 'label' a sublist of the tokenized 'name'?

  • This 'sublist' check has these properties:
    • It checks whether elements are consecutive or not. i.e. it checks for a substring, in the computer science sense of the term. Not a subsequence.
    • An empty list is a sublist of any list.
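
As a sketch only, this draft could be written as follows with Python's `unicodedata` module. Part a) of the first step (judging non-text content such as "X" for "close") is not implemented, since it can't be fully automated. The function names are hypothetical, and the inputs are assumed to be the already computed inner text and accessible name.

```python
import unicodedata


def is_letter_mark_or_digit(ch):
    category = unicodedata.category(ch)
    # Letter (L*), Mark (M*), or "Number, Decimal Digit" (Nd)
    return category[0] in ("L", "M") or category == "Nd"


def tokenize(text):
    out = []
    for ch in text:
        if unicodedata.category(ch) == "Nd":
            out.append(f" {ch} ")  # a space before and after each digit
        elif is_letter_mark_or_digit(ch):
            out.append(ch)
        else:
            out.append(" ")        # everything else becomes a space
    # str.split() without arguments splits on whitespace runs, ignores leading
    # and trailing whitespace, and returns [] for an all-whitespace string.
    return "".join(out).split()


def is_sublist(needle, haystack):
    """Consecutive match; an empty list is a sublist of any list."""
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))


def passes(label, name):
    return is_sublist(tokenize(label), tokenize(name))


print(passes("123.456.7890", "Call 1 2 3. 4 5 6. 7 8 9 0."))  # True
print(passes("1", "1a"))                                      # True
print(passes("justice", "just ice"))                          # False
```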

dan-tripp-siteimprove avatar Apr 24 '23 16:04 dan-tripp-siteimprove

Another draft:

Test cases:

  • <a href="#" aria-label="Discover Italy">Discover it</a>
    • Desired behaviour: fail this rule
  • <a href="#" aria-label="non-standard">nonstandard</a>
  • <button aria-label="how are you"><span>you</span><span>how</span><span>are</span></button>
    • Desired behaviour: fail this rule
  • <button aria-label="AbCdE">aBcDe</button>
    • Desired behaviour: pass this rule
  • <div role="button" aria-label="Hello world"><p>Hello</p><p>world</p></div>
    • Desired behaviour: pass this rule
  • <a href="#" aria-label="Some article by John Doe"><h6>Some article</h6><p>by John Doe</p></a>
    • Desired behaviour: pass this rule
  • <a aria-label="Download specification" href="#">Download <span class="accessibly-hidden">gizmo</span> specification</a>
    • Desired behaviour: fail this rule
    • The "accessibly-hidden" class should:
  • <a aria-label="Download specification" href="#">Download <span style="visibility: hidden">the</span> <span style="display: none">gizmo</span> specification</a>
    • Desired behaviour: pass this rule
  • <button aria-label="anything">X</button>
    • Desired behaviour: pass this rule
  • <a aria-label="Call 1 2 3. 4 5 6. 7 8 9 0." href="tel:1234567890">123.456.7890</a>
    • Desired behaviour: pass this rule
  • <a aria-label="just ice" href="#">justice</a>
    • Desired behaviour: fail this rule
  • <a aria-label="justice" href="#">just ice</a>
    • Desired behaviour: fail this rule
  • <a aria-label="WAVE" href="#">W A V E</a>
    • Desired behaviour: fail this rule
  • <button aria-label="Next Page in the list">Next Page</button>
    • Desired behaviour: pass this rule
  • <a aria-label="fibonacci: 0 1 1 2 3 5 8 13 21 34">fibonacci: 0112358132134</a>
    • Desired behaviour: pass this rule
  • <a href="#2021" aria-label="20 21">2021</a>
    • Desired behaviour: pass this rule
  • <a href="#2021" aria-label="twenty twenty-one">two thousand twenty-one</a>
    • Desired behaviour: fail this rule
  • <a aria-label="1a" href="#">1</a>
    • Desired behaviour: pass this rule
  • <a aria-label="compose email" href="#">Compose &nbsp;&nbsp;<br> email</a>
    • Desired behaviour: pass this rule
  • <a aria-label="two zero two three" href="#">2 0 2 3</a>
    • Desired behaviour: fail this rule

Algorithm:

Let 'label' be the inner text of the target element as per the innerText algorithm. ('label' is a string.) Let 'name' be the accessible name. ('name' is also a string.)

Algorithm to tokenize a string:

  • Convert the string to lower case.
  • For each character that either a) represents non-text content, or b) isn't a letter or a digit: replace that character with a space character.
    • For a) Judgement of "non-text" probably can't be fully automated. eg. "X" for "close" probably can be, but presumably there are more cases than this.
    • For b) Use the unicode classes Letter, Mark, and "Number, Decimal Digit [Nd]". (This will exclude hyphens, punctuation, emoji, and more.)
  • Insert a space character before and after each digit
    • As per the unicode class "Number, Decimal Digit [Nd]".
  • Split the string into a list of strings, using a whitespace regex as the separator.
    • This 'split' operation should:
      • Effectively remove leading and trailing whitespace as a pre-processing step.
      • If the string was all whitespace before this operation: result in an empty list.

Then do the check: is the tokenized 'label' a sublist of the tokenized 'name'?

  • This 'sublist' check has these properties:
    • It checks whether elements are consecutive or not. i.e. it checks for a substring, in the computer science sense of the term. Not a subsequence.
    • An empty list is a sublist of any list.

dan-tripp-siteimprove avatar Apr 26 '23 21:04 dan-tripp-siteimprove

@dan-tripp-siteimprove in the last CG group meeting we agreed to update some of those examples. These are the ones:

  • <a aria-label="Call 1 2 3. 4 5 6. 7 8 9 0." href="tel:1234567890">123.456.7890</a>
    • Desired behaviour: fail this rule
  • <a aria-label="fibonacci: 0 1 1 2 3 5 8 13 21 34">fibonacci: 0112358132134</a>
    • Desired behaviour: fail this rule
  • <a href="#2021" aria-label="20 21">2021</a>
    • Desired behaviour: fail this rule
  • <a aria-label="1a" href="#">1</a>
    • Desired behaviour: fail this rule

The reasoning is summarised in Jean-Yves' comment on issue 1615

carlosapaduarte avatar May 11 '23 15:05 carlosapaduarte

Okay, I think I'm starting to get it. Thank you. I'll try to follow up soon.

dan-tripp-siteimprove avatar May 15 '23 15:05 dan-tripp-siteimprove

Here's another draft, in light of these recent discussions: https://github.com/act-rules/act-rules.github.io/issues/1458#issuecomment-1544175199 https://github.com/w3c/wcag/pull/2725/files

Test cases:

  • <a href="#" aria-label="Discover Italy">Discover it</a>
    • Desired behaviour: fail this rule
  • <a href="#" aria-label="non-standard">nonstandard</a>
  • <button aria-label="how are you"><span>you</span><span>how</span><span>are</span></button>
    • Desired behaviour: fail this rule
  • <button aria-label="AbCdE">aBcDe</button>
    • Desired behaviour: pass this rule
  • <div role="button" aria-label="Hello world"><p>Hello</p><p>world</p></div>
    • Desired behaviour: pass this rule
  • <a href="#" aria-label="Some article by John Doe"><h6>Some article</h6><p>by John Doe</p></a>
    • Desired behaviour: pass this rule
  • <a aria-label="Download specification" href="#">Download <span class="accessibly-hidden">gizmo</span> specification</a>
    • Desired behaviour: fail this rule
    • The "accessibly-hidden" class should:
  • <a aria-label="Download specification" href="#">Download <span style="visibility: hidden">the</span> <span style="display: none">gizmo</span> specification</a>
    • Desired behaviour: pass this rule
  • <button aria-label="anything">X</button>
    • Desired behaviour: pass this rule
  • <a aria-label="Call 1 2 3. 4 5 6. 7 8 9 0." href="tel:1234567890">123.456.7890</a>
    • Desired behaviour: fail this rule
  • <a aria-label="just ice" href="#">justice</a>
    • Desired behaviour: fail this rule
  • <a aria-label="justice" href="#">just ice</a>
    • Desired behaviour: fail this rule
  • <a aria-label="WAVE" href="#">W A V E</a>
    • Desired behaviour: fail this rule
  • <button aria-label="Next Page in the list">Next Page</button>
    • Desired behaviour: pass this rule
  • <a aria-label="fibonacci: 0 1 1 2 3 5 8 13 21 34">fibonacci: 0112358132134</a>
    • Desired behaviour: fail this rule
  • <a href="#2021" aria-label="20 21">2021</a>
    • Desired behaviour: fail this rule
  • <a href="#2021" aria-label="twenty twenty-one">two thousand twenty-one</a>
    • Desired behaviour: fail this rule
  • <a aria-label="1a" href="#">1</a>
    • Desired behaviour: fail this rule
  • <a aria-label="compose email" href="#">Compose &nbsp;&nbsp;<br> email</a>
    • Desired behaviour: pass this rule
  • <a aria-label="two zero two three" href="#">2 0 2 3</a>
    • Desired behaviour: fail this rule
  • <button aria-label="Search by date">Search by date (YYYY-MM-DD)</button>
  • <button aria-label="Next">Next…</button>
    • Desired behaviour: pass this rule
  • <button aria-label="11 times 3 equals 33">11×3=33</button>
    • Desired behaviour: fail this rule

The algorithm below implements all of the "desired behaviours" above correctly, I think.

Algorithm:

Let 'label' be the inner text of the target element as per the innerText algorithm. ('label' is a string.) Let 'name' be the accessible name. ('name' is also a string.)

To tokenize a string:

  • Convert the string to lower case.
  • For each character that either a) represents non-text content, or b) isn't a letter or a digit: replace that character with a space character.
    • For a) Judgement of "non-text" probably can't be fully automated. eg. "X" for "close" probably can be, but presumably there are more cases than this.
    • For b) Use the unicode classes Letter, Mark, and "Number, Decimal Digit [Nd]". (This will exclude hyphens, punctuation, emoji, and more.)
  • Remove all characters that are within parentheses (AKA round brackets).
    • Ignore square brackets and braces.
  • Split the string into a list of strings, using a whitespace regex as the separator.
    • This 'split' operation should:
      • Effectively remove leading and trailing whitespace as a pre-processing step.
      • If the string was all whitespace before this operation: result in an empty list.

Then do the check: is the tokenized 'label' a sublist of the tokenized 'name'?

  • This 'sublist' check has these properties:
    • It checks whether elements are consecutive or not. i.e. it checks for a substring, in the computer science sense of the term. Not a subsequence.
    • An empty list is a sublist of any list.
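
For what it's worth, here is a rough Python sketch of this draft, run against some of the test cases above. It rests on a few assumptions. The parenthesised text is removed before the punctuation replacement, since otherwise the parentheses would already have been turned into spaces. Nested parentheses are not handled. The non-text judgement in a) is left out, so the "X" button case is not covered. The labels are written out by hand as the inner text is assumed to look. All names are invented for illustration.

```python
import re
import unicodedata


def keep(ch):
    category = unicodedata.category(ch)
    return category[0] in ("L", "M") or category == "Nd"


def tokenize(text):
    text = text.lower()
    text = re.sub(r"\([^()]*\)", " ", text)  # drop text within round brackets
    text = "".join(ch if keep(ch) else " " for ch in text)
    return text.split()


def is_sublist(needle, haystack):
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))


def passes(label, name):
    return is_sublist(tokenize(label), tokenize(name))


# (label, name, desired behaviour) taken from the test cases in this draft.
cases = [
    ("Discover it", "Discover Italy", False),
    ("aBcDe", "AbCdE", True),
    ("Hello\n\nworld", "Hello world", True),
    ("123.456.7890", "Call 1 2 3. 4 5 6. 7 8 9 0.", False),
    ("W A V E", "WAVE", False),
    ("Next Page", "Next Page in the list", True),
    ("2021", "20 21", False),
    ("1", "1a", False),
    ("Compose \u00a0\u00a0\n email", "compose email", True),
    ("Next\u2026", "Next", True),
    ("11\u00d73=33", "11 times 3 equals 33", False),
]

for label, name, desired in cases:
    print(repr(label), passes(label, name) == desired)
```

Under these assumptions, `passes("Search by date (YYYY-MM-DD)", "Search by date")` returns `True`, since the parenthesised part is dropped before tokenizing.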

dan-tripp-siteimprove avatar May 19 '23 18:05 dan-tripp-siteimprove