act-rules.github.io
Visible label is part of accessible name (2ee8b8): Expectation seems to have unintended consequences
The expectation currently reads:
For each target element, all text nodes in the visible text content either match or are contained within the accessible name of this target element, except for characters in the text nodes used to express non-text content. Leading and trailing whitespace and difference in case sensitivity should be ignored.
According to this expectation, something like the following passes the rule:
<button aria-label="How are you">
<span>you</span>
<span>How</span>
<span>are</span>
</button>
That seems a little odd 🤔 Shouldn't the rule instead be looking at the concatenation of the data of the relevant text nodes?
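To make the difference concrete, here is a minimal Python sketch of the two readings. The function names are hypothetical, the text nodes are passed in as a plain list (whitespace-only nodes omitted), and case and surrounding whitespace are ignored as the expectation asks. It is an illustration, not how any of the implementations actually work.

```python
def passes_per_text_node(text_nodes, accessible_name):
    """Current wording: every text node is contained in the accessible name."""
    name = accessible_name.strip().lower()
    return all(node.strip().lower() in name for node in text_nodes)


def passes_concatenated(text_nodes, accessible_name):
    """Alternative: the concatenated text node data is contained in the name."""
    name = accessible_name.strip().lower()
    label = "".join(text_nodes).strip().lower()
    return label in name


# Text nodes of the button above, in document order.
nodes = ["you", "How", "are"]
print(passes_per_text_node(nodes, "How are you"))  # True
print(passes_concatenated(nodes, "How are you"))   # False
```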
Summoning @WilcoFiers as you authored #1419.
Alfa, aXe, and QualWeb all concatenate the text nodes, so I'm guessing we'll want to reflect that in the rule 🙈 That does bring up an interesting case, though, for code such as this:
<div role="button" aria-label="Hello world">
<p>Hello</p><p>world</p>
</div>
That button visually renders as two separate words, "Hello" and "world", but the concatenated text node data is "Helloworld". We're currently seeing a handful of cases like this across customer sites in Siteimprove and I'm leaning towards considering them false positives. A more realistic case, which causes issues when minified, is this:
<a href="#" aria-label="Some article by John Doe">
<h6>Some article</h6>
<p>by John Doe</p>
</a>
When minified, the concatenated text will be "Some articleby John Doe".
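The effect can be reproduced with a small Python sketch that collects text node data using the standard library's html.parser. The class and function names are hypothetical, and this only approximates what a checker does, since it ignores visibility and CSS entirely.

```python
from html.parser import HTMLParser


class TextNodeCollector(HTMLParser):
    """Collects the data of every text node, in document order."""

    def __init__(self):
        super().__init__()
        self.text_nodes = []

    def handle_data(self, data):
        self.text_nodes.append(data)


def concatenated_text(markup):
    """Concatenate all text node data without adding any separators."""
    collector = TextNodeCollector()
    collector.feed(markup)
    collector.close()
    return "".join(collector.text_nodes)


# The minified version of the link above.
minified = (
    '<a href="#" aria-label="Some article by John Doe">'
    "<h6>Some article</h6><p>by John Doe</p></a>"
)
label = concatenated_text(minified)
print(label)                                # Some articleby John Doe
print(label in "Some article by John Doe")  # False
```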
I tried to dig up the use case for this, but I can't find it. I remember why we did it though. If we're concatenating, we need to make the assumption that the text will be part of the same piece of text, and that it isn't rearranged with CSS in some way to appear in a different order.
Example 2 of Failure technique F96 seems to close the case of adding text in the middle of a label to create an accessible name (which, as far as I understand, is the shopping cart example @WilcoFiers mentioned during the call):
A download link reads "Download specification" but there is invisible link text so that the accessible name of that link is "Download gizmo specification". While the visible label text is contained in the accessible name, there is no string match which may prevent the link from being activated by speech input.
<a href="#">Download <span class="accessibly-hidden">gizmo</span> specification</a>
I have some ideas on this. Maybe I could volunteer. It seems to me that this rule needs a normalization algorithm, and to run it on both the label and the name, then do a substring check. Something like this:
To normalize a label or name:
- Concatenate all text nodes
- (Do this only for the label, not the name.)
- For each HTML element start/stop, insert a space.
- Replace each non-text character (eg. punctuation, emoji) with a space
- Judgement of "non-text" can't be fully automated eg. "X" for "close", "+" for "zoom in"
- Insert a space before and after each digit
- Replace each run of multiple spaces with single space
- i.e. do a regex replacement like s/ +/ /g
Then do the check "is the normalized label a substring of the normalized name"?
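A rough Python sketch of this normalization, under a few assumptions: "non-text character" is approximated as "anything that is not a letter or digit" (the judgement cases such as "X" for "close" are not handled), leading and trailing spaces are stripped before the substring check, and the label is given as the inner markup of the target element. All names here are hypothetical.

```python
import re
from html.parser import HTMLParser


class LabelExtractor(HTMLParser):
    """Concatenates text nodes, inserting a space at each element start/stop."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        self.parts.append(" ")

    def handle_endtag(self, tag):
        self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def label_from_markup(inner_markup):
    extractor = LabelExtractor()
    extractor.feed(inner_markup)
    extractor.close()
    return "".join(extractor.parts)


def normalize(text):
    # Assumption: treat every non-alphanumeric character as "non-text".
    text = "".join(ch if ch.isalnum() else " " for ch in text)
    # Insert a space before and after each digit.
    text = re.sub(r"(\d)", r" \1 ", text)
    # Collapse runs of spaces, then drop leading/trailing spaces.
    return re.sub(r" +", " ", text).strip()


def label_in_name(inner_markup, accessible_name):
    return normalize(label_from_markup(inner_markup)) in normalize(accessible_name)


print(label_in_name("<p>Hello</p><p>world</p>", "Hello world"))  # True
print(label_in_name("<h6>Some article</h6><p>by John Doe</p>",
                    "Some article by John Doe"))                 # True
```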
I'll have to check how the above algorithm behaves on all the cases in https://github.com/act-rules/act-rules.github.io/issues/1615. I think that this algorithm will do okay and err on the side of 'no false positives'. There are cases there which will fail the rule according to this algorithm, and which speech-to-text accepts. That might be unavoidable.
I think that this should globally work.
- For each HTML element start/stop, insert a space.
This might actually depend on its "display" property or something like that. <span>He</span><span>llo</span> should not have extra spaces. This ends up being a tricky problem, also for accessible name computation: https://github.com/w3c/accname/issues/15 🙈
That's an interesting discussion over there at the W3C. Based on that, how about something like this:
Let 'label' be the inner text of the target element as per the innerText algorithm. ('label' is a string.) Let 'name' be the accessible name. ('name' is also a string.)
To normalize a string:
- Replace each non-text character (eg. punctuation, emoji) with a space
- Judgement of "non-text" can't be fully automated eg. "X" for "close", "+" for "zoom in"
- Insert a space before and after each digit
- Replace each run of whitespace (of all kinds) with single space character
- i.e. do a regex replacement like s/\s+/ /g
Then do the check "is the normalized 'label' a substring of the normalized 'name'"?
Here's an updated idea for an algorithm (to deal with this case: <a href="#" aria-label="Discover Italy">Discover it</a>):
Let 'label' be the inner text of the target element as per the innerText algorithm. ('label' is a string.) Let 'name' be the accessible name. ('name' is also a string.)
Algorithm to tokenize a string:
- Replace each non-text character (eg. punctuation, emoji) with a space
- Judgement of "non-text" can't be fully automated eg. "X" for "close", "+" for "zoom in"
- Insert a space before and after each digit
- Split the string into a list of strings, using a whitespace regex as the separator.
Then do the check: is the tokenized 'label' a sublist of the tokenized 'name'?
Updating my idea for an algorithm again. And adding some test cases, from this issue and others.
Test cases:
-
<a href="#" aria-label="Discover Italy">Discover it</a>
- Desired behaviour: fail this rule
-
<a href="#" aria-label="non-standard">nonstandard</a>
- Desired behaviour: fail this rule
- Inspired by this ACT TF discussion on hyphens
-
<div role="button" aria-label="Hello world"><p>Hello</p><p>world</p></div>
- Desired behaviour: pass this rule
-
<a href="#" aria-label="Some article by John Doe"><h6>Some article</h6><p>by John Doe</p></a>
- Desired behaviour: pass this rule
-
<a aria-label="Download specification" href="#">Download <span class="accessibly-hidden">gizmo</span> specification</a>
- Desired behaviour: fail this rule
- The "accessibly-hidden" class should:
- Use "clip: rect(0 0 0 0);" and so on, like this visually-hidden class example
- Not use "display: none" nor "visibility: hidden".
-
<a aria-label="Download specification" href="#">Download <span style="display: none">gizmo</span> specification</a>
- Desired behaviour: pass this rule
-
<button aria-label="anything">X</button>
- Desired behaviour: pass this rule
-
<a aria-label="Call 1 2 3. 4 5 6. 7 8 9 0." href="tel:1234567890">123.456.7890</a>
- Desired behaviour: pass this rule
-
<a aria-label="just ice" href="#">justice</a>
- Desired behaviour: fail this rule
-
<a aria-label="justice" href="#">just ice</a>
- Desired behaviour: fail this rule
-
<a aria-label="WAVE" href="#">W A V E</a>
- Desired behaviour: fail this rule
-
<button aria-label="Next Page in the list">Next Page</button>
- Desired behaviour: pass this rule
-
<a aria-label="fibonacci: 0 1 1 2 3 5 8 13 21 34">fibonacci: 0112358132134</a>
- Desired behaviour: fail this rule
-
<a href="#2021" aria-label="20 21">2021</a>
- Desired behaviour: fail this rule
-
<a href="#2021" aria-label="twenty twenty-one">two thousand twenty-one</a>
- Desired behaviour: fail this rule
-
<a aria-label="1a" href="#">1</a>
- Desired behaviour: pass this rule
-
<a aria-label="compose email" href="#">Compose <br> email</a>
- Desired behaviour: pass this rule
-
<a aria-label="two zero two three" href="#">2 0 2 3</a>
- Desired behaviour: fail this rule
Algorithm:
Let 'label' be the inner text of the target element as per the innerText algorithm. ('label' is a string.) Let 'name' be the accessible name. ('name' is also a string.)
Algorithm to tokenize a string:
- For each character that either a) represents non-text content, or b) isn't a letter or a digit: replace that character with a space character.
- For a) Judgement of "non-text" probably can't be fully automated. eg. "X" for "close" probably can be, but presumably there are more cases than this.
- For b) Use the Unicode classes Letter, Mark, and "Number, Decimal Digit [Nd]".
- Insert a space character before and after each digit
- As per the Unicode class "Number, Decimal Digit [Nd]".
- Split the string into a list of strings, using a whitespace regex as the separator.
- This 'split' operation should:
- Effectively remove leading and trailing whitespace as a pre-processing step.
- If the string was all whitespace before this operation: result in an empty list.
Then do the check: is the tokenized 'label' a sublist of the tokenized 'name'?
- This 'sublist' check has these properties:
- It checks whether elements are consecutive or not. i.e. it checks for a substring, in the computer science sense of the term. Not a subsequence.
- An empty list is a sublist of any list.
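One possible way to express step b) and the 'split' and 'sublist' properties in Python, using unicodedata.category from the standard library. Step a), the judgement of non-text content, is left out of this sketch, and the function names are hypothetical.

```python
import unicodedata


def is_letter_mark_or_digit(ch):
    """Unicode classes Letter (L*), Mark (M*) or Number, Decimal Digit (Nd)."""
    category = unicodedata.category(ch)
    return category[0] in ("L", "M") or category == "Nd"


def tokenize(text):
    pieces = []
    for ch in text:
        if not is_letter_mark_or_digit(ch):
            pieces.append(" ")        # punctuation, symbols, whitespace, ...
        elif unicodedata.category(ch) == "Nd":
            pieces.append(f" {ch} ")  # a space before and after each digit
        else:
            pieces.append(ch)
    # split() drops leading/trailing whitespace and returns [] for a string
    # that is empty or all whitespace.
    return "".join(pieces).split()


def is_sublist(needle, haystack):
    """Consecutive match (a substring, not a subsequence); [] always matches."""
    n = len(needle)
    return any(haystack[i:i + n] == needle
               for i in range(len(haystack) - n + 1))


print(tokenize("non-standard"))  # ['non', 'standard']
print(tokenize("2021"))          # ['2', '0', '2', '1']
print(tokenize(" \t\n"))         # []
print(is_sublist(["a", "c"], ["a", "b", "c"]))  # False
```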
Another draft:
Test cases:
-
<a href="#" aria-label="Discover Italy">Discover it</a>
- Desired behaviour: fail this rule
-
<a href="#" aria-label="non-standard">nonstandard</a>
- Desired behaviour: fail this rule
- Inspired by this ACT TF discussion on hyphens
-
<button aria-label="how are you"><span>you</span><span>how</span><span>are</span></button>
- Desired behaviour: fail this rule
-
<button aria-label="AbCdE">aBcDe</button>
- Desired behaviour: pass this rule
-
<div role="button" aria-label="Hello world"><p>Hello</p><p>world</p></div>
- Desired behaviour: pass this rule
-
<a href="#" aria-label="Some article by John Doe"><h6>Some article</h6><p>by John Doe</p></a>
- Desired behaviour: pass this rule
-
<a aria-label="Download specification" href="#">Download <span class="accessibly-hidden">gizmo</span> specification</a>
- Desired behaviour: fail this rule
- The "accessibly-hidden" class should:
- Use "clip: rect(0 0 0 0);" and so on, like this visually-hidden class example
- Not use "display: none" nor "visibility: hidden".
-
<a aria-label="Download specification" href="#">Download <span style="visibility: hidden">the</span> <span style="display: none">gizmo</span> specification</a>
- Desired behaviour: pass this rule
-
<button aria-label="anything">X</button>
- Desired behaviour: pass this rule
-
<a aria-label="Call 1 2 3. 4 5 6. 7 8 9 0." href="tel:1234567890">123.456.7890</a>
- Desired behaviour: pass this rule
-
<a aria-label="just ice" href="#">justice</a>
- Desired behaviour: fail this rule
-
<a aria-label="justice" href="#">just ice</a>
- Desired behaviour: fail this rule
-
<a aria-label="WAVE" href="#">W A V E</a>
- Desired behaviour: fail this rule
-
<button aria-label="Next Page in the list">Next Page</button>
- Desired behaviour: pass this rule
-
<a aria-label="fibonacci: 0 1 1 2 3 5 8 13 21 34">fibonacci: 0112358132134</a>
- Desired behaviour: pass this rule
-
<a href="#2021" aria-label="20 21">2021</a>
- Desired behaviour: pass this rule
-
<a href="#2021" aria-label="twenty twenty-one">two thousand twenty-one</a>
- Desired behaviour: fail this rule
-
<a aria-label="1a" href="#">1</a>
- Desired behaviour: pass this rule
-
<a aria-label="compose email" href="#">Compose <br> email</a>
- Desired behaviour: pass this rule
-
<a aria-label="two zero two three" href="#">2 0 2 3</a>
- Desired behaviour: fail this rule
Algorithm:
Let 'label' be the inner text of the target element as per the innerText algorithm. ('label' is a string.) Let 'name' be the accessible name. ('name' is also a string.)
Algorithm to tokenize a string:
- Convert the string to lower case.
- For each character that either a) represents non-text content, or b) isn't a letter or a digit: replace that character with a space character.
- For a) Judgement of "non-text" probably can't be fully automated. eg. "X" for "close" probably can be, but presumably there are more cases than this.
- For b) Use the Unicode classes Letter, Mark, and "Number, Decimal Digit [Nd]". (This will exclude hyphens, punctuation, emoji, and more.)
- Insert a space character before and after each digit
- As per the Unicode class "Number, Decimal Digit [Nd]".
- Split the string into a list of strings, using a whitespace regex as the separator.
- This 'split' operation should:
- Effectively remove leading and trailing whitespace as a pre-processing step.
- If the string was all whitespace before this operation: result in an empty list.
Then do the check: is the tokenized 'label' a sublist of the tokenized 'name'?
- This 'sublist' check has these properties:
- It checks whether elements are consecutive or not. i.e. it checks for a substring, in the computer science sense of the term. Not a subsequence.
- An empty list is a sublist of any list.
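A Python sketch of this draft, checked against some of its test cases. Assumptions: the 'label' strings below are hand-written approximations of what innerText would return for each case, step a) (non-text judgement, e.g. the "X" button) is not implemented, and all function names are hypothetical.

```python
import unicodedata


def tokenize(text):
    pieces = []
    for ch in text.lower():
        category = unicodedata.category(ch)
        if category == "Nd":
            pieces.append(f" {ch} ")  # a space before and after each digit
        elif category[0] in ("L", "M"):
            pieces.append(ch)
        else:
            pieces.append(" ")        # anything that is not a letter or digit
    return "".join(pieces).split()


def is_sublist(needle, haystack):
    n = len(needle)
    return any(haystack[i:i + n] == needle
               for i in range(len(haystack) - n + 1))


def passes(label, name):
    return is_sublist(tokenize(label), tokenize(name))


# (label, name, desired behaviour: True = pass, False = fail)
CASES = [
    ("Discover it", "Discover Italy", False),
    ("nonstandard", "non-standard", False),
    ("youhoware", "how are you", False),
    ("aBcDe", "AbCdE", True),
    ("Hello\n\nworld", "Hello world", True),
    ("123.456.7890", "Call 1 2 3. 4 5 6. 7 8 9 0.", True),
    ("W A V E", "WAVE", False),
    ("fibonacci: 0112358132134", "fibonacci: 0 1 1 2 3 5 8 13 21 34", True),
    ("2021", "20 21", True),
    ("1", "1a", True),
    ("Compose \n email", "compose email", True),
    ("2 0 2 3", "two zero two three", False),
]

for label, name, desired in CASES:
    print(repr(label), repr(name), passes(label, name) == desired)
```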
@dan-tripp-siteimprove, in the last CG meeting we agreed to update some of those examples. These are the ones:
-
<a aria-label="Call 1 2 3. 4 5 6. 7 8 9 0." href="tel:1234567890">123.456.7890</a>
- Desired behaviour: fail this rule
-
<a aria-label="fibonacci: 0 1 1 2 3 5 8 13 21 34">fibonacci: 0112358132134</a>
- Desired behaviour: fail this rule
-
<a href="#2021" aria-label="20 21">2021</a>
- Desired behaviour: fail this rule
-
<a aria-label="1a" href="#">1</a>
- Desired behaviour: fail this rule
The reasoning is summarised in Jean-Yves' comment on issue 1615
Okay, I think I'm starting to get it. Thank you. I'll try to follow up soon.
Here's another draft, in light of these recent discussions: https://github.com/act-rules/act-rules.github.io/issues/1458#issuecomment-1544175199 https://github.com/w3c/wcag/pull/2725/files
Test cases:
-
<a href="#" aria-label="Discover Italy">Discover it</a>
- Desired behaviour: fail this rule
-
<a href="#" aria-label="non-standard">nonstandard</a>
- Desired behaviour: fail this rule
- Motivation: this ACT TF discussion on hyphens
-
<button aria-label="how are you"><span>you</span><span>how</span><span>are</span></button>
- Desired behaviour: fail this rule
-
<button aria-label="AbCdE">aBcDe</button>
- Desired behaviour: pass this rule
-
<div role="button" aria-label="Hello world"><p>Hello</p><p>world</p></div>
- Desired behaviour: pass this rule
-
<a href="#" aria-label="Some article by John Doe"><h6>Some article</h6><p>by John Doe</p></a>
- Desired behaviour: pass this rule
-
<a aria-label="Download specification" href="#">Download <span class="accessibly-hidden">gizmo</span> specification</a>
- Desired behaviour: fail this rule
- The "accessibly-hidden" class should:
- Use "clip: rect(0 0 0 0);" and so on, like this visually-hidden class example
- Not use "display: none" nor "visibility: hidden".
-
<a aria-label="Download specification" href="#">Download <span style="visibility: hidden">the</span> <span style="display: none">gizmo</span> specification</a>
- Desired behaviour: pass this rule
-
<button aria-label="anything">X</button>
- Desired behaviour: pass this rule
-
<a aria-label="Call 1 2 3. 4 5 6. 7 8 9 0." href="tel:1234567890">123.456.7890</a>
- Desired behaviour: fail this rule
-
<a aria-label="just ice" href="#">justice</a>
- Desired behaviour: fail this rule
-
<a aria-label="justice" href="#">just ice</a>
- Desired behaviour: fail this rule
-
<a aria-label="WAVE" href="#">W A V E</a>
- Desired behaviour: fail this rule
-
<button aria-label="Next Page in the list">Next Page</button>
- Desired behaviour: pass this rule
-
<a aria-label="fibonacci: 0 1 1 2 3 5 8 13 21 34">fibonacci: 0112358132134</a>
- Desired behaviour: fail this rule
-
<a href="#2021" aria-label="20 21">2021</a>
- Desired behaviour: fail this rule
-
<a href="#2021" aria-label="twenty twenty-one">two thousand twenty-one</a>
- Desired behaviour: fail this rule
-
<a aria-label="1a" href="#">1</a>
- Desired behaviour: fail this rule
-
<a aria-label="compose email" href="#">Compose <br> email</a>
- Desired behaviour: pass this rule
-
<a aria-label="two zero two three" href="#">2 0 2 3</a>
- Desired behaviour: fail this rule
-
<button aria-label="Search by date">Search by date (YYYY-MM-DD)</button>
- Desired behaviour: pass this rule
- Motivation: this recent WCAG PR
-
<button aria-label="Next">Next…</button>
- Desired behaviour: pass this rule
-
<button aria-label="11 times 3 equals 33">11×3=33</button>
- Desired behaviour: fail this rule
The algorithm below implements all of the "desired behaviours" above correctly, I think.
Algorithm:
Let 'label' be the inner text of the target element as per the innerText algorithm. ('label' is a string.) Let 'name' be the accessible name. ('name' is also a string.)
To tokenize a string:
- Convert the string to lower case.
- For each character that either a) represents non-text content, or b) isn't a letter or a digit: replace that character with a space character.
- For a) Judgement of "non-text" probably can't be fully automated. eg. "X" for "close" probably can be, but presumably there are more cases than this.
- For b) Use the Unicode classes Letter, Mark, and "Number, Decimal Digit [Nd]". (This will exclude hyphens, punctuation, emoji, and more.)
- Remove all characters that are within parentheses (AKA round brackets).
- Ignore square brackets and braces.
- Split the string into a list of strings, using a whitespace regex as the separator.
- This 'split' operation should:
- Effectively remove leading and trailing whitespace as a pre-processing step.
- If the string was all whitespace before this operation: result in an empty list.
Then do the check: is the tokenized 'label' a sublist of the tokenized 'name'?
- This 'sublist' check has these properties:
- It checks whether elements are consecutive or not. i.e. it checks for a substring, in the computer science sense of the term. Not a subsequence.
- An empty list is a sublist of any list.
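A Python sketch of this latest draft, with a few assumptions worth flagging. The parenthesized text is removed before the non-letter characters are replaced, because replacing punctuation first would erase the parentheses themselves. Nested parentheses are not handled. Step a) (non-text judgement) is not implemented, and the 'label' strings are hand-written approximations of innerText. All names are hypothetical.

```python
import re
import unicodedata


def tokenize(text):
    text = text.lower()
    # Assumption: drop parenthesized text first (non-nested only).
    text = re.sub(r"\([^()]*\)", " ", text)
    pieces = []
    for ch in text:
        category = unicodedata.category(ch)
        if category[0] in ("L", "M") or category == "Nd":
            pieces.append(ch)   # letters, marks and digits are kept as-is
        else:
            pieces.append(" ")  # hyphens, punctuation, emoji, brackets, ...
    return "".join(pieces).split()


def is_sublist(needle, haystack):
    n = len(needle)
    return any(haystack[i:i + n] == needle
               for i in range(len(haystack) - n + 1))


def passes(label, name):
    return is_sublist(tokenize(label), tokenize(name))


# (label, name, desired behaviour: True = pass, False = fail)
CASES = [
    ("Search by date (YYYY-MM-DD)", "Search by date", True),
    ("Next…", "Next", True),
    ("11×3=33", "11 times 3 equals 33", False),
    ("123.456.7890", "Call 1 2 3. 4 5 6. 7 8 9 0.", False),
    ("fibonacci: 0112358132134", "fibonacci: 0 1 1 2 3 5 8 13 21 34", False),
    ("2021", "20 21", False),
    ("1", "1a", False),
    ("Compose \n email", "compose email", True),
    ("nonstandard", "non-standard", False),
]

for label, name, desired in CASES:
    print(repr(label), repr(name), passes(label, name) == desired)
```

Compared with the previous draft, digits are no longer split apart, which is what turns the phone number, fibonacci, "2021" and "1a" cases into failures.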