David Corbett

61 comments by David Corbett

How about this? 1. Substitute every capital letter with its lowercase form followed by `uppercase`, an invisible mark glyph. 2. Ligate sequences of lowercase letters, skipping marks. No classes are...
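The two steps above can be modelled in plain Python to see why no glyph classes would be needed. This is only a rough simulation, not font code: the `ligatures` table, the glyph names such as `f_f_i`, and the choice to place skipped marks after the ligature are all assumptions made for illustration.

```python
UPPERCASE_MARK = "uppercase"  # the invisible mark glyph named in the proposal


def decompose_capitals(text):
    """Step 1: replace each capital with its lowercase form plus the mark."""
    glyphs = []
    for ch in text:
        if ch.isupper() and len(ch.lower()) == 1:
            glyphs += [ch.lower(), UPPERCASE_MARK]
        else:
            glyphs.append(ch)
    return glyphs


def ligate(glyphs, ligatures):
    """Step 2: ligate lowercase sequences, skipping marks between components.

    `ligatures` is a hypothetical table mapping tuples of lowercase glyphs
    to a ligature glyph name. Skipped marks are emitted after the ligature.
    """
    by_length = sorted(ligatures.items(), key=lambda item: -len(item[0]))
    out = []
    i = 0
    while i < len(glyphs):
        match = None
        for components, name in by_length:
            j, k, skipped = i, 0, []
            while j < len(glyphs) and k < len(components):
                if glyphs[j] == UPPERCASE_MARK:
                    if k == 0:
                        break  # a ligature cannot start on a mark
                    skipped.append(glyphs[j])
                elif glyphs[j] == components[k]:
                    k += 1
                else:
                    break
                j += 1
            if k == len(components):
                match = (name, skipped, j)
                break
        if match:
            name, skipped, i = match
            out.append(name)
            out.extend(skipped)
        else:
            out.append(glyphs[i])
            i += 1
    return out


LIGATURES = {("f", "f", "i"): "f_f_i", ("f", "i"): "f_i"}  # illustrative only
mixed_case = ligate(decompose_capitals("Ffi"), LIGATURES)
lower_case = ligate(decompose_capitals("ffi"), LIGATURES)
```

In this model `"ffi"`, `"Ffi"`, and `"FFI"` all reach the same ligature glyph through a single lowercase rule, and the case information survives only in the trailing marks.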

> > The voiceless geminate ⟨þþ⟩ or ⟨ðð⟩ becomes ⟨ðð⟩.
>
> By 'becomes' do you mean 'should become'? Because i get 'θθ' for both of those.

That is true...

Anything using `characters_per_script` should have the `network` condition, because it uses youseedee, which downloads Unicode files at runtime.

`pip install unicodedataplus` failed for me on macOS 14.4.1 with “error: incompatible pointer to integer conversion returning 'void *' from a function with result type 'int' [-Wint-conversion]” while the installation...

[Some lowercase letters](https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=%5B%5B%3ALl%3A%5D-%5B%3AChanges_When_Uppercased%3A%5D%5D&g=&i=) don’t have uppercase counterparts or [vice versa](https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=%5B%5B%3ALu%3A%5D-%5B%3AChanges_When_Lowercased%3A%5D%5D&g=&i=), in which case 'smcp' and 'c2sc' can’t be assumed to be meaningful. Instead of checking a character’s general category, you...
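To illustrate the mismatch between general category and case mappings, here is a small sketch using only the standard library. The helper names are made up for this example, and comparing a character with `str.upper()` or `str.lower()` is only an approximation of the Unicode `Changes_When_Uppercased` and `Changes_When_Lowercased` properties, which are formally defined on the NFD form.

```python
import unicodedata


def changes_when_uppercased(ch):
    """Approximate Changes_When_Uppercased for a single character."""
    return ch.upper() != ch


def changes_when_lowercased(ch):
    """Approximate Changes_When_Lowercased for a single character."""
    return ch.lower() != ch


kra = "\u0138"               # LATIN SMALL LETTER KRA: Ll, no uppercase form
bold_cap_a = "\U0001D400"    # MATHEMATICAL BOLD CAPITAL A: Lu, no lowercase form

# Both characters have a cased general category but no case counterpart.
kra_info = (unicodedata.category(kra), changes_when_uppercased(kra))
bold_info = (unicodedata.category(bold_cap_a), changes_when_lowercased(bold_cap_a))
```

A check based on `unicodedata.category(ch) == "Ll"` would accept `kra` even though uppercasing it changes nothing, whereas the mapping-based check would not.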

It would also be good to test triple-quoted strings whose final characters match the outer quotation marks:

```python
r"""​\""""
r'''​\''''
```

The `raw_nested_fstrings` tests don’t include any of the relevant...

My previous example was incomplete: quotation marks in triple-quoted strings need escaping when they precede two more of the same quotation mark. This can also happen in the middle of...
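Here is a minimal sketch of that escaping rule for ordinary (non-raw) triple-quoted strings. The function name is hypothetical, the body is assumed to contain no backslashes, and treating the closing delimiter as part of the "two more" quotation marks is my reading of the rule rather than something any tool is confirmed to do.

```python
def escape_for_triple_quotes(body, quote='"'):
    """Escape quotes in `body` so it can sit between triple `quote` delimiters.

    A quote is escaped when the next two characters are also `quote`,
    counting the closing delimiter. Assumes `body` has no backslashes.
    """
    extended = body + quote * 3  # look ahead into the closing delimiter
    out = []
    for i, ch in enumerate(body):
        if ch == quote and extended[i + 1 : i + 3] == quote * 2:
            out.append("\\")
        out.append(ch)
    return "".join(out)


def round_trips(body, quote='"'):
    """Check that the escaped literal evaluates back to `body`."""
    source = quote * 3 + escape_for_triple_quotes(body, quote) + quote * 3
    return eval(source) == body


samples = ['a"""b', 'ends with "', '""""', 'x""y', "a'''b", "'"]
all_ok = all(round_trips(s, q) for s in samples for q in ('"', "'"))
```

Raw strings cannot use this approach, because the backslash stays in the value of a raw literal instead of being consumed as an escape.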

I think the rules should either offer safe fixes for raw strings or no fixes. Keeping the current fixes but marking them unsafe doesn’t seem useful, because they are incorrect.

If a font supports Hebrew but does not use the script tag 'hebr' in GPOS, HarfBuzz ignores GPOS and uses fallback mark positioning.
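One way to check whether a font is in this situation is to list the script tags declared in its GPOS table. The sketch below assumes the third-party fontTools package is installed. The helper names are hypothetical, and it only reports which tags are present; it does not reproduce HarfBuzz's script selection logic.

```python
def gpos_script_tags(font_path):
    """Return the set of script tags declared in the font's GPOS table."""
    from fontTools.ttLib import TTFont  # third-party; assumed to be installed

    font = TTFont(font_path)
    if "GPOS" not in font:
        return set()
    script_list = font["GPOS"].table.ScriptList
    if script_list is None:
        return set()
    return {record.ScriptTag for record in script_list.ScriptRecord}


def declares_script(tags, script_tag="hebr"):
    """True if `script_tag` is among the declared GPOS script tags."""
    return script_tag in tags
```

For example, `declares_script(gpos_script_tags("MyFont.ttf"))` (with a placeholder path) returning `False` for a font that covers Hebrew would point to the condition described above.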