unicodetools
why does security/.../removals.txt not work with Age?
I just added the following to the security data input file removals.txt (PR #777). Why does the Age property not seem to work here?
I also tried a simple \p{Age=16} without the intersection with the list of scripts. No effect either.
@macchiati ideas?
# PAG meeting 2024-04-18 before Unicode 16 beta:
# [Mark]: Policy is that by default
# new characters in scripts that are not Excluded or Limited Use,
# are marked as Uncommon_Use & communicate to SEW
# to ask if there are any exceptions (needed in customary modern widespread use).
# ----
# https://www.unicode.org/reports/tr31/#Table_Recommended_Scripts
# ----
# TODO: This should work with the following set pattern but doesn't;
# and neither with \p{Age=16}. Why?
# [\P{Age=15.1}&[\p{script=Zyyy}\p{script=Zinh}\p{script=Arab}\p{script=Armn}\p{script=Beng}\p{script=Bopo}\p{script=Cyrl}\p{script=Deva}\p{script=Ethi}\p{script=Geor}\p{script=Grek}\p{script=Gujr}\p{script=Guru}\p{script=Hang}\p{script=Hani}\p{script=Hebr}\p{script=Hira}\p{script=Kana}\p{script=Knda}\p{script=Khmr}\p{script=Laoo}\p{script=Latn}\p{script=Mlym}\p{script=Mymr}\p{script=Orya}\p{script=Sinh}\p{script=Taml}\p{script=Telu}\p{script=Thaa}\p{script=Thai}\p{script=Tibt}]] ; uncommon_use
I'll have to look at it
The following works to get just the characters added after version 14.0 and up to and including version 15.1:
[\p{script=Zyyy}\p{script=Zinh}\p{script=Arab}\p{script=Armn}\p{script=Beng}\p{script=Bopo}\p{script=Cyrl}\p{script=Deva}\p{script=Ethi}\p{script=Geor}\p{script=Grek}\p{script=Gujr}\p{script=Guru}\p{script=Hang}\p{script=Hani}\p{script=Hebr}\p{script=Hira}\p{script=Kana}\p{script=Knda}\p{script=Khmr}\p{script=Laoo}\p{script=Latn}\p{script=Mlym}\p{script=Mymr}\p{script=Orya}\p{script=Sinh}\p{script=Taml}\p{script=Telu}\p{script=Thaa}\p{script=Thai}\p{script=Tibt} &\p{Age=15.1} -\p{Age=14.0}]
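The set arithmetic in that pattern only yields a version range if `\p{Age=X}` is cumulative, that is, if it matches every character assigned in version X or earlier, which is how the working pattern above behaves. A minimal Python sketch of that arithmetic follows. It is a toy model, not the unicodetools or ICU implementation: the `FIRST_VERSION` table holds a handful of sample entries for illustration, and the helper names are hypothetical.

```python
# Toy model of cumulative Age matching; not the unicodetools or ICU code.
# FIRST_VERSION maps a code point to the Unicode version that first assigned it.
# The entries are a small illustrative sample, not a real data file.
FIRST_VERSION = {
    0x0041: (1, 1),    # LATIN CAPITAL LETTER A
    0x1FAE0: (14, 0),  # MELTING FACE
    0x1FAE8: (15, 0),  # SHAKING FACE
    0x2FFC: (15, 1),   # an ideographic description character
    0x1C89: (16, 0),   # CYRILLIC CAPITAL LETTER TJE
}


def age_at_most(version):
    # Models \p{Age=version}: code points assigned in `version` or earlier.
    return {cp for cp, v in FIRST_VERSION.items() if v <= version}


def added_between(after, up_to):
    # Models [\p{Age=up_to} - \p{Age=after}]: assigned after `after`,
    # but no later than `up_to`.
    return age_at_most(up_to) - age_at_most(after)


new_since_14 = added_between((14, 0), (15, 1))
print(sorted(f"U+{cp:04X}" for cp in new_since_14))
```

In this model, a character first assigned in 16.0 is outside `age_at_most((15, 1))`, so it would only show up in the complement of that set.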
BTW, the following is a more concise way to list the scripts, if you are using the modified version of UnicodeSet in unicodetools:
\p{script=Zyyy}\p{script=Zinh}\p{script=Arab}\p{script=Armn}\p{script=Beng}\p{script=Bopo}\p{script=Cyrl}\p{script=Deva}\p{script=Ethi}\p{script=Geor}\p{script=Grek}\p{script=Gujr}\p{script=Guru}\p{script=Hang}\p{script=Hani}\p{script=Hebr}\p{script=Hira}\p{script=Kana}\p{script=Knda}\p{script=Khmr}\p{script=Laoo}\p{script=Latn}\p{script=Mlym}\p{script=Mymr}\p{script=Orya}\p{script=Sinh}\p{script=Taml}\p{script=Telu}\p{script=Thaa}\p{script=Thai}\p{script=Tibt}
==>
\p{script=/Zyyy|Zinh|Arab|Armn|Beng|Bopo|Cyrl|Deva|Ethi|Geor|Grek|Gujr|Guru|Hang|Hani|Hebr|Hira|Kana|Knda|Khmr|Laoo|Latn|Mlym|Mymr|Orya|Sinh|Taml|Telu|Thaa|Thai|Tibt/}
I often wish we had that in the stock ICU...
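For readers stuck with a UnicodeSet that lacks the `/.../` form, the rewrite shown above can be reversed mechanically. The following is a rough sketch only: `expand_alternation` is a hypothetical helper, not part of unicodetools or ICU, and it assumes the text between the slashes is a plain `|`-separated list of literal values rather than a general regular expression.

```python
import re

# Hypothetical helper, not part of unicodetools or ICU.
# Matches \p{prop=/A|B|C/}; only literal alternation is supported.
_CONCISE = re.compile(r"\\p\{(\w+)=/([^/}]*)/\}")


def expand_alternation(pattern: str) -> str:
    # Rewrite each \p{prop=/A|B|C/} as \p{prop=A}\p{prop=B}\p{prop=C}.
    def repl(match):
        prop = match.group(1)
        values = match.group(2).split("|")
        return "".join(f"\\p{{{prop}={value}}}" for value in values)

    return _CONCISE.sub(repl, pattern)


print(expand_alternation(r"[\p{script=/Latn|Grek|Cyrl/}&\p{Age=15.1}]"))
```

Property expressions without the slash-delimited value, such as `\p{Age=15.1}`, pass through unchanged.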