phobos
phobos copied to clipboard
Unicode 14.0.0 support (tables only)
Let's see what's missing and then decide on the new home for the code generator.
The tables were generated by running ./run.d from: https://github.com/DmitryOlshansky/gen-uni-dlang
Thanks for your pull request, @DmitryOlshansky!
Bugzilla references
Auto-close | Bugzilla | Severity | Description |
---|---|---|---|
✓ | 16416 | enhancement | Phobos std.uni out of date (should be updated to latest Unicode standard) |
Testing this PR locally
If you don't have a local development environment setup, you can use Digger to test this PR:
dub run digger -- build "master + phobos#7469"
Oh it's just 42k LoC of diff, LGTM! (/s
)
CI says no:
std/uni/package.d(10621): Error: undefined identifier `toLowerSimpleIndexTrieEntries`
std/uni/package.d(10622): Error: undefined identifier `toTitleSimpleIndexTrieEntries`
std/uni/package.d(10620): Error: undefined identifier `toUpperSimpleIndexTrieEntries`
I think it would be nice if we could have a link to your gen.d file inside the source code as well, like at the top of the module in a comment explaining that this file is auto generated. Maybe in std.internal.unicode_tables
where there is already a header where this could fit or maybe just in all of them a short note could be added
CI says no:
Yeah, and thankfully it’s simple casefolding stuff that’s missing I thought it was something more complicated.
Will get back to it today.
I think it would be nice if we could have a link to your gen.d file inside the source code as well, like at the top of the module in a comment
I think it shall go inside of Phobos repo honestly, as all of the build helpers do. And we should generate the tables IMHO, dropping them from git and add the source *.txt data files from Unicode instead.
The bugzilla issue is: https://issues.dlang.org/show_bug.cgi?id=16416
Not that I'm an expert, but out of curiosity why does the code pass "auto-tester" tests for "Darwin_64_32" but not for the other configurations? Is there a machine dependence in Unicode?
Not that I'm an expert, but out of curiosity why does the code pass "auto-tester" tests for "Darwin_64_32" but not for the other configurations? Is there a machine dependence in Unicode?
As stated in the log file:
echo "Darwin_64_32_disabled"
Darwin_64_32_disabled
Okay, almost there. Will do a few finishing touches and it'll be ready to go.
Speaking of code generator - it'll be tricky to integrate into the Makefile at this point as it needs both 32 and 64 bit tables and by default these use machine word as the last level item. The way out I think is one of two:
- Just use 32-bit uint as the last level table item, it shouldn't matter much in practice.
- Fix things up a bit to be able to use uint or ulong as the last level table item type explicitly.
Whatever it is I think it's out of scope and is not required to get this merged. I'll put the link to the repo in the header of every file but I think it should be transffered to DLang org first.
it'll be tricky to integrate into the Makefile
It'd be preferable to utilise the dmd build.d build scripts of possible.
Also note that the doctester currently fails for all PRs and is unrelated. it is unfortunately required.
Did we switch to build.d script? Coool. I've come out of a fridge after time-traveling for a little while ;)
Spoiler is : D is 2nd most popular language for system-level stuff in 2030. I did not tell anyone that, just a thought )
stated in the log file:
echo "Darwin_64_32_disabled" Darwin_64_32_disabled
Ah, ok... makes it a bit hard to understand/asses the results, if there are tests listed, which fail because they can't run anyway.
So, any idea how to move forward with this PR?
@DmitryOlshansky I cross-checked with Atila how to get this PR into Phobos.
Since this PR is quite extensive and requires a lot of know-how to be reviewed, which Atila doesn't have, we agreed on, that if the PR passes the tests, it will be merged.
I'm confused by the results of the checks and if this needs further action or not. Does anyone know or can find out?
@atilaneves The PR states that merging can be done automatically if it's approved. Not sure if all the tests need to pass first.
Not sure if all the tests need to pass first.
They do.
@thewilsonator Can you enlighten me on how you recognize this? I see 6 failing and 2 successful checks. Where two marked "require" failed.
The ones marked "required" must pass. Also any breakage on buildkite, where community projects are tested, should also be dealt with. I presume the azure failures will get fixed as part of the rest.
To clarify: in general all CIs must pass, but some have the tendency to regularly break due to a various reasons (e.g. buildkite when someone updates a package), hence they aren't marked as required so that in such an event or spurious failure a reviewer can still merge the PR. So the required CIs are just the ones which generally don't have spurious failures.
Ok, thanks.
The auto-tester seems to run every day not only on-demand after a change. I don't understand why sometimes a Linux round is in the list and it's missing on newer rounds... All very confusing or at least not straight forward to understand.
And I understand that only the Darwin_64_64 case fails with this error:
std/uni/package.d(6709): Error: static assert: "No unicode set by name Currency_Symbol was found."
std/uni/package.d(757): instantiated from here: `opDispatch!"Currency_Symbol"`
@DmitryOlshansky Is there a file (for a table) missing? It would be great if you could have a look so this PR is not hanging around for too long.
Okay, I’ve been out for a while, will fix missing aliases shortly.
@DmitryOlshansky What is the status of this PR?
@DmitryOlshansky What is the status of this PR?
@RazvanN7 I'm trying to revive it. I think the codegen is fixed, now what's left is unittests depending on particular sets of characters that probably got extended.
Ping @DmitryOlshansky
Taken over by Rikki https://github.com/dlang/phobos/pull/8617