phobos
Support of byUTF for ubyte[] argument
For the reasons outlined in the discussion of that pull request, we concluded that we need to be able to call byUTF on an argument of type ubyte[]. This PR implements exactly that.
As a reminder, this is necessary:
- to support converting a chunk extracted from a file;
- to eliminate the need to validate a string of chars twice: once when it is created and again when it is converted by byUTF (this simplifies programming and improves efficiency).
Thanks for your pull request and interest in making D better, @vporton! We are looking forward to reviewing it, and you should be hearing from a maintainer soon. Please verify that your PR follows this checklist:
- My PR is fully covered with tests (you can see the coverage diff by visiting the details link of the codecov check)
- My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
- I have provided a detailed rationale explaining my changes
- New or modified functions have Ddoc comments (with Params: and Returns:)
Please see CONTRIBUTING.md for more information.
If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.
Bugzilla references
Your PR doesn't reference any Bugzilla issue.
If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.
Testing this PR locally
If you don't have a local development environment set up, you can use Digger to test this PR:
dub fetch digger
dub run digger -- build "master + phobos#7249"
What happens for a range of ubytes that isn't valid UTF-8? Is this covered by a test?
I read the discussion on the other PR and AIUI it called for something that took ubyte[] and lazily produced validated char[]. This doesn't seem to be it. Could you please explain what you're trying to accomplish? Thanks.
The added unittest explains it:
assert((cast(ubyte[]) [0x68, 0x65, 0x6c, 0x6c, 0xC3, 0xB6]).byUTF!char().equal(['h', 'e', 'l', 'l', 0xC3, 0xB6]));
You pass in a range of ubyte and get a range of char. Handy, as no separate step to deal with autodecoding is needed.
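For comparison, here is a minimal sketch of the same kind of conversion using only the existing character-based byUTF, assuming the bytes are first reinterpreted with a cast. The helper name toDchars is hypothetical and is not part of Phobos or of this PR; the cast performs no validation.

```d
import std.algorithm.comparison : equal;
import std.utf : byUTF;

// Hypothetical helper (not part of Phobos or this PR): reinterprets raw
// bytes as UTF-8 code units without copying or validating them, then
// lazily decodes them to code points.
auto toDchars(const(ubyte)[] bytes)
{
    auto codeUnits = cast(const(char)[]) bytes;
    return codeUnits.byUTF!dchar;
}

void main()
{
    // The same bytes as in the unittest above: "hell" followed by
    // the two-byte UTF-8 encoding of U+00F6.
    const(ubyte)[] bytes = [0x68, 0x65, 0x6c, 0x6c, 0xC3, 0xB6];
    assert(toDchars(bytes).equal("hell\u00F6"d));
}
```

The explicit cast is the extra step that accepting ubyte ranges directly would make unnecessary.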
Perhaps this should also accept ushort (assumed to be UTF-16) and uint (UTF-32)?
Of course, this can be implemented later on just as well.
ping @vporton