phobos

Support of byUTF for ubyte[] argument

Open vporton opened this issue 4 years ago • 6 comments

For the reasons outlined in the discussion of that pull request, we concluded that we need to be able to call byUTF on an argument of type ubyte[]. This PR implements exactly that.

As a reminder, this is necessary:

  1. to support converting a chunk read from a file;
  2. to avoid validating a string of chars twice: once when it is created and again when it is converted by byUTF (this simplifies programming and improves efficiency).

vporton Oct 23 '19 20:10

Thanks for your pull request and interest in making D better, @vporton! We are looking forward to reviewing it, and you should be hearing from a maintainer soon. Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the coverage diff by visiting the details link of the codecov check)
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally

If you don't have a local development environment set up, you can use Digger to test this PR:

dub fetch digger
dub run digger -- build "master + phobos#7249"

dlang-bot Oct 23 '19 20:10

What happens for a range of ubytes that aren't valid UTF-8? Is this covered by a test?

lesderid Oct 23 '19 20:10

I read the discussion on the other PR and AIUI it called for something that took ubyte[] and lazily produced validated char[]. This doesn't seem to be it. Could you please explain what you're trying to accomplish? Thanks.

atilaneves Oct 30 '19 10:10

> I read the discussion on the other PR and AIUI it called for something that took ubyte[] and lazily produced validated char[]. This doesn't seem to be it. Could you please explain what you're trying to accomplish? Thanks.

The added unittest explains it: assert((cast(ubyte[]) [0x68, 0x65, 0x6c, 0x6c, 0xC3, 0xB6]).byUTF!char().equal(['h', 'e', 'l', 'l', 0xC3, 0xB6]));

You pass in a range of ubyte and get a range of char. Handy, as no separate step to deal with autodecoding is needed.
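
For comparison, here is a minimal sketch of the kind of separate step that is needed without this PR, assuming std.string.assumeUTF is used to reinterpret the bytes as chars before calling byUTF. The helper name decodeBytes is hypothetical and only for illustration; it is not part of the PR.

```d
import std.algorithm.comparison : equal;
import std.array : array;
import std.string : assumeUTF;
import std.utf : byUTF;

// Hypothetical helper (not from the PR): treat raw bytes as UTF-8
// and decode them into UTF-32 code points.
dchar[] decodeBytes(const(ubyte)[] bytes)
{
    // assumeUTF reinterprets const(ubyte)[] as const(char)[] without copying.
    auto chars = bytes.assumeUTF;
    // byUTF!dchar decodes lazily; array collects the result.
    return chars.byUTF!dchar.array;
}

void main()
{
    // The same bytes as in the unittest quoted above: "hell" followed by U+00F6.
    const(ubyte)[] bytes = [0x68, 0x65, 0x6c, 0x6c, 0xC3, 0xB6];
    assert(decodeBytes(bytes).equal("hell\u00F6"d));
}
```

With byUTF accepting ubyte[] directly, the assumeUTF step in this sketch would no longer be needed.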

dukc Apr 27 '20 18:04

Perhaps this should also accept ushort (assumed to be UTF-16) and uint (UTF-32)?

Of course, this can be implemented later on just as well.

dukc Apr 27 '20 19:04

ping @vporton

RazvanN7 Apr 21 '21 09:04