utf8-all
Is there a way to use utf8::all in other third-party modules?
For example, I am using the Path::Tiny module, and that module calls readdir. Is there a way to make it respect utf8::all without modifying its source code?
Currently, I have to modify the code and add use utf8::all right after the package Path::Tiny declaration.
If you use utf8::all in your own code, the changes it brings are (also) active for Path::Tiny (unless it overrides what is overridden by utf8::all again, but that is not the case as far as I can tell).
If this doesn't work for you, can you show me an example where it fails?
My folder structure is like this:
├── test.pl
└── 中文目录
    └── 中文文件.txt
It outputs correct UTF-8 filenames with the following code (when debugging, I saw it enter _utf8_readdir):
use v5.18;
use utf8::all 'GLOBAL';
opendir DIR, "./中文目录";
for (readdir DIR) {
say $_;
}
closedir DIR;
I got
.
..
中文文件.txt
but it goes wrong when I try to use Path::Tiny (it never enters the replacement sub for readdir):
use v5.18;
use utf8::all 'GLOBAL';
use Path::Tiny;
my $path = path("./中文目录");
for ( $path->children ) {
say $_;
}
I got
中文目录/ä¸ææ件.txt
As far as I can see, the children sub in Path::Tiny doesn't respect the replacement of readdir. The $target that utf8::all replaces subs for is always main, unless I modify the code of Path::Tiny to something like the following.
# /Library/Perl/5.18/Path/Tiny.pm
.
.
.
package Path::Tiny;
use utf8::all 'global';
.
.
.
If you use utf8::all in your own code, the changes that are brought by it are (also) active for Path::Tiny
Pretty sure it isn't; the only global effects are the UTF-8 handling of STDIN/STDOUT/STDERR and @ARGV. Making it global would be possible, but that kind of monkey patching is rarely a good idea.
@Leont, so what's the best way to make a third-party package respect utf8::all? It is annoying to have to modify code from CPAN just to enable the UTF-8 feature.
Hmm, @Leont may have a point: while readdir is overridden (just like readlink and glob), it may be that the override is limited to the current module. I'll have a look at the issue and see if I can come up with something that works and isn't too hard to implement in your own code.
I've been trying a couple of things (e.g. using Import::Into, an eval construct, overriding the readdir function manually, etc.), but I can't get it to work ☹️. @Leont do you have a suggestion?
I've been trying a couple of things (e.g. using Import::Into, an eval construct, overriding the readdir function manually, etc.), but I can't get it to work ☹️
The UTF-8 readdir is guarded by the lexical pragma, and since the scope that calls it (inside Path::Tiny) doesn't have the pragma enabled, the override will be a no-op. This is intentional.
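To illustrate what "guarded by the lexical pragma" means, here is a minimal sketch of the general mechanism described in perlpragma. It is not utf8::all's actual code: the package My::Utf8Hint and the function maybe_decode are hypothetical names made up for this example. The function looks at the hint hash of the scope that called it and only decodes when that scope enabled the pragma.

```perl
use strict;
use warnings;
use Encode ();

package My::Utf8Hint;    # hypothetical pragma, for illustration only

# Runs at compile time and marks the scope that is currently being compiled.
sub import { $^H{'My::Utf8Hint'} = 1 }

# Decodes UTF-8 bytes only if the *calling* scope enabled the pragma.
sub maybe_decode {
    my ($bytes) = @_;
    my $hints = (caller 0)[10];    # hint hash the caller was compiled with
    return $bytes unless $hints && $hints->{'My::Utf8Hint'};
    return Encode::decode('UTF-8', $bytes);
}

package main;

my $bytes = "\xE4\xB8\xAD\xE6\x96\x87";    # UTF-8 bytes of two CJK characters

my $with = do {
    # Stands in for "use My::Utf8Hint", since the package lives in this file.
    BEGIN { My::Utf8Hint->import }
    My::Utf8Hint::maybe_decode($bytes);
};

# Same function, but called from a scope without the pragma: a no-op.
my $without = My::Utf8Hint::maybe_decode($bytes);

printf "with pragma:    %d characters\n", length $with;
printf "without pragma: %d bytes\n",      length $without;
```

A call made from inside another module behaves like the second call here, because that module's scope was compiled without the hint.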
E.g., I could solve it by rewriting Path::Tiny as a new module, PathJ::TinyJ. FYI. I am glad if this is helpful, and sorry if it is off topic.
Test script utf8_test.pl:
#!/usr/bin/perl -W
# -C64
#@ test utf-8 chinese japanese
use strict;
use v5.26;
use PathJ::TinyJ;
# use Path::Tiny;
use utf8::all 'GLOBAL';
#use Encode::JP;
# use locale;
# use utf8;
my $path = path("./中文目录");
for ( $path->children ) {
say $_;
}
__END__
Path::Tiny changed to PathJ::TinyJ, file PathJ/TinyJ.pm:
use 5.008001;
use strict;
use warnings;
use utf8::all;
package PathJ::TinyJ;
# ABSTRACT: File path utility
our $VERSION = '0.104J';#JAPANESE
sub children {
my ( $self, $filter ) = @_;
my $dh;
opendir $dh, $self->[PATH] or $self->_throw('opendir');
my @children = readdir $dh;
closedir $dh or $self->_throw('closedir');
use Encode;#★★★★★★★★★★★★ change: added
@children = map { decode('utf-8',$_ ) } @children;#★★★★★★★★★★★★ change: added
if ( not defined $filter ) {
New TinyJ.pm:
$ ./utf8_test.pl
中文目录/中文文件.txt
中文目录/日本語文章.txt
Normal Tiny.pm (valid pathname, broken basename):
$ ./utf8_test.pl
中文目录/ä¸ææ件.txt
中文目录/æ¥æ¬èªæç« .txt
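The decode step added to children above can also be applied in your own code, without copying the module. The following is only a sketch using core modules: decoded_children is a hypothetical helper, and it assumes the filesystem stores names as UTF-8 bytes. It creates a temporary directory with one file so that it can run anywhere.

```perl
use strict;
use warnings;
use utf8;                                   # string literals below are UTF-8
use Encode qw(decode encode);
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical helper: return the decoded entry names of $dir,
# skipping "." and "..".
sub decoded_children {
    my ($dir) = @_;
    opendir my $dh, $dir or die "opendir $dir: $!";
    my @names = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh or die "closedir $dir: $!";
    # readdir returns raw bytes; assume they are UTF-8 and decode them.
    return map { decode('UTF-8', $_) } @names;
}

# Set up a scratch directory containing one file with a non-ASCII name.
my $dir  = tempdir(CLEANUP => 1);
my $name = '中文文件.txt';
my $file = File::Spec->catfile($dir, encode('UTF-8', $name));
open my $fh, '>', $file or die "open $file: $!";
close $fh or die "close $file: $!";

my @children = decoded_children($dir);

binmode STDOUT, ':encoding(UTF-8)';         # encode characters on output
print "$_\n" for @children;
```

The trade-off is the same as in the PathJ::TinyJ approach: the decoding is explicit and local, so nothing outside the helper changes behaviour.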
@hitobashira, the source code of utf8::all replaces readdir only for the calling module, which is the "main" package most of the time.
Actually, if you just add one line to Path::Tiny telling it to use utf8::all, the problem is solved.
But it would be very annoying to have to find out whether a third-party module uses readdir or similar, and then modify it so that it respects utf8::all.
So my question is: is there a way to provide API support within utf8::all to turn UTF-8 support on and off in a truly "global" way? Currently, use utf8::all 'global' is not actually global, which is confusing.
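For readers wondering why an override installed for the calling package does not reach other packages, here is a small sketch of the standard per-package override technique from perlsub. It is not utf8::all's actual code: My::Override and Other are hypothetical packages, and the replacement simply returns a marker string instead of decoding anything.

```perl
use strict;
use warnings;

package My::Override;    # hypothetical module, for illustration only

sub import {
    my $target = caller;                 # the package that imports us
    no strict 'refs';
    # Installing the sub from another package at compile time makes it
    # override the builtin, but only inside $target.
    *{"${target}::readdir"} = sub (*) { return 'overridden' };
}

package Other;           # stands in for a third-party module

sub list_dir {
    my ($dir) = @_;
    opendir my $dh, $dir or die "opendir $dir: $!";
    my @entries = readdir $dh;           # still the builtin readdir
    closedir $dh;
    return @entries;
}

package main;

# Stands in for "use My::Override", since the package lives in this file.
BEGIN { My::Override->import }

opendir my $dh, '.' or die "opendir .: $!";
my @main_result = readdir $dh;           # resolves to the replacement sub
closedir $dh;

my @other_result = Other::list_dir('.');

print "main:  @main_result\n";
print "Other: ", scalar(@other_result), " real entries\n";
```

Only code compiled in package main sees the replacement; Other keeps calling the core function, which mirrors what happens with Path::Tiny in the reports above.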
@caileiYanxiu I would certainly be happy if an Aladdin's magic lamp for handling UTF-8 in Perl 5 were available.