Template2
use locale in template::filters [rt.cpan.org #119992]
Migrated from rt.cpan.org#119992 (status was 'new')
Requestors:
From [email protected] on 2017-01-26 08:02:45:
Seeing warnings like
Wide character (U+434) in substitution (s///) at /usr/local/lib/x86_64-linux-gnu/perl/5.22.1/Template/Filters.pm line 62
Perldiag tells:
====
Wide character in %s
(S utf8) Perl met a wide character (>255) when it wasn't expecting one. This warning is by default on for I/O (like print). The easiest way to quiet this warning is simply to add the :utf8 layer to the output, e.g. binmode STDOUT, ':utf8' . Another way to turn off the warning is to add no warnings 'utf8'; but that is often closer to cheating. In general, you are supposed to explicitly mark the filehandle with an encoding, see open and binmode.
====
If I remove 'use locale' from T::F, warning gone.
So why is it there and how to use it with unicode strings?
Hi, I don't have any example code here, but the issue is that you haven't set a UTF-8 layer on your output handle. Do the following, replacing STDOUT with whatever your filehandle is:
binmode(STDOUT, ":utf8");
Please re-open this issue; it is still relevant.
No, setting binmode to utf8 with binmode(STDOUT, ":utf8"); does not help.
But removing the use locale line from the T::F code does help.
"Since locales are more a form of nationalization than of internationalization, the use of locales may interact oddly with Unicode." (c) O'Reilly & Associates, "Programming Perl"
docstore.mik.ua/orelly/perl2/prog/ch31_14.htm
I will provide a PoC if needed.
@toddr
Note: the use locale line was added as part of the d56b9a8c43c1876ac655d0d87efc9c459ab5dc65 change.
from the changelog:
* Added "use locale" to Template::Filters to enable locale-specific
filters.
It seems that use locale was added to fix these two tickets: http://rt.cpan.org/Ticket/Display.html?id=9094 and http://rt.cpan.org/Ticket/Display.html?id=5695
@evengar2008 could you please provide steps to reproduce the issue you noticed? Thanks.
@evengar2008 can you confirm whether this fixes your issue?
diff --git a/lib/Template/Filters.pm b/lib/Template/Filters.pm
index fdd82b85..a0d1de86 100644
--- a/lib/Template/Filters.pm
+++ b/lib/Template/Filters.pm
@@ -57,9 +57,9 @@ our $FILTERS = {
     'ucfirst'    => sub { ucfirst $_[0] },
     'lcfirst'    => sub { lcfirst $_[0] },
     'stderr'     => sub { print STDERR @_; return '' },
-    'trim'       => sub { for ($_[0]) { s/^\s+//; s/\s+$// }; $_[0] },
+    'trim'       => sub { use bytes; for ($_[0]) { s/^\s+//; s/\s+$// }; $_[0] },
     'null'       => sub { return '' },
-    'collapse'   => sub { for ($_[0]) { s/^\s+//; s/\s+$//; s/\s+/ /g };
+    'collapse'   => sub { use bytes; for ($_[0]) { s/^\s+//; s/\s+$//; s/\s+/ /g };
                          $_[0] },
     # dynamic filters
@atoomic
The documentation at https://perldoc.perl.org/bytes says that "Use of this module for anything other than debugging purposes is strongly discouraged". It does not work well with Unicode, and trim with the use bytes pragma won't work correctly; for example, it won't trim Unicode spaces.
Even though this solution removes some warnings, it is unreliable and not fully compatible with Unicode.
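To make the concern about Unicode spaces concrete, here is a minimal sketch that is independent of Template Toolkit. The helper names trim_chars and trim_bytes are hypothetical; they only mirror the two versions of the trim substitutions in the diff above. With default character semantics, \s matches Unicode whitespace such as EM SPACE (U+2003) in a character string, so trim_chars strips it. Under use bytes the same regex is applied to the UTF-8 encoded bytes, so the EM SPACE is not expected to be removed; run the script to compare the two counts on your own perl.

```perl
use strict;
use warnings;
use utf8;    # the literal below is a character string

# Trim with Perl's default character semantics.
sub trim_chars {
    my ($text) = @_;
    $text =~ s/^\s+//;
    $text =~ s/\s+$//;
    return $text;
}

# The same substitutions under the bytes pragma, as in the proposed patch.
sub trim_bytes {
    my ($text) = @_;
    use bytes;
    $text =~ s/^\s+//;
    $text =~ s/\s+$//;
    return $text;
}

# EM SPACE (U+2003) on both sides of a Cyrillic word.
my $sample = "\x{2003}булок\x{2003}";

printf "chars: %d characters left\n", length( trim_chars($sample) );
printf "bytes: %d characters left\n", length( trim_bytes($sample) );
```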
Working on a PoC now. The problem is not reproducible with a pure TT call, only within Catalyst::View::TT processing.
@atoomic here's the PoC
test.pl
use POSIX qw( setlocale LC_ALL );
use warnings;
use strict;
use utf8;
setlocale( LC_ALL, 'POSIX' );
use Template::Filters;
use Template;
my $tt = Template->new({
INCLUDE_PATH => 'templates/regru/',
FILTERS => $Template::Filters::FILTERS,
INTERPOLATE => 1,
ENCODING => 'UTF-8',
}) || die "$Template::ERROR\n";
my $output;
my $text = 'Съешь ещё этих мягких французских булок, да выпей же чаю ';
$tt->process( 'test.inc', { test => $text }, \$output );
test.inc
<p>[% test | trim %]</p>
<p>[% test | ucfirst %]</p>
It seems that code that calls setlocale itself and imports T::F is not compatible with the use locale pragma in T::F.
If I remove the use locale pragma from T::F, everything works well.
I just rechecked the solution with use bytes in the T::F trim filter. It does not work; the warning about wide characters in substitution still appears.
The issue is that if the locale is set to POSIX somewhere else, all string operations on Unicode strings in T::F are affected, because the use locale pragma is enabled within T::F.
This can be fixed either by removing the use locale pragma from T::F or by switching to a UTF-8 locale in the outer scope where T::F is imported.