perl6-lwp-simple
Allow for moar control over character encoding.
As of now, Rakudo supports only a limited set of encodings. However, some websites still use encodings outside that set, for example:
$ curl -s -I http://www.google.pl/ | grep charset=
Content-Type: text/html; charset=ISO-8859-2
Excerpt from https://github.com/rakudo/rakudo/blob/nom/src/core/Rakudo/Internals.pm:
my $encodings := nqp::hash(
    # fast mapping for identicals
    'utf8',            'utf8',
    'utf16',           'utf16',
    'utf32',           'utf32',
    'ascii',           'ascii',
    'iso-8859-1',      'iso-8859-1',
    'windows-1252',    'windows-1252',
    # with dash
    'utf-8',           'utf8',
    'utf-16',          'utf16',
    'utf-32',          'utf32',
    # according to http://de.wikipedia.org/wiki/ISO-8859-1
    'iso_8859-1:1987', 'iso-8859-1',
    'iso_8859-1',      'iso-8859-1',
    'iso-ir-100',      'iso-8859-1',
    'latin1',          'iso-8859-1',
    'latin-1',         'iso-8859-1',
    'csisolatin1',     'iso-8859-1',
    'l1',              'iso-8859-1',
    'ibm819',          'iso-8859-1',
    'cp819',           'iso-8859-1',
);
It may be useful to:
- tell LWP::Simple not to tamper with the encoding (e.g. if you want to pipe the output to another process or print the response body to a terminal)
- force an encoding (if you want to process the response body further as a string in your p6 script), for example when the encoding isn't declared in the response but you know which one it is.
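To make the two options concrete, here is a minimal sketch of what such a switch could look like. The `body-as` helper and its `:raw-bytes` and `:force-encoding` parameters are hypothetical names chosen for illustration; they are not part of LWP::Simple's API. The UTF-8 fallback is likewise an assumption, and the sketch only works for encodings Rakudo can already decode.

```perl6
# Hypothetical helper, NOT part of LWP::Simple: decide how a response body
# is handed back to the caller.
sub body-as(Blob $raw, Str :$force-encoding, Bool :$raw-bytes = False) {
    # Option 1: leave the bytes untouched, e.g. to pipe them to another process.
    return $raw if $raw-bytes;

    # Option 2: decode with the encoding the caller insists on.
    # Falling back to UTF-8 when none is given is an assumption of this sketch.
    return $raw.decode($force-encoding // 'utf-8');
}

# "café" encoded as ISO-8859-1 (0xE9 is "é" in that encoding).
my $bytes = Blob.new(0x63, 0x61, 0x66, 0xE9);

my $untouched = body-as($bytes, :raw-bytes);                   # still a Blob
my $text      = body-as($bytes, :force-encoding<iso-8859-1>);  # a Str

say $untouched.elems;
say $text;
```

Keeping both behaviours behind explicit named arguments means the default path stays unchanged for existing callers, while anyone who knows better than the `Content-Type` header can say so.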
It looks like something to this effect was added in eb98a2c1.