p5-sisimai

Cannot create object from UTF-8 mail address at Sisimai::Address

Open azumakuniyuki opened this issue 9 years ago • 4 comments

https://github.com/Exim/exim/blob/master/test/mail/4223.%E0%A4%AF%E0%A4%B9%E0%A4%B2%E0%A5%8B%E0%A4%97%E0%A4%B9%E0%A4%BF%E0%A4%A8%E0%A5%8D%E0%A4%A6%E0%A5%80%E0%A4%95%E0%A5%8D%E0%A4%AF%E0%A5%8B%E0%A4%82%E0%A4%A8%E0%A4%B9%E0%A5%80%E0%A4%82%E0%A4%AC%E0%A5%8B%E0%A4%B2%E0%A4%B8%E0%A4%95%E0%A4%A4%E0%A5%87%E0%A4%B9%E0%A5%88%E0%A4%82

$VAR1 = {
          'feedbacktype' => '',
          'deliverystatus' => '5.0.0',
          'rhost' => 'the.local.host.name',
          'timestamp' => 920367873,
          'diagnostictype' => 'SMTP',
          'addresser' => 'यहलोगहिन्दीक्योंनहींबोलसकतेहैं@japanese.なぜみんな日本語を話してくれないのか.local',
          'listid' => '',
          'diagnosticcode' => 'host 127.0.0.1 [127.0.0.1]',
          'reason' => '',
          'subject' => 'test',
          'action' => 'failed',
          'lhost' => 'the.local.host.name',
          'alias' => '',
          'timezoneoffset' => '+0000',
          'recipient' => '[email protected]',
          'smtpcommand' => '',
          'softbounce' => 0,
          'messageid' => '[email protected]',
          'smtpagent' => 'Exim'
        };

Sisimai::Address->new cannot create an object from the value of "addresser" above.

azumakuniyuki, Nov 07 '15 05:11

Sisimai::Address->parse() could not parse the UTF-8 address. https://github.com/azumakuniyuki/p5-Sisimai/blob/master/lib/Sisimai/Address.pm#L106

 next if $e =~ m/[^\x20-\x7e]/;

parse() skips any candidate containing characters outside printable ASCII, so it cannot handle an email address that is not encoded with Punycode.
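To make the effect of that check concrete, here is a minimal sketch that applies the same regular expression to two candidate strings. The helper name `is_skipped` and the ASCII sample address are hypothetical and are not part of Sisimai; only the pattern itself is taken from the line quoted above.

```perl
use strict;
use warnings;
use utf8;    # the source below contains a literal non-ASCII character

# Hypothetical helper (not part of Sisimai): returns 1 when the candidate
# contains a character outside printable ASCII, which is the condition
# under which the quoted line does `next`.
sub is_skipped {
    my $e = shift;
    return $e =~ m/[^\x20-\x7e]/ ? 1 : 0;
}

binmode STDOUT, ':encoding(UTF-8)';
for my $candidate ('user@example.jp', '🐈@neko.nyaan.jp') {
    printf "%s => %s\n", $candidate, is_skipped($candidate) ? 'skipped' : 'kept';
}
```

Any address with a non-ASCII local part or domain hits the `next`, so it never reaches the code that builds the object.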

azumakuniyuki, Nov 07 '15 05:11

For example, "<🐈@neko.nyaan.jp>" should be encoded as "[email protected]" with Punycode. This issue will be closed soon.

azumakuniyuki, Nov 13 '15 07:11

RFC 6532 extends RFC 2045 to allow raw UTF-8 in the address fields of message headers (so Punycode should not be used anyway).

Additionally, RFC 6533 defines new "utf-8-addr-xtext" and "utf-8-addr-unitext" encodings to use UTF-8 addresses in delivery reports.
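As a rough illustration of the 7-bit "utf-8-addr-xtext" form, the sketch below replaces every character outside the QCHAR set (printable ASCII except space, "+", "=" and "\") with `\x{HEX}`. The helper name `to_addr_xtext` is hypothetical and is not a Sisimai API. The sketch assumes the input is an already decoded character string, and it does not validate the result against the full HEXPOINT grammar in RFC 6533.

```perl
use strict;
use warnings;
use utf8;    # the source below contains a literal non-ASCII character

# Hypothetical helper (not a Sisimai API): rewrite a decoded address into a
# 7-bit form in the style of utf-8-addr-xtext.
sub to_addr_xtext {
    my $address = shift;
    # Everything outside QCHAR becomes \x{HEX} with upper-case hex digits.
    $address =~ s/([^\x21-\x2a\x2c-\x3c\x3e-\x5b\x5d-\x7e])/sprintf('\\x{%02X}', ord $1)/ge;
    return $address;
}

print to_addr_xtext('🐈@neko.nyaan.jp'), "\n";    # \x{1F408}@neko.nyaan.jp
```

The result is plain ASCII, so it would pass a printable-ASCII check, but a parser would still need to decode the `\x{...}` sequences to recover the original address.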

hatukanezumi, Dec 28 '15 04:12

Thanks for the comment. I had not been following these RFCs. A short while ago, I added 3 emails that have "Cat" in the local part of the recipient address.

azumakuniyuki, Dec 28 '15 05:12