text-vcard icon indicating copy to clipboard operation
text-vcard copied to clipboard

Hardcoded "\x0D\x0A" fails for parsing vCard on Windows

Open tmhall opened this issue 10 years ago • 8 comments

I was attempt to help a user on SO: http://stackoverflow.com/q/22790520/1733163

However, I was unable to install this module and it's dependencies without errors on Strawberry Perl 5.18.2. In the end I forced install in order to debug the module.

I traced one issue to Line 315 of Text::vCard::Addressbook my @lines = split "\x0D\x0A", $text;

The above will file to split a file on Windows because \x0D is stripped by perlio in :crlf mode. The only way to bypass this issue is to open import the file in binmode and pass it to he constructor as a string. The following straights:

use strict;
use warnings;
use utf8;

use vCard;

# create the object
my $vcard = vCard->new;

binmode(\*DATA);
my $string = do { local $/; <DATA> };

$vcard->load_string($string);

print $vcard->as_string();

__DATA__
BEGIN:VCARD
VERSION:3.0
N;LANGUAGE=de:name;first
FN:first name
ORG:company
TEL;TYPE=work:+49 (0000) 123456
ADR;TYPE=work:;;Strasse 1;Ortschaft arbeit;;56789;Deutschland
ADR;TYPE=home;PREF:;;Strasse 2;Ortschaft 4;;12345;Deutschland
EMAIL:[email protected]
URL:www.a.de
END:VCARD

Without binmode, the output is:

BEGIN VCARD without matching END at C:/strawberry/perl/site/lib/Text/vFile/asData.pm line 137,

With binmode, the output is:

BEGIN:VCARD
VERSION:4.0
END:VCARD

Obviously there are still more issues to resolve, but I at least wanted to submit this. If I have time to follow up with this issue, or even create a full patch, I will submit those later.

tmhall avatar Apr 14 '14 23:04 tmhall

Hi @tmhall! Thanks for reporting this. It makes me aware of a few issues.

The vcard RFC requires every line to end with a \r\n. We should have a more helpful error if the string is not properly formatted. Do you still have a problem if you use $vcard->load_file()?

kablamo avatar Apr 15 '14 01:04 kablamo

Also load_string() does not do what the documentation says it does. Here is code that should behave the way you want:

my $new_vcard = $vcard->load_string($string);
$new_vcard->as_string();

This is not intuitive behavior and someone should look into fixing that.

kablamo avatar Apr 15 '14 01:04 kablamo

Ok, I submitted 2 pull requests which should address a few of the issues here. I still need to create a better error message for strings that are missing the \r\n pattern. But I'm heading off to sleep now.

kablamo avatar Apr 15 '14 03:04 kablamo

@kablamo boom: http://www.cpantesters.org/cpan/report/6e1d7369-6c10-1014-b8e4-5db800d6f3f7

ranguard avatar Apr 28 '14 19:04 ranguard

@kablamo less boom, but still boom http://www.cpantesters.org/cpan/report/1d3fc6ca-6bfa-1014-9538-937384992293

ranguard avatar May 02 '14 14:05 ranguard

@kablamo different boom! http://www.cpantesters.org/cpan/report/d9bfc4e4-7489-1014-b7e9-e9c63567e6ce

ranguard avatar May 14 '14 07:05 ranguard

@kablamo Getting there! - but now UTF8 issue for Windoz: http://www.cpantesters.org/cpan/report/33e1ea25-6bfa-1014-afed-754eb9e98ec4

ranguard avatar Jun 15 '14 17:06 ranguard

The hardcoded CRLF problem is not limited to Windows. I just encountered it with vcards downloaded from an official phone directory.

I would suggest to follow Jon Postel's "Robustness Principle" from RFC 761:

be conservative in what you do, be liberal in what you accept

and accept both CRLF and LF line endings.

My quick fix was this, in case someone finds it useful:

$ diff -u /usr/share/perl5/Text/vFile/asData.pm.orig /usr/share/perl5/Text/vFile/asData.pm
--- /usr/share/perl5/Text/vFile/asData.pm.orig	2013-01-28 13:32:13.000000000 +0100
+++ /usr/share/perl5/Text/vFile/asData.pm	2020-09-30 16:15:51.850221688 +0200
@@ -27,9 +27,9 @@
 sub _unwrap_lines {
     my $self = shift;
     my @lines;
-    for (@_) {
+    for (map {split /\R/} @_) {
         my $line = $_; # $_ may be readonly
-        $line =~ s{[\r\n]+$}{}; # lines SHOULD end CRLF
+        $line =~ s{\R+$}{}; # lines SHOULD end CRLF (but let's not be so strict)
         if ($line =~ /^[ \t](.*)/) { # Continuation line (RFC Sect. 4.1)
             die "Continuation line, but no preceding line" unless @lines;
             $lines[-1] .= $1;

mivk avatar Sep 30 '20 14:09 mivk