email-reply-parser

Gmail reply fails to parse

Open sergioisidoro opened this issue 9 years ago • 8 comments

Result of parsing a message

This is the third message

2014-12-11 23:51 GMT+02:00 ******** *********@gmail.com:

The last line is part of the reply header, so it should not appear in the parsed reply.

sergioisidoro avatar Dec 11 '14 22:12 sergioisidoro
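To make the failure concrete, here is a minimal sketch of how a date-first header such as `2014-12-11 23:51 GMT+02:00 Name <address>:` could be recognized. It is not the library's implementation; the pattern `DATE_FIRST_HEADER`, the helper `strip_date_first_quote`, and the sample address are hypothetical and only illustrate the header shape reported above.

```python
import re

# Hypothetical pattern for headers shaped like
# "2014-12-11 23:51 GMT+02:00 Name <addr>:" (date first, trailing colon).
DATE_FIRST_HEADER = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{1,2}:\d{2} GMT[+-]\d{2}:\d{2} .+:$"
)


def strip_date_first_quote(text: str) -> str:
    """Return the text above the first date-first quote header, if any."""
    kept = []
    for line in text.splitlines():
        if DATE_FIRST_HEADER.match(line.strip()):
            break  # everything from the header onwards is quoted material
        kept.append(line)
    return "\n".join(kept).strip()


sample = (
    "This is the third message\n"
    "\n"
    "2014-12-11 23:51 GMT+02:00 Someone <someone@example.com>:\n"
    "\n"
    "> This is the second message\n"
)
print(strip_date_first_quote(sample))
```

A pattern this strict would only cover the one locale shown in this report; other Gmail locales format the date differently.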

Mind posting the full input message? (Feel free to censor the private bits.)

bryanhelmig avatar Dec 11 '14 23:12 bryanhelmig

MIME-Version: 1.0
Received: by 10.112.49.111 with HTTP; Thu, 11 Dec 2014 13:54:00 -0800 (PST)
In-Reply-To: <CAO0FGMcY=uDt-0h02zP6b_a7WVqJ2zJe48MqOxBLKERc2guL2Q@mail.gmail.com>
References: <CAO0FGMdUe2Y=4bnhaWAsZDH92FGdsd6TupcbVgFhFw1YHzAh5Q@mail.gmail.com>
    <CAO0FGMcY=uDt-0h02zP6b_a7WVqJ2zJe48MqOxBLKERc2guL2Q@mail.gmail.com>
Date: Thu, 11 Dec 2014 23:54:00 +0200
Delivered-To: *******@gmail.com
Message-ID: <CAO0FGMfwebGfaw7GmXCdnWY2L0ep3VxK6JY8MhmuO6CpC275Cg@mail.gmail.com>
Subject: Re: [TAG1] [TAG2][tag3]This is an email thread
From: =?UTF-8?Q?S=C3=A9rgio_Isidoro?= <******@gmail.com>
To: Sergio Isidoro <*****@gmail.com>
Content-Type: multipart/alternative; boundary=001a11c36d045c46d00509f7d0f1

--001a11c36d045c46d00509f7d0f1
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

This is the third message

2014-12-11 23:51 GMT+02:00 S=C3=A9rgio Isidoro <******@gmail.com>:

> This is the second message
>
> 2014-12-11 23:51 GMT+02:00 S=C3=A9rgio Isidoro <******@gmail.com>:
>
>> This is the first message
>>
>> --
>> S=C3=A9rgio Miguel Adelino Isidoro
>> Instituto Superior T=C3=A9cnico - ULisboa
>> School of Science - Aalto University
>>
>
>
>
> --
> S=C3=A9rgio Miguel Adelino Isidoro
> Instituto Superior T=C3=A9cnico - ULisboa
> School of Science - Aalto University
>



--=20
S=C3=A9rgio Miguel Adelino Isidoro
Instituto Superior T=C3=A9cnico - ULisboa
School of Science - Aalto University

--001a11c36d045c46d00509f7d0f1
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr">This is the third message</div><div class=3D"gmail_extra">=
<br><div class=3D"gmail_quote">2014-12-11 23:51 GMT+02:00 S=C3=A9rgio Isido=
ro <span dir=3D"ltr">&lt;<a href=3D"mailto:******@gmail.com" target=3D"=
_blank">******@gmail.com</a>&gt;</span>:<br><blockquote class=3D"gmail_=
quote" style=3D"margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1=
ex"><div dir=3D"ltr">This is the second message</div><div class=3D"HOEnZb">=
<div class=3D"h5"><div class=3D"gmail_extra"><br><div class=3D"gmail_quote"=
>2014-12-11 23:51 GMT+02:00 S=C3=A9rgio Isidoro <span dir=3D"ltr">&lt;<a hr=
ef=3D"mailto:******@gmail.com" target=3D"_blank">******@gmail.com</=
a>&gt;</span>:<br><blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .=
8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir=3D"ltr">This is t=
he first message<span><font color=3D"#888888"><br clear=3D"all"><div><br></=
div>-- <br><div><div dir=3D"ltr">S=C3=A9rgio Miguel Adelino Isidoro=C2=A0<d=
iv>Instituto Superior T=C3=A9cnico - ULisboa<br></div><div>School of Scienc=
e - Aalto University<br></div></div></div>
</font></span></div>
</blockquote></div><br><br clear=3D"all"><div><br></div>-- <br><div><div di=
r=3D"ltr">S=C3=A9rgio Miguel Adelino Isidoro=C2=A0<div>Instituto Superior T=
=C3=A9cnico - ULisboa<br></div><div>School of Science - Aalto University<br=
></div></div></div>
</div>
</div></div></blockquote></div><br><br clear=3D"all"><div><br></div>-- <br>=
<div class=3D"gmail_signature"><div dir=3D"ltr">S=C3=A9rgio Miguel Adelino =
Isidoro=C2=A0<div>Instituto Superior T=C3=A9cnico - ULisboa<br></div><div>S=
chool of Science - Aalto University<br></div></div></div>
</div>

--001a11c36d045c46d00509f7d0f1--

sergioisidoro avatar Dec 11 '14 23:12 sergioisidoro
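A raw message like the one above has to be reduced to its decoded `text/plain` part before any reply parsing makes sense. The following is a rough sketch using only Python's standard `email` package; the helper name `plain_text_body` and the shortened sample message are assumptions made for illustration, not part of the original report.

```python
from email import message_from_string

# Shortened, hypothetical stand-in for the multipart message pasted above.
raw = (
    "MIME-Version: 1.0\n"
    "Content-Type: multipart/alternative; boundary=BOUNDARY\n"
    "\n"
    "--BOUNDARY\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "S=C3=A9rgio wrote the third message\n"
    "\n"
    "--BOUNDARY\n"
    "Content-Type: text/html; charset=UTF-8\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "<div>S=C3=A9rgio wrote the third message</div>\n"
    "\n"
    "--BOUNDARY--\n"
)


def plain_text_body(raw_message: str) -> str:
    """Return the decoded text/plain part of a message, or '' if none."""
    msg = message_from_string(raw_message)
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            # decode=True undoes the quoted-printable transfer encoding.
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace")
    return ""


body = plain_text_body(raw)
print(body.strip())
```

The decoded body, rather than the raw MIME source, is what would then be handed to the reply parser.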

Thanks!

bryanhelmig avatar Dec 11 '14 23:12 bryanhelmig

Same issue here, I guess:

------=_NextPart_001_0014_01D015FA.14DAB830
Content-Type: multipart/alternative;
    boundary="----=_NextPart_002_0015_01D015FA.14DAB830"


------=_NextPart_002_0015_01D015FA.14DAB830
Content-Type: text/plain;
    charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

1.       ALAQUES =3D> CATEGORIZAR CHAMADO P/ ATENDIMENTO N=CDVEL 1 =
(ATENDENTE
JANA)

2.       JANA =3D> ABRIR CHAMADO FILHO P/ ATENDIMENTO N=CDVEL 2

3.       ALAQUES =3D> FECHAR CHAMADO FILHO E ESCREVER AN=C1LISE E =
ORIENTA=C7=D5ES

4.       JANA =3D> ABRIR CHAMADO FILHO P/ F=C1BRICA DE SOFTWARE =
(ATENDENTE
MAGUIO)

5.       MAGUIO =3D> ENCAMINHAR CHAMADO FILHO PARA F=C1BRICA DE TESTES
(ATENDENTE RODRIGO)

6.       RODRIGO =3D> FECHAR CHAMADO FILHO

7.       JANA =3D> FECHAR CHAMADO PAI



=20


------=_NextPart_002_0015_01D015FA.14DAB830
Content-Type: text/html;
    charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

<html xmlns:v=3D"urn:schemas-microsoft-com:vml" =
xmlns:o=3D"urn:schemas-microsoft-com:office:office" =
xmlns:w=3D"urn:schemas-microsoft-com:office:word" =
xmlns:m=3D"http://schemas.microsoft.com/office/2004/12/omml" =
xmlns=3D"http://www.w3.org/TR/REC-html40"><head><meta =
http-equiv=3DContent-Type content=3D"text/html; =
charset=3Diso-8859-1"><meta name=3DGenerator content=3D"Microsoft Word =
15 (filtered medium)"><!--[if !mso]><style>v\:* =
{behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]--><style><!--
/* Font Definitions */
@font-face
    {font-family:"Cambria Math";
    panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
    {font-family:Calibri;
    panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
    {margin:0cm;
    margin-bottom:.0001pt;
    font-size:11.0pt;
    font-family:"Calibri","sans-serif";
    mso-fareast-language:EN-US;}
a:link, span.MsoHyperlink
    {mso-style-priority:99;
    color:#0563C1;
    text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
    {mso-style-priority:99;
    color:#954F72;
    text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
    {mso-style-priority:34;
    margin-top:0cm;
    margin-right:0cm;
    margin-bottom:0cm;
    margin-left:36.0pt;
    margin-bottom:.0001pt;
    font-size:11.0pt;
    font-family:"Calibri","sans-serif";
    mso-fareast-language:EN-US;}
span.EstiloDeEmail17
    {mso-style-type:personal-compose;
    font-family:"Calibri","sans-serif";
    color:windowtext;}
.MsoChpDefault
    {mso-style-type:export-only;
    font-family:"Calibri","sans-serif";
    mso-fareast-language:EN-US;}
@page WordSection1
    {size:612.0pt 792.0pt;
    margin:70.85pt 3.0cm 70.85pt 3.0cm;}
div.WordSection1
    {page:WordSection1;}
/* List Definitions */
@list l0
    {mso-list-id:1997103091;
    mso-list-type:hybrid;
    mso-list-template-ids:1780004568 68550671 68550681 68550683 68550671 =
68550681 68550683 68550671 68550681 68550683;}
@list l0:level1
    {mso-level-tab-stop:none;
    mso-level-number-position:left;
    text-indent:-18.0pt;}
@list l0:level2
    {mso-level-number-format:alpha-lower;
    mso-level-tab-stop:none;
    mso-level-number-position:left;
    text-indent:-18.0pt;}
@list l0:level3
    {mso-level-number-format:roman-lower;
    mso-level-tab-stop:none;
    mso-level-number-position:right;
    text-indent:-9.0pt;}
@list l0:level4
    {mso-level-tab-stop:none;
    mso-level-number-position:left;
    text-indent:-18.0pt;}
@list l0:level5
    {mso-level-number-format:alpha-lower;
    mso-level-tab-stop:none;
    mso-level-number-position:left;
    text-indent:-18.0pt;}
@list l0:level6
    {mso-level-number-format:roman-lower;
    mso-level-tab-stop:none;
    mso-level-number-position:right;
    text-indent:-9.0pt;}
@list l0:level7
    {mso-level-tab-stop:none;
    mso-level-number-position:left;
    text-indent:-18.0pt;}
@list l0:level8
    {mso-level-number-format:alpha-lower;
    mso-level-tab-stop:none;
    mso-level-number-position:left;
    text-indent:-18.0pt;}
@list l0:level9
    {mso-level-number-format:roman-lower;
    mso-level-tab-stop:none;
    mso-level-number-position:right;
    text-indent:-9.0pt;}
ol
    {margin-bottom:0cm;}
ul
    {margin-bottom:0cm;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext=3D"edit" spidmax=3D"1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext=3D"edit">
<o:idmap v:ext=3D"edit" data=3D"1" />
</o:shapelayout></xml><![endif]--></head><body lang=3DPT-BR =
link=3D"#0563C1" vlink=3D"#954F72"><div class=3DWordSection1><p =
class=3DMsoListParagraph style=3D'text-indent:-18.0pt;mso-list:l0 level1 =
lfo1'><![if !supportLists]><span style=3D'mso-list:Ignore'>1.<span =
style=3D'font:7.0pt "Times New =
Roman"'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
</span></span><![endif]>ALAQUES =3D&gt; CATEGORIZAR CHAMADO P/ =
ATENDIMENTO N=CDVEL 1 (ATENDENTE JANA)<o:p></o:p></p><p =
class=3DMsoListParagraph style=3D'text-indent:-18.0pt;mso-list:l0 level1 =
lfo1'><![if !supportLists]><span style=3D'mso-list:Ignore'>2.<span =
style=3D'font:7.0pt "Times New =
Roman"'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
</span></span><![endif]>JANA =3D&gt; ABRIR CHAMADO FILHO P/ ATENDIMENTO =
N=CDVEL 2<o:p></o:p></p><p class=3DMsoListParagraph =
style=3D'text-indent:-18.0pt;mso-list:l0 level1 lfo1'><![if =
!supportLists]><span style=3D'mso-list:Ignore'>3.<span =
style=3D'font:7.0pt "Times New =
Roman"'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
</span></span><![endif]>ALAQUES =3D&gt; FECHAR CHAMADO FILHO E ESCREVER =
AN=C1LISE E ORIENTA=C7=D5ES<o:p></o:p></p><p class=3DMsoListParagraph =
style=3D'text-indent:-18.0pt;mso-list:l0 level1 lfo1'><![if =
!supportLists]><span style=3D'mso-list:Ignore'>4.<span =
style=3D'font:7.0pt "Times New =
Roman"'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
</span></span><![endif]>JANA =3D&gt; ABRIR CHAMADO FILHO P/ F=C1BRICA DE =
SOFTWARE (ATENDENTE MAGUIO)<o:p></o:p></p><p class=3DMsoListParagraph =
style=3D'text-indent:-18.0pt;mso-list:l0 level1 lfo1'><![if =
!supportLists]><span style=3D'mso-list:Ignore'>5.<span =
style=3D'font:7.0pt "Times New =
Roman"'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
</span></span><![endif]>MAGUIO =3D&gt; ENCAMINHAR CHAMADO FILHO PARA =
F=C1BRICA DE TESTES (ATENDENTE RODRIGO)<o:p></o:p></p><p =
class=3DMsoListParagraph style=3D'text-indent:-18.0pt;mso-list:l0 level1 =
lfo1'><![if !supportLists]><span style=3D'mso-list:Ignore'>6.<span =
style=3D'font:7.0pt "Times New =
Roman"'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
</span></span><![endif]>RODRIGO =3D&gt; FECHAR CHAMADO =
FILHO<o:p></o:p></p><p class=3DMsoListParagraph =
style=3D'text-indent:-18.0pt;mso-list:l0 level1 lfo1'><![if =
!supportLists]><span style=3D'mso-list:Ignore'>7.<span =
style=3D'font:7.0pt "Times New =
Roman"'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; =
</span></span><![endif]>JANA =3D&gt; FECHAR CHAMADO PAI<o:p></o:p></p><p =
class=3DMsoNormal><span style=3D'mso-fareast-language:PT-BR'><img =
width=3D500 height=3D114 id=3D"Imagem_x0020_1" =
src=3D"cid:[email protected]" =
alt=3D"Assinatura_Andr=E9-Leite"></span><span =
style=3D'mso-fareast-language:PT-BR'><o:p></o:p></span></p><p =
class=3DMsoNormal><o:p>&nbsp;</o:p></p></div></body></html>
------=_NextPart_002_0015_01D015FA.14DAB830--

------=_NextPart_001_0014_01D015FA.14DAB830
Content-Type: image/png;
    name="image001.png"
Content-Transfer-Encoding: base64
Content-ID: <[email protected]>

[base64-encoded PNG data omitted]

------=_NextPart_001_0014_01D015FA.14DAB830--

1.       ALAQUES => CATEGORIZAR CHAMADO P/ ATENDIMENTO NÍVEL 1 (ATENDENTE
JANA)

2.       JANA => ABRIR CHAMADO FILHO P/ ATENDIMENTO NÍVEL 2

3.       ALAQUES => FECHAR CHAMADO FILHO E ESCREVER ANÁLISE E ORIENTAÇÕES

4.       JANA => ABRIR CHAMADO FILHO P/ FÁBRICA DE SOFTWARE (ATENDENTE
MAGUIO)

5.       MAGUIO => ENCAMINHAR CHAMADO FILHO PARA FÁBRICA DE TESTES
(ATENDENTE RODRIGO)

6.       RODRIGO => FECHAR CHAMADO FILHO

7.       JANA => FECHAR CHAMADO PAI

original-gmail

muriloreinert avatar Dec 15 '14 13:12 muriloreinert
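The quoted-printable part of this second message is declared as `iso-8859-1`, so the bytes must be decoded with that charset to recover the accented text shown in the rendered list. A small sketch with the standard `quopri` module, using one line copied from the message (the variable names are illustrative only):

```python
import quopri

# One list item from the message above, including its soft line break ("=\n").
encoded = (
    b"3.       ALAQUES =3D> FECHAR CHAMADO FILHO E ESCREVER AN=C1LISE E =\n"
    b"ORIENTA=C7=D5ES\n"
)

# Undo the transfer encoding first, then interpret the bytes as iso-8859-1.
decoded = quopri.decodestring(encoded).decode("iso-8859-1")
print(decoded.rstrip("\n"))
```

Decoding the same bytes as UTF-8 would fail or produce replacement characters, which is one way a parser can end up with a mangled body before quote detection even starts.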

Has support for this project stopped? I am also unable to parse anything.

graingerkid avatar May 14 '15 08:05 graingerkid

@sergioisidoro For me, the timestamp always begins with

On Tue, Nov 1, 2016 at 6:12 PM, name <[email protected]> wrote:

which fits the test case specified here: https://github.com/zapier/email-reply-parser/blob/master/test/emails/email_gmail.txt#L5

@bryanhelmig I wonder if this issue is outdated and can be closed, or is specific to certain email clients. Any ideas?

sungwoncho avatar Nov 16 '16 06:11 sungwoncho

I think it's client-specific. I would be willing to accept a PR to handle this type of reply line, but this issue is dated, so I would say close it and we can reopen if @sergioisidoro comes back.

kageurufu avatar Nov 16 '16 16:11 kageurufu

I'm using a Google Mail account and I get a parsing error on the third email (second reply). Here's my test:

Email 1

This is just a test.

Parsed reply (correct):

This is just a test.

Email 2 (Reply 1)

And this is a test response.

On Fri, Jan 5, 2018 at 12:34 PM, Adam Taylor <[email protected]>
wrote:

> This is just a test.
>

Parsed reply (correct):

And this is a test response.

Email 3 (Reply 2)

And this is a response to the test response.

On Fri, Jan 5, 2018 at 12:34 PM, Adam Taylor <[email protected]>
wrote:

> And this is a test response.
>
> On Fri, Jan 5, 2018 at 12:34 PM, Adam Taylor <[email protected]>
> wrote:
>
>> This is just a test.
>>
>
>

Parsed reply (incorrect):

And this is a response to the test response.

On Fri, Jan 5, 2018 at 12:34 PM, Adam Taylor <[email protected]>
wrote:

It's worth noting that GitHub's Ruby version has the same problem.

ataylor32 avatar Jan 05 '18 20:01 ataylor32
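In the failing case above, the `On ... wrote:` header is wrapped so that `wrote:` sits on its own line. One possible way to tolerate that, sketched here under the assumption that the header wraps at most once, is a pattern that allows a single line break inside the header. `QUOTE_HEADER`, `visible_reply`, and the sample address are hypothetical names for illustration, not the library's API.

```python
import re

# Hypothetical pattern: a line starting with "On ", optionally wrapped once,
# and ending in "wrote:".
QUOTE_HEADER = re.compile(r"^On [^\n]+\n?[^\n]*wrote:[ \t]*$", re.MULTILINE)


def visible_reply(text: str) -> str:
    """Return the text above the first (possibly wrapped) quote header."""
    match = QUOTE_HEADER.search(text)
    if match is None:
        return text.strip()
    return text[: match.start()].strip()


email_3 = (
    "And this is a response to the test response.\n"
    "\n"
    "On Fri, Jan 5, 2018 at 12:34 PM, Adam Taylor <adam@example.com>\n"
    "wrote:\n"
    "\n"
    "> And this is a test response.\n"
    ">\n"
    "> On Fri, Jan 5, 2018 at 12:34 PM, Adam Taylor <adam@example.com>\n"
    "> wrote:\n"
    ">\n"
    ">> This is just a test.\n"
)
print(visible_reply(email_3))
```

A pattern this loose can also match ordinary prose that starts with "On " and is followed by a line ending in "wrote:", so a real fix would need test cases for both directions.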