postal icon indicating copy to clipboard operation
postal copied to clipboard

Link converter truncating link

Open iammattmartin opened this issue 7 years ago • 22 comments

We used a link:

http://static.tp-link.com/TD-W9970(EU)_V2_Quick_Installation_Guide_1482286347761d.pdf

but the link converter changed to:

https://link.domain.com/xxxxx/EgoNq30c(EU)_V2_Quick_Installation_Guide_1482286347761d.pdf

Looks like () break that. Any way to ensure that the whole string is changed when there is no space?

iammattmartin avatar May 07 '17 16:05 iammattmartin

We should be able to update the regex used on the link replacer to avoid a) breaking things like this when there's an invalid character and b) support brackets in URLs. I'll take a look tomorrow.

adamcooke avatar May 07 '17 20:05 adamcooke

Just another example, although this was more where the link went than the link itself.

Original: https://xxxxx/xxxx/views.php?action=view&id=12345 Converted: https://xxxxx/xxxx/views.php?action=view&id=12345

Causes the software behind it to error because it is expecting to see the &.

iammattmartin avatar May 08 '17 20:05 iammattmartin

Was that from an HTML email or a plain text one?

adamcooke avatar May 08 '17 20:05 adamcooke

HTML.

iammattmartin avatar May 08 '17 20:05 iammattmartin

I see. The link is likely being presented with & in the HTML and is just being copied into the links table as is. I'll see if I can reproduce.

adamcooke avatar May 08 '17 20:05 adamcooke

I've been looking into this. It's actually not quite as simple as I would have hoped. We need to look at the regular expression used to detect links to avoid the situation. We want to make sure that links like this are terminated properly.

Please see our website (http://somesite.com) for full details.

If we were to allow brackets in the URL the final bracket closure would be included in the link and not included in the email. Postal has actually done exactly what it was supposed to do with that URL (even though it's not really desirable).

Unfortunately, links aren't always followed by a space which is why this occurs.

We'll need to look at this regex in a little more details to resolve this.

adamcooke avatar May 12 '17 14:05 adamcooke

Maybe allowing ( and ) in the URL string but to ensure that ) is always after a ( if included after a http*://*/

It shouldn't then fall foul of the scenario you've mentioned.

It may still miss the odd use case where a URL string only has ( or ) instead of both but that could be few and far between and then wouldn't cause any issues with a lone ) at the end due to the requirement of having a ( before it after a final slash (as it would never be part of a domain name).

iammattmartin avatar May 12 '17 17:05 iammattmartin

Any fix should be tested against the duplicate tickets referenced.

catphish avatar Jun 13 '17 15:06 catphish

I think that "non commun" characteres in url like "(" or ")" or even "]", etc. is not a priority, but a character like the query parameters separator "&" should be a priority to solve this bug because most of the urls in emails will have some kind of utm/tracking parameter...

afanjul avatar Jun 15 '17 11:06 afanjul

This also happens to links with ports.

https://server.com:1234

gets turned in to:

https://link.domain.com/ref345/sasgsf:1234

iammattmartin avatar Jul 26 '17 10:07 iammattmartin

I think I've managed to resolve this issue, see the screenshot for all the testing links mentioned so far being intact in the database.

capture

Are there any additional testable links or should I make a pull request?

willpower232 avatar Aug 05 '17 13:08 willpower232

Thanks! If you've tested this against all links mentioned in these tickets, please go ahead and make a pull request and I'll get this in. Even if it's not perfect (i'm sure someone will find a new way to break it), I'm happy to commit it as an incremental improvement.

catphish avatar Aug 05 '17 14:08 catphish

Yes please! many thanks!

afanjul avatar Aug 05 '17 15:08 afanjul

Yes please too. Thanks for your work on this @willpower232!

iammattmartin avatar Aug 05 '17 15:08 iammattmartin

I've discovered its not a complete solution but theres only one fail for plain text messages when the URL is surrounded by brackets but its a vast improvement if it matches everything else.

willpower232 avatar Aug 05 '17 15:08 willpower232

and no worries! We rely a lot on postal so getting the key issues fixed is important

willpower232 avatar Aug 05 '17 15:08 willpower232

@willpower232 I made those changed manually in our codebase and restarted postal, but the problem still persists. Our "&" are converted into "&" and thus rendered invalid. Any solution to this? Thanks.

alkanna avatar Sep 22 '20 15:09 alkanna

@alkanna can you share an example of a link that isn't working?

willpower232 avatar Sep 27 '20 12:09 willpower232

I get & converted to &. Furthermore: This kind of link ends up stripped near the end, I couldn't figure out where exactly since we do not store the original anywhere (it's a jwt token): https://xxx.xx.com/ticket/approve/eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJhcHByb3Zlcl9pZCI6NCwicmVxdWVzdG9yX2lkIjo3MTExLCJ0aWNrZXRfaWRzIjpbNjg1M10sImxpbWl0IjoiMjAyMDA5MjVUMDU1MjAwIiwiYXBwcm92ZWQiOmZhbHNlfQ.vzjOqRHzSlxe4zzOcbxIDh_nQq-W9_RoGJ_tF-QI

alkanna avatar Sep 29 '20 09:09 alkanna

The example you've posted is very long so its likely getting clipped by the database if your Postal is older than the merging of https://github.com/postalhq/postal/pull/683, you might have to manually apply converting that database field to a text type for the extra character length.

Regarding &, you can see them being converted successfully in my much older screenshot, is it possible they got encoded to & before Postal processed them?

willpower232 avatar Oct 02 '20 15:10 willpower232

Same problem. E-Mails in my Laravel app are not working properly. & is converting to &. Seems there are several open issues around this problem. Is there a known workaround for that?

rocramer avatar Oct 03 '20 20:10 rocramer

We too are experiencing the issue with & being stored in the tracking links table in the database. The issue is that the tools that send them to Postal first encode all characters within links, specifically HTML email. A link of https://test.com/test?test=1&parameter=1 will get html encoded to https://test.xx/testing?test=1&parameter=1 before sending to Postal via SMTP or API, then Postal takes that URL and stores it as such with Track Clicks being enabled. Which then the link is invalid if you were to visit a URL with & in it instead of just &.

nateh777 avatar Feb 08 '23 18:02 nateh777