
[FALSE ALARM, nevermind] Bug report: single quote in the URL is treated as an end-of-URL.

Open GhbSmwc opened this issue 4 years ago • 1 comments

Page tested: https://www.uchinokomato.me/chara/show/44405

When it tries to obtain this URL:

https://s3-ap-northeast-1.amazonaws.com/uchinoko/chara_images/pictures/000/105/454/original/%E3%83%AA%E3%83%B3%E3%82%AB%E3%81%A1%E3%82%83%E3%82%93%2889'MO%E3%81%95%E3%82%93%E3%81%8B%E3%82%89%E3%81%AE%E9%A0%82%E3%81%8D%E7%89%A9%29.png?1459753344

(Note that the ' symbol is INSIDE the URL and is part of the string.) It extracts this instead:

https://s3-ap-northeast-1.amazonaws.com/uchinoko/chara_images/pictures/000/105/454/original/%E3%83%AA%E3%83%B3%E3%82%AB%E3%81%A1%E3%82%83%E3%82%93%2889

What happened is that the script got confused, treating the single quote (or apostrophe) inside the URL as the end of the string when it isn't. Here is the HTML code that the extractor script is seeing:

<a data-lightbox="gallery" data-title="Uploaded at 2016-4-4 7:01
89'MO様に描いていただきました" href="https://s3-ap-northeast-1.amazonaws.com/uchinoko/chara_images/pictures/000/105/454/original/%E3%83%AA%E3%83%B3%E3%82%AB%E3%81%A1%E3%82%83%E3%82%93%2889'MO%E3%81%95%E3%82%93%E3%81%8B%E3%82%89%E3%81%AE%E9%A0%82%E3%81%8D%E7%89%A9%29.png?1459753344"><img src="https://s3-ap-northeast-1.amazonaws.com/uchinoko/chara_images/pictures/000/105/454/medium/%E3%83%AA%E3%83%B3%E3%82%AB%E3%81%A1%E3%82%83%E3%82%93%2889'MO%E3%81%95%E3%82%93%E3%81%8B%E3%82%89%E3%81%AE%E9%A0%82%E3%81%8D%E7%89%A9%29.png?1459753344" alt="%e3%83%aa%e3%83%b3%e3%82%ab%e3%81%a1%e3%82%83%e3%82%93%2889'mo%e3%81%95%e3%82%93%e3%81%8b%e3%82%89%e3%81%ae%e9%a0%82%e3%81%8d%e7%89%a9%29" style="height: 234.283px;"></a>

Note the URL is wrapped in double quotes.

Single and double quotes can be used together to put strings inside a quoted attribute value, such as in JavaScript:

onchange="Function('Arg1', 'Arg2'); Calculate()"

Also, " cannot be used in a filename (it is a reserved character on Windows).
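To illustrate the quoting point above, here is a minimal sketch using Python's standard `html.parser` module. It is not URLextractor's actual code; the `HrefCollector` class and the shortened `example.com` URL are made up for illustration. It shows that a parser which ends an attribute value only at the matching double quote keeps the single quote as part of the URL.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Hypothetical helper: collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with the quotes already removed.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


# Shortened stand-in for the markup in this report (the URL is an assumption).
snippet = (
    '<a data-lightbox="gallery" data-title="89\'MO" '
    'href="https://example.com/original/%2889\'MO%29.png?1459753344">'
    '<img src="https://example.com/medium/%2889\'MO%29.png?1459753344"></a>'
)

collector = HrefCollector()
collector.feed(snippet)
collector.close()
print(collector.hrefs)
```

If the printed URL still contains the ' character, the truncation is happening somewhere other than in the attribute parsing, for example in a later text-processing step.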

GhbSmwc, Jul 01 '20 20:07

WAIT! FALSE ALARM!

I was using my NP++ macros, which inadvertently remove these single quotes when they shouldn't.

GhbSmwc, Jul 01 '20 20:07