International characters breaking links

Open thsm-kb opened this issue 9 months ago • 0 comments

Browsertrix Cloud Version

v1.8.0-b6f8c96

What did you expect to happen? What happened instead?

Harvesting http://midtfjordradio.dk/ produces odd results for a link containing the Danish character ø:

http://midtfjordradio.dk/St%C3%B8t.html

The finished crawl is here: https://beta.browsertrix.cloud/orgs/netarkivet-det-kgl-bibliotek/items/crawl/manual-20231115100719-880a54cf-658#replay

0d2ba4ed-f2af-4d06-970d-454bafe9c148

Step-by-step reproduction instructions

Crawl http://midtfjordradio.dk/ and check specifically the link http://midtfjordradio.dk/St%C3%B8t.html
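
For anyone reproducing this, it may help to confirm what the link should decode to. The `%C3%B8` in the path is the UTF-8 percent-encoding of ø. The sketch below uses only the Python standard library and is not Browsertrix code; the variable names are illustrative, and the Latin-1 comparison is only an assumption about one way such links can get mangled, not a diagnosis of this crawl.

```python
from urllib.parse import quote, unquote, urlsplit

# The link from this report.
url = "http://midtfjordradio.dk/St%C3%B8t.html"

# Extract the percent-encoded path component.
path = urlsplit(url).path

# unquote() decodes percent-escapes as UTF-8 by default.
decoded = unquote(path)

# quote() re-encodes as UTF-8 by default and leaves "/" untouched,
# so a correct round trip reproduces the original path.
reencoded = quote(decoded)

# For comparison only: the same character percent-encoded as Latin-1.
# A mismatch like this is one hypothetical source of broken links.
latin1 = quote(decoded, encoding="latin-1")

print(decoded)
print(reencoded)
print(latin1)
```

If the URL recorded in the replay differs from `reencoded`, that points to the link being re-encoded or decoded with the wrong charset somewhere along the way.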

Additional details

No response

thsm-kb, Nov 16 '23 10:11