Error following redirect for url
Hi!
I get
❯ monolith -v -f -a -I -j -o bertil_persson.html https://www.kt-kuriren.se/2023/10/24/bertil-asa-persson-fran-degerfors-har-avlidit-han-blev-88-ar-346a2/
https://www.kt-kuriren.se/2023/10/24/bertil-asa-persson-fran-degerfors-har-avlidit-han-blev-88-ar-346a2/ (error following redirect for url (https://www.kt-kuriren.se/2023/10/24/bertil-asa-persson-fran-degerfors-har-avlidit-han-blev-88-ar-346a2/): too many redirects)
Could not retrieve target document
with monolith 2.8.1
Can I fix it with appropriate options, is it a bug, or is it just not possible (with monolith alone)?
Hello. The URL may be protected against non-browser access by something like Cloudflare. You could try rendering the page in a headless browser instance and piping the resulting HTML into monolith (as described in the README section on JS-heavy websites).
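For reference, the headless-browser approach could look roughly like the sketch below. It assumes a Chromium binary named `chromium` is on the PATH (on some systems it is `chromium-browser` or `google-chrome`); the `save_rendered` helper and the `CHROMIUM_BIN`/`MONOLITH_BIN` variables are made up for illustration and are not part of monolith. There is no guarantee the site will serve a headless browser either.

```shell
#!/bin/sh
# Hypothetical helper: render a page in headless Chromium, then let monolith
# bundle the dumped DOM. Binary names are assumptions; override as needed.
CHROMIUM_BIN="${CHROMIUM_BIN:-chromium}"
MONOLITH_BIN="${MONOLITH_BIN:-monolith}"

save_rendered() {
    url="$1"   # page to archive
    out="$2"   # output HTML file

    # --dump-dom prints the rendered DOM to stdout; monolith reads it from
    # stdin ("-"), and -b supplies the base URL for resolving relative assets.
    # -I isolates the saved document from the network.
    "$CHROMIUM_BIN" --headless --dump-dom "$url" \
        | "$MONOLITH_BIN" - -I -b "$url" -o "$out"
}

# Example call (commented out so that sourcing this file has no side effects):
# save_rendered 'https://www.kt-kuriren.se/2023/10/24/bertil-asa-persson-fran-degerfors-har-avlidit-han-blev-88-ar-346a2/' bertil_persson.html
```

Because monolith only sees the HTML that Chromium dumped, it still fetches the referenced assets itself, so a CDN that blocks non-browser clients could still cause missing resources.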
Could be a CDN protection thing. I just tried with monolith 2.8.1 and got the page.
Got the page, along with a bunch of CDN URLs:
monolith -v -f -a -I -j -o bertil_persson.html https://www.kt-kuriren.se/2023/10/24/bertil-asa-persson-fran-degerfors-har-avlidit-han-blev-88-ar-346a2/
https://www.kt-kuriren.se/2023/10/24/bertil-asa-persson-fran-degerfors-har-avlidit-han-blev-88-ar-346a2/
https://cdn.production.nwtmedia.se/public/favicons/ktkuriren/android-chrome-192x192.png
https://cdn.production.nwtmedia.se/public/favicons/ktkuriren/favicon-32x32.png
https://cdn.production.nwtmedia.se/_next/static/css/389dd3cca0a1209e.css
https://cdn.production.nwtmedia.se/_next/static/media/SourceSansPro-Light.a212613e.ttf
https://cdn.production.nwtmedia.se/_next/static/media/SourceSansPro-LightItalic.556bd7db.ttf
https://cdn.production.nwtmedia.se/_next/static/media/SourceSansPro-Regular.abb65e07.ttf
https://cdn.production.nwtmedia.se/_next/static/media/SourceSansPro-Italic.860dcbd2.ttf
https://cdn.production.nwtmedia.se/_next/static/media/SourceSansPro-SemiBold.64fe82ba.ttf
https://cdn.production.nwtmedia.se/_next/static/media/SourceSansPro-SemiBoldItalic.9cce4c1e.ttf
https://cdn.production.nwtmedia.se/_next/static/media/SourceSansPro-Bold.997a01f6.ttf
https://cdn.production.nwtmedia.se/_next/static/media/SourceSerifPro-Regular.8cb541b2.ttf
https://cdn.production.nwtmedia.se/_next/static/media/SourceSerifPro-Italic.052df52b.ttf
https://cdn.production.nwtmedia.se/_next/static/media/SourceSerifPro-Bold.9c5d4ba3.ttf
https://cdn.production.nwtmedia.se/_next/static/media/SourceSerifPro-BoldItalic.d5e04878.ttf
https://cdn.production.nwtmedia.se/_next/static/css/775f6929e92742f2.css
https://cdn.production.nwtmedia.se/_next/static/media/lg.955a4bcf.woff2
https://cdn.production.nwtmedia.se/_next/static/media/lg.dc565ab5.ttf
https://cdn.production.nwtmedia.se/_next/static/media/lg.c950f0b5.woff
https://cdn.production.nwtmedia.se/_next/static/media/lg.a5ca0178.svg#lg
https://cdn.production.nwtmedia.se/_next/static/media/loading.49ca460c.gif
https://imengine.public.nwt.infomaker.io/image.php?type=preview&uuid=75bebe20-57f4-5e4c-9705-f5a6c2f12298&function=cover&width=128&height=128&format=png&q=60
https://imengine.public.nwt.infomaker.io/image.php?type=preview&uuid=75bebe20-57f4-5e4c-9705-f5a6c2f12298&function=cover&width=64&height=64&format=png&q=60
https://imengine.public.nwt.infomaker.io/image.php?type=preview&uuid=75bebe20-57f4-5e4c-9705-f5a6c2f12298&function=cover&width=128&height=128&format=png&q=60 (from cache)