Missing dependencies in list plus ERROR due to unsanitized special characters
Hi,
You're missing a dependency in your list that is not available by default. Why not include a requirements.txt file instead, as is the Python convention?
lxml
Also, you need to sanitize the URLs in order to avoid errors with international and special characters. It's really easy:
sanitized_string = htmlentities(unsanitized_string)
You should just append the sanitized URL to the queue, I imagine.
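Note that `htmlentities` is a PHP function and is not available in Python, so the line above is more of a pseudocode hint. Here is a minimal sketch of the same idea in Python, assuming the goal is to percent-encode international and special characters in the path, query, and fragment before the URL goes into the queue. `sanitize_url` is a hypothetical helper name, not something from this project, and it leaves the host part untouched:

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters that are legal in a URL and should be left alone.
# "%" is kept safe so already-encoded URLs are not encoded twice.
_SAFE = "/?%:@!$&'()*+,;=~"


def sanitize_url(url):
    """Percent-encode non-ASCII and unsafe characters (hypothetical helper)."""
    parts = urlsplit(url)
    path = quote(parts.path, safe=_SAFE)
    query = quote(parts.query, safe=_SAFE)
    fragment = quote(parts.fragment, safe=_SAFE)
    # The scheme and host are passed through unchanged in this sketch.
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))


if __name__ == "__main__":
    print(sanitize_url("http://example.com/caf\u00e9 menu?q=na\u00efve"))
```

The result of `sanitize_url(...)` is what I would append to the queue. Internationalized host names would need separate handling, which this sketch does not attempt.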
Great utility - thanks
I have seen the exact same thing with lxml and UTF-8 web pages. For example: http://www.themadhowes.org.uk/kpop/subtitles.html
@cooperdk, I'm not familiar with Python standards. Would you mind creating a PR with those changes?