remaining bad links in help

Open bylsmad opened this issue 2 years ago • 2 comments

As these are fixed, remove the corresponding entries from KNOWN_ISSUES in cli/_tests/test_scrapy_crawl.

INFO moin.cli._tests.test_scrapy_crawl:test_scrapy_crawl.py:84 expected {200} got 404 for url="https://www.picklepartysalon.com:8081/users/Home", from_url="https://www.picklepartysalon.com:8081/devwiki/help-en/html", from_text='/users/Home', from_type='href', response_code=404, response_exc="<twisted.python.failure.Failure scrapy.spidermiddlewares.httperror.HttpError: Ignoring non-200 response>"

INFO moin.cli._tests.test_scrapy_crawl:test_scrapy_crawl.py:92 known issue url="https://www.picklepartysalon.com:8081/+get/help-common/logo.png", from_url="https://www.picklepartysalon.com:8081/devwiki/help-en/markdown", from_text='', from_type='src', response_code=404, response_exc="<twisted.python.failure.Failure scrapy.spidermiddlewares.httperror.HttpError: Ignoring non-200 response>"

INFO moin.cli._tests.test_scrapy_crawl:test_scrapy_crawl.py:92 known issue url="http://localhost:8080/+serve/ckeditor/plugins/smiley/images/shades_smile.gif", from_url="https://www.picklepartysalon.com:8081/devwiki/help-en/html", from_text='', from_type='src', response_code=, response_exc="<twisted.python.failure.Failure twisted.internet.error.ConnectionRefusedError: Connection was refused by other side: 111: Connection refused.>"
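For readers unfamiliar with the test, the log lines above distinguish an unexpected failure ("expected {200} got 404") from a "known issue". A minimal sketch of that kind of classification follows. It assumes a hypothetical `KnownIssue` record that matches on URL suffixes; the `KnownIssue` class, its fields, and the `classify` function are illustrative names only, and the real matching logic in moin's test_scrapy_crawl may work differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class KnownIssue:
    """Hypothetical stand-in for one entry in a known-issues list."""

    url_suffix: str                         # tail of the failing URL
    from_url_suffix: Optional[str] = None   # tail of the referring page, if relevant

    def matches(self, url: str, from_url: str) -> bool:
        if not url.endswith(self.url_suffix):
            return False
        return self.from_url_suffix is None or from_url.endswith(self.from_url_suffix)


# Illustrative list mirroring the two "known issue" log lines above.
KNOWN_ISSUES = [
    KnownIssue("/+get/help-common/logo.png", "/markdown"),
    KnownIssue("/+serve/ckeditor/plugins/smiley/images/shades_smile.gif", "/html"),
]


def classify(url, from_url, response_code, expected=frozenset({200})):
    """Return 'ok', 'known issue', or 'unexpected' for one crawl result."""
    if response_code in expected:
        return "ok"
    if any(issue.matches(url, from_url) for issue in KNOWN_ISSUES):
        return "known issue"
    return "unexpected"
```

Under this sketch, fixing a bad link and then deleting its entry from the list means any regression shows up as "unexpected" instead of being silently tolerated.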

bylsmad, Apr 21 '23 18:04

INFO moin.cli._tests.test_scrapy_crawl:test_scrapy_crawl.py:84 expected {200} got 404 for url="https://www.picklepartysalon.com:8081/users/Home", from_url="https://www.picklepartysalon.com:8081/devwiki/help-en/html", from_text='/users/Home', from_type='href', response_code=404, response_exc="<twisted.python.failure.Failure scrapy.spidermiddlewares.httperror.HttpError: Ignoring non-200 response>"

users/Home is just an example link

INFO moin.cli._tests.test_scrapy_crawl:test_scrapy_crawl.py:92 known issue url="https://www.picklepartysalon.com:8081/+get/help-common/logo.png", from_url="https://www.picklepartysalon.com:8081/devwiki/help-en/markdown", from_text='', from_type='src', response_code=404, response_exc="<twisted.python.failure.Failure scrapy.spidermiddlewares.httperror.HttpError: Ignoring non-200 response>"

This URL works for me. [image]

INFO moin.cli._tests.test_scrapy_crawl:test_scrapy_crawl.py:92 known issue url="http://localhost:8080/+serve/ckeditor/plugins/smiley/images/shades_smile.gif", from_url="https://www.picklepartysalon.com:8081/devwiki/help-en/html", from_text='', from_type='src', response_code=, response_exc="<twisted.python.failure.Failure twisted.internet.error.ConnectionRefusedError: Connection was refused by other side: 111: Connection refused.>"

I can confirm this URL does not work

wagner-intevation, Sep 25 '23 20:09

@wagner-intevation, responses below. I believe these are all issues worth fixing:

I've updated https://www.picklepartysalon.com:8081/devwiki/ to current master and reimported the help pages.

CrawlResultMatch(url=Iri(scheme=settings.SITE_SCHEME, authority=settings.SITE_HOST, path='/users/Home'),
                         from_url='/html')  # only with wiki_root

This one is confirmed fixed; go ahead and remove it from KNOWN_ISSUES.

CrawlResultMatch(url=Iri(scheme=settings.SITE_SCHEME, authority=settings.SITE_HOST,
                                 path='/+get/help-common/logo.png'),
                         from_url='/markdown'),  # only with wiki_root

This one is still an issue: on https://www.picklepartysalon.com:8081/devwiki/help-en/markdown the broken logo image appears below the text "colors depend upon configuration settings."

CrawlResultMatch(url='http://localhost:8080/+serve/ckeditor/plugins/smiley/images/shades_smile.gif',
                         from_url='/html')

Confirmed, as you said: this is still an issue in current master. I don't have a good idea for fixing it, since CKEditor creates absolute links. Maybe some sort of preprocessing when a user submits an edit for an HTML page?
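To make the preprocessing idea concrete, here is a rough sketch of what such a step could look like. It is only an illustration under stated assumptions: `make_local_links_relative` and `LOCAL_AUTHORITIES` are hypothetical names, not existing moin code, and the regex-based attribute matching is a simplification (a real implementation would more likely use an HTML parser and take the wiki's own host from configuration).

```python
import re
from urllib.parse import urlsplit

# Hypothetical: host:port values that should be treated as "this wiki".
LOCAL_AUTHORITIES = frozenset({"localhost:8080", "127.0.0.1:8080"})

# Matches href="..." or src="..." with single or double quotes (simplified).
_ATTR_RE = re.compile(
    r"""(?P<attr>\b(?:href|src))=(?P<quote>["'])(?P<url>.*?)(?P=quote)""",
    re.IGNORECASE,
)


def make_local_links_relative(html, local_authorities=LOCAL_AUTHORITIES):
    """Rewrite absolute href/src URLs pointing at this wiki to site-relative paths."""

    def _rewrite(match):
        parts = urlsplit(match.group("url"))
        is_local = (
            parts.scheme in ("http", "https")
            and parts.netloc.lower() in local_authorities
        )
        if not is_local:
            return match.group(0)  # leave external and already-relative links alone
        relative = parts.path or "/"
        if parts.query:
            relative += "?" + parts.query
        if parts.fragment:
            relative += "#" + parts.fragment
        quote = match.group("quote")
        return f"{match.group('attr')}={quote}{relative}{quote}"

    return _ATTR_RE.sub(_rewrite, html)
```

Applied to the smiley image from the log, `http://localhost:8080/+serve/ckeditor/plugins/smiley/images/shades_smile.gif` would be stored as `/+serve/ckeditor/plugins/smiley/images/shades_smile.gif`, while links to other hosts stay untouched.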

bylsmad, Sep 26 '23 14:09