pylinkvalidator
Fetch partial links
Is there any way to check the resources specified as relative links in the page?
Thanks!
Pylinkvalidator should follow relative links. Do you have an example where it does not work?
Actually, you are right: relative links are followed. The links I found that are not followed are anchor links (href="#whatever").
- Example: https://www.strava.com/ has anchor links such as href="#promo-2" that don't seem to be crawled.
Hi,
<a data-action='jump-section' href='#promo-2'> is a local link so Pylinkvalidator does not try to find it in the page.
There could be an additional validation step that looks for a DOM element with a matching name or id within the page, though. I'm not sure when I can work on this.
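
For illustration only, here is a minimal sketch of what such a check could look like. It is not part of pylinkvalidator: `AnchorCollector` and `find_broken_local_links` are hypothetical names, and the sketch assumes the raw HTML of the page is already available as a string. It uses only Python's standard `html.parser`, so it cannot see anchors that are created by JavaScript. It also assumes that an `id` on any element, or a `name` on an `a` element, counts as a valid target.

```python
from html.parser import HTMLParser
from urllib.parse import unquote


class AnchorCollector(HTMLParser):
    """Collect fragment-only hrefs and the id/name values they could point to."""

    def __init__(self):
        super().__init__()
        self.targets = set()   # id values, plus name values found on <a> elements
        self.fragments = []    # fragments taken from href="#..." links, in page order

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # Any element with an id can be the target of a local link.
        if attr_map.get("id"):
            self.targets.add(attr_map["id"])
        if tag == "a":
            # Assumption: only <a name="..."> is treated as a named anchor.
            if attr_map.get("name"):
                self.targets.add(attr_map["name"])
            href = attr_map.get("href")
            # Keep only links of the form "#something"; a bare "#" is skipped.
            if href and href.startswith("#") and len(href) > 1:
                self.fragments.append(unquote(href[1:]))


def find_broken_local_links(html):
    """Return the fragments of local links that have no matching target in html."""
    collector = AnchorCollector()
    collector.feed(html)
    collector.close()
    return [f for f in collector.fragments if f not in collector.targets]
```

Called on a page containing `<a data-action='jump-section' href='#promo-2'>` and no element with `id="promo-2"`, the function would return a list containing `promo-2`, which a crawler could then report alongside its other broken links.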
Hi again,
Thanks for your answer. I've also found anchor links being used to open content in an iframe.
For example: https://store.nest.com/product/smoke-co-alarm/ contains href="#meet-the-nest-protect". Any suggestions on how to handle those?
Thanks!
Hi,
I don't see #meet-the-nest-protect in the page you referred to, but do you mean that clicking on this link would load some content in an iframe? If that's the case, it is likely using JavaScript, in which case pylinkvalidator cannot help.
Just to clarify, if there is a local link such as href="#promo-2", you would want pylinkvalidator to report whether the element exists on the page or not?
Yes, it's using JS to load it. I think we can close this issue (or mark it as a nice to have feature) for reporting local (anchor) links.
Thanks for your help and for building the tool!
@bartdag, if you are interested, I can work on a PR to provide this feature.
I think that Selenium (rendering JS) could help implement this feature.
What do you think?