python-scraperlib

Add support for alias (in addition to redirection)

Open benoit74 opened this issue 1 year ago • 4 comments

Just like we have add_redirect in zim/creator and add_redirects_to_zim in zim/filesystem, we should now add support for the "new" ZIM alias, with add_alias in zim/creator and add_aliases_to_zim in zim/filesystem.

benoit74, Mar 21 '24 20:03

I'd like us to implement this the same way zimwriterfs does (I don't think it's there yet) so this feature remains a close-to-drop-in alternative (we should implement a drop-in zimwriterfs script at some point). If we are to implement this first, maybe coordinate with @kelson42 so the input can be the same.

rgaudin, Mar 22 '24 07:03

@kelson42 three questions:

  • do you have any plans to add support for aliases in zimwriterfs?
  • do you intend to use a different format than the one used for redirects? (libzim API seems identical)
  • shouldn't we add support for the isFront hint in the TSV file for both redirect and alias? (covering all hints is pretty difficult but this one is pretty important in most scenarios)
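
To make the third question concrete, here is a minimal sketch of what parsing such a TSV could look like. It assumes a tab-separated layout of path, title, target for the existing columns, and the optional fourth isFront column (with `true`/`false` values) is purely hypothetical; neither zimwriterfs nor scraperlib is confirmed to accept it. `LinkEntry` and `parse_links_tsv` are illustrative names, not existing API.

```python
import csv
import io
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LinkEntry:
    """One redirect or alias line (hypothetical structure)."""

    path: str
    title: str
    target: str
    is_front: Optional[bool]  # None means the line carried no hint


def parse_links_tsv(text: str) -> List[LinkEntry]:
    """Parse TSV lines of 3 columns, or 4 with a hypothetical isFront flag."""
    entries = []
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row:  # blank lines are skipped
            continue
        if len(row) not in (3, 4):
            raise ValueError(f"expected 3 or 4 columns, got {len(row)}: {row!r}")
        path, title, target = row[:3]
        is_front = None
        if len(row) == 4:
            flag = row[3].strip().lower()
            if flag not in ("true", "false"):
                raise ValueError(f"invalid isFront value: {row[3]!r}")
            is_front = flag == "true"
        entries.append(LinkEntry(path, title, target, is_front))
    return entries
```

Keeping the flag optional means existing three-column files would still parse unchanged, and the same parser could serve both redirects and aliases if the formats end up identical.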

benoit74, Mar 25 '24 09:03

  • do you have any plans to add support for aliases in zimwriterfs?

No, not even a ticket! :(

  • do you intend to use a different format than the one used for redirects? (libzim API seems identical)

I guess you refer to the --redirects option, but this is actually not the only way to create redirects in a ZIM. If you have an HTML file with an HTML redirect, then it will create a redirect as well.

I believe that checking the files (symlink, hardlinks, same files) would be the first approach to create aliases.
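
As a rough illustration of that file-based approach, the sketch below walks a directory and groups files that are the same on disk (symlinks, hardlinks) or have identical content. It is not how zimwriterfs works today; `find_alias_candidates` is a hypothetical helper, and picking the first path in sorted order as the "canonical" entry is an arbitrary simplification.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_alias_candidates(root: Path) -> dict[str, list[str]]:
    """Map one canonical relative path to the other paths with the same content."""
    digest_by_inode = {}  # avoids re-hashing symlinks and hardlinks to the same file
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if not path.is_file():  # skips directories and broken symlinks
            continue
        st = path.stat()  # follows symlinks, so links share the target's inode
        inode_key = (st.st_dev, st.st_ino)
        if inode_key not in digest_by_inode:
            digest_by_inode[inode_key] = hashlib.sha256(path.read_bytes()).hexdigest()
        groups[digest_by_inode[inode_key]].append(path.relative_to(root).as_posix())
    # Only groups with more than one path yield alias candidates.
    return {paths[0]: paths[1:] for paths in groups.values() if len(paths) > 1}
```

Each key would then be added as a regular entry and each value as an alias pointing at it. Reading whole files into memory is fine for a sketch but would need chunked hashing for large media files.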

If there is a clear need to support a similar option like --redirects for aliases, then I guess we will have to implement it.

  • shouldn't we add support for the isFront hint in the TSV file for both redirect and alias? (covering all hints is pretty difficult but this one is pretty important in most scenarios)

AFAIK (hope I'm right here) everything is Front for zimwriterfs... so I'm not sure about the need, but here again, if there is a need, we will have to.

kelson42, Mar 25 '24 10:03

OK, then let's remove the milestone for now until the needs become clearer. I don't even remember when I ran into this need. Probably around the YouTube or TED scrapers ... not sure.

benoit74, Jun 11 '24 11:06