
zimdump dump misses HTML files when a file name conflicts with a directory name

Open nickhuang99 opened this issue 8 months ago • 2 comments

Let's use the real Wikipedia page for "C++" as an example. https://en.wikipedia.org/wiki/C++ is an HTML page, and it has subpages under the directory "C++", for example https://en.wikipedia.org/wiki/C++/CLI

This situation cannot be represented in the dumped static HTML files, because "C++" cannot be an HTML file and a directory at the same time. How zimdump should generate a redirect is not so simple, especially when "C++" is URL-escaped as "C%2B%2B": the redirect URL would then have to be URL-safe encoded, and zimdump would have to check whether the filesystem directory "C++" actually exists.

For a real test case, please download the Kiwix "computer" ZIM file from: https://download.kiwix.org/zim/wikipedia/wikipedia_en_computer_maxi_2024-05.zim

When you run zimdump on it, you will see that the "C++" HTML page is missing, because "C++" is a directory in the filesystem that holds the "CLI" page.
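To make the conflict concrete, here is a minimal sketch of how such entries could be detected from a list of entry paths. It is not zimdump's code: the function name `find_file_dir_conflicts` and the hand-written entry list are hypothetical, and it only assumes that entry paths use "/" as a separator.

```python
from urllib.parse import quote


def find_file_dir_conflicts(entry_paths):
    """Return entry paths that are also the parent directory of another entry.

    Hypothetical helper for illustration only; not part of zim-tools.
    """
    paths = set(entry_paths)
    dirs = set()
    for p in paths:
        parts = p.split("/")
        # Every proper prefix of a path has to exist as a directory on disk.
        for i in range(1, len(parts)):
            dirs.add("/".join(parts[:i]))
    # A conflict is a path that must be both a file and a directory.
    return sorted(paths & dirs)


# Hand-written sample paths mirroring the Wikipedia example above.
entries = ["C++", "C++/CLI", "Python"]
conflicts = find_file_dir_conflicts(entries)
print(conflicts)                               # ['C++']
print([quote(p, safe="") for p in conflicts])  # ['C%2B%2B']
```

The second line of output shows the percent-encoded form that a link or redirect to the conflicting page would need, which is what makes matching it against the on-disk directory name "C++" awkward.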

nickhuang99, Jun 14 '24 23:06