
Different chapters with same name get replaced on Download

Open Anutrix opened this issue 1 year ago • 12 comments

Steps to reproduce

  1. Add a source with separate chapters that share the same name, e.g. the 'Bonus' chapters of https://bato.to/series/93186/welcome-to-japan-ms-elf-official
  2. Select and download any one of the chapters with the shared name, e.g. 'Bonus'.
  3. Note that every chapter with that name gets marked as downloaded: 5 chapters show the tick mark when only 1 was downloaded.
  4. Open each of them; they all show the same content too.

Expected behavior

Each 'Bonus' chapter should show its own, correct content in the app. Their download states must also be independent.

Actual behavior

The 'Bonus' chapters are all the same in the app. The entries are still separate, i.e. multiple chapters named 'Bonus' appear correctly in the list, but they all lead to the same content. The download state is shared, and deleting one deletes them all. Read state is independent and correct.
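
The symptoms above are consistent with downloads being keyed by chapter name alone. The following is a minimal Python sketch of that failure mode, not Mihon's actual code: `legacy_dir_name`, `is_downloaded`, and the chapter URLs are hypothetical and chosen only for illustration.

```python
# Illustrative only: a name-only download key makes same-named chapters collide.

def legacy_dir_name(chapter_name: str) -> str:
    """Hypothetical stand-in for a folder name derived from the chapter name only."""
    return chapter_name.strip()

# Three chapters; two of them share the name "Bonus" but have different URLs.
chapters = [
    {"url": "/chapter/1001", "name": "Bonus"},
    {"url": "/chapter/1002", "name": "Bonus"},
    {"url": "/chapter/1003", "name": "Chapter 1"},
]

# Simulate downloading only the first "Bonus" chapter.
downloaded_dirs = {legacy_dir_name(chapters[0]["name"])}

def is_downloaded(chapter: dict) -> bool:
    """A chapter counts as downloaded if a folder with its name exists."""
    return legacy_dir_name(chapter["name"]) in downloaded_dirs

# Both "Bonus" chapters report as downloaded although only one was fetched.
status = {c["url"]: is_downloaded(c) for c in chapters}
```

With a name-only key, deleting the single 'Bonus' folder would likewise clear the download state of every chapter that maps to it.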

Crash logs

No response

Mihon version

0.17.0

Android version

Android 14

Device

Asus ROG Phone 7 Ultimate

Other details

No response

Acknowledgements

  • [X] I have searched the existing issues and this is a new ticket, NOT a duplicate or related to another open or closed issue.
  • [X] I have written a short but informative title.
  • [X] I have gone through the FAQ and troubleshooting guide.
  • [X] I have updated the app to version 0.17.0.
  • [X] I have updated all installed extensions.
  • [X] I will fill out all of the requested information in this form.

Anutrix avatar Oct 28 '24 09:10 Anutrix

Reverting, supersedes #371 as a more detailed issue request

Smol-Ame avatar Oct 28 '24 09:10 Smol-Ame

just noticed this bug. i hope it gets fixed. maybe add an option to append a random string to the filename?

gitfirl avatar Jan 17 '25 10:01 gitfirl

Yes, I am interested in contributing a fix for this issue. There are a couple of different approaches, but the one I would suggest is suffixing the filename with a short (say 8-char) hash of the chapter URL, given that chapter names are not unique but chapter URLs are unique and should be fairly stable over time.

Of course, code to handle this would need to check for existing downloads in the old filename format, and migrate them automatically.
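
A rough Python sketch of the idea described above, under stated assumptions: SHA-256 is an arbitrary choice (the comment does not name a hash function), the `_` separator is invented, and `url_hash`, `new_dir_name`, and `find_download` are hypothetical helpers rather than anything in the Mihon codebase. The lookup checks the new-style name first and falls back to the old name-only format so that existing downloads are still found.

```python
import hashlib
from typing import Optional, Set

HASH_LEN = 8  # "say 8-char", as suggested in the comment above

def url_hash(chapter_url: str, length: int = HASH_LEN) -> str:
    """Short hex digest of the chapter URL (SHA-256 is an assumption)."""
    return hashlib.sha256(chapter_url.encode("utf-8")).hexdigest()[:length]

def new_dir_name(chapter_name: str, chapter_url: str) -> str:
    """New-style name: chapter name plus a URL-derived suffix."""
    return f"{chapter_name}_{url_hash(chapter_url)}"

def legacy_dir_name(chapter_name: str) -> str:
    """Old-style name: chapter name only."""
    return chapter_name

def find_download(existing_dirs: Set[str], chapter_name: str,
                  chapter_url: str) -> Optional[str]:
    """Return the folder holding this chapter, preferring the new format."""
    for candidate in (new_dir_name(chapter_name, chapter_url),
                      legacy_dir_name(chapter_name)):
        if candidate in existing_dirs:
            return candidate
    return None

# Two same-named chapters now map to different folder names.
first = new_dir_name("Bonus", "/chapter/1001")
second = new_dir_name("Bonus", "/chapter/1002")
```

Note that the legacy fallback stays ambiguous for same-named chapters: an old 'Bonus' folder matches every 'Bonus' chapter, so a real migration would still have to decide which chapter such a folder belongs to.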

raxod502 avatar Jan 17 '25 21:01 raxod502

given that chapter names are not unique but chapter URLs are unique and should be fairly stable over time.

Not a safe assumption. Many sources rotate URLs often, which already causes other inconveniences like messing with the history and, consequently, reading statistics. Adding orphaned files that take up storage space to that list doesn't seem like a better outcome.

BrutuZ avatar Jan 17 '25 22:01 BrutuZ

Do we have any reasonable stable way of identifying a chapter over time?

raxod502 avatar Jan 17 '25 22:01 raxod502

If there were, the other problems I mentioned wouldn't still be around. You get a fresh chapter list on every refresh. Each chapter has a Title/Name, a Number (either parsed or indexed) and a URL, and the source may change any of those at the drop of a hat. It's the nature of the game.

BrutuZ avatar Jan 17 '25 22:01 BrutuZ

The URL is supposed to be the unique identifier of a chapter (within a manga). As we've established, that's not the case for some very particular sources, but given that we already have special data-fixing logic for those sources, it should be fairly easy to rename downloads for them too.

AntsyLich avatar Jan 18 '25 06:01 AntsyLich

Seems pretty reasonable to me to assume that the URL "should" be stable, and to make it part of the API contract for a manga source that, if its URLs are not stable, it "should" provide some implementation of an interface for correlating old chapters with new ones from a more recent update. We can then apply that to the various parts of the database that get messed up when this happens, including download tracking.

Perhaps to start with I can make the new-style download naming be possible to enable or disable in the settings, so that in case of unexpected behavior, nobody is blocked from their favorite use cases.
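
One way to picture the correlation idea in this comment is a rename plan computed whenever a source reports that chapter URLs have changed. The Python sketch below is hypothetical throughout: the `url_changes` mapping stands in for whatever a source-provided correlation interface would return, and the SHA-256 suffix scheme is an assumption carried over from the hash-suffix proposal.

```python
import hashlib
from typing import Dict, Set

def url_hash(chapter_url: str, length: int = 8) -> str:
    """Short hex digest of the chapter URL (SHA-256 is an assumption)."""
    return hashlib.sha256(chapter_url.encode("utf-8")).hexdigest()[:length]

def dir_name(chapter_name: str, chapter_url: str) -> str:
    """Hypothetical new-style folder name: chapter name plus URL hash."""
    return f"{chapter_name}_{url_hash(chapter_url)}"

def plan_renames(url_changes: Dict[str, str],
                 names_by_old_url: Dict[str, str],
                 existing_dirs: Set[str]) -> Dict[str, str]:
    """Map old folder -> new folder for downloaded chapters whose URL rotated.

    url_changes maps old chapter URL -> new chapter URL, as a source-side
    correlation step might provide. Chapters without a download are skipped.
    """
    renames = {}
    for old_url, new_url in url_changes.items():
        name = names_by_old_url[old_url]
        old_dir = dir_name(name, old_url)
        if old_dir in existing_dirs:
            renames[old_dir] = dir_name(name, new_url)
    return renames

# Only the first "Bonus" chapter was downloaded before the URLs rotated.
names = {"/c/1": "Bonus", "/c/2": "Bonus"}
existing = {dir_name("Bonus", "/c/1")}
plan = plan_renames({"/c/1": "/c/9", "/c/2": "/c/8"}, names, existing)
```

Applying such a plan would keep a download attached to its chapter after a rotation instead of leaving an orphaned folder behind.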

raxod502 avatar Jan 18 '25 19:01 raxod502

Valid chapter names are fetched from here:
https://github.com/mihonapp/mihon/blob/c283abefb03f79ce6652492db71cde410f828f78/app/src/main/java/eu/kanade/tachiyomi/data/download/DownloadProvider.kt#L161

The active chapter naming scheme is taken from:
https://github.com/mihonapp/mihon/blob/c283abefb03f79ce6652492db71cde410f828f78/app/src/main/java/eu/kanade/tachiyomi/data/download/DownloadProvider.kt#L129

And after this point we do data fixing for the faulty sources:
https://github.com/mihonapp/mihon/blob/c283abefb03f79ce6652492db71cde410f828f78/app/src/main/java/eu/kanade/domain/chapter/interactor/SyncChaptersWithSource.kt#L169

Also use this function to add the unique identifier as suffix:
https://github.com/mihonapp/mihon/blob/c283abefb03f79ce6652492db71cde410f828f78/app/src/main/java/eu/kanade/tachiyomi/data/download/DownloadManager.kt#L338

AntsyLich avatar Jan 18 '25 19:01 AntsyLich

Unrelated, but if you're going to do it for the chapter folder/file name, might as well do it for the manga folder name?

AntsyLich avatar Jan 18 '25 19:01 AntsyLich

Still an issue on v18. Couldn’t we hash the files to identify them? Or maybe that wouldn’t really work, because we would have to download the files to compare the hashes, which defeats the purpose. Unless everybody also starts including hashes in their sources, which we probably should.

scythe000 avatar Apr 03 '25 15:04 scythe000

I have some work in progress for this, but I cannot provide a timeline for when it'll be ready. My approach is to include in the download filename a truncated hash of the chapter URL. This is a relatively simple but generally good-enough modification to resolve the problem without requiring any large-scale changes to the data model.

Hashing the files themselves is not an option, as third-party websites do not publish hashes of arbitrary media files in the same way that, for example, Linux package archives do.

raxod502 avatar Apr 03 '25 20:04 raxod502

I have some work in progress for this, but I cannot provide a timeline for when it'll be ready. My approach is to include in the download filename a truncated hash of the chapter URL. This is a relatively simple but generally good-enough modification to resolve the problem without requiring any large-scale changes to the data model.

Any progress with this? Will this break backwards compatibility or automatically rename old files to new format?

fatotak avatar Jun 15 '25 06:06 fatotak

It's on my list to continue working on it. Any delivered implementation would be backwards compatible, of course.

raxod502 avatar Jun 15 '25 23:06 raxod502

Partial fix PR for this: https://github.com/mihonapp/mihon/pull/2206

fatotak avatar Jun 16 '25 14:06 fatotak