Kavita
Edge Cases for filename parser, detects a book as both Volume and Chapter
Documenting edge cases for the filename parser. I know @tjarls is working on parser improvements, so maybe these give some ideas, and/or he can include them in his testing so there is no regression. This is from my live library on the current stable version, 0.5.6.0.
These show up correctly as volumes only, with no chapters, so they are parsed correctly.
2.5 Dimensional Seduction v01 (2022) (Digital) (1r0n).cbz
7 Billion Needles v01 (2010) (digital) (XRA9).cbr
7thGARDEN v01 (2016) (Digital) (danke).cbz
10 Dance v01 (2019) (Digital) (danke-Empire).cbz
12 Beast v01 (2015) (Digital) (danke-Empire).cbz
86--EIGHTY-SIX v01 (2020) (Digital) (LuCaZ).cbz
1122 - For a Happy Marriage v01 (2019) (Digital) (Shizu).cbz
Alice 19th v01 (2003) (Digital) (ripper).cbz
Anti-Magic Academy - The 35th Test Platoon v01 (2017) (Digital) (danke-Empire).cbz
Angels of Death Episode.0 v01 (2019) (Digital) (LuCaZ).cbz
Chobits - 20th Anniversary Edition v01 (2020) (Digital) (danke-Empire).cbz
I Was Reincarnated as the 7th Prince so I Can Take My Time Perfecting My Magical Ability v01 (2021) (Digital) (danke-Empire).cbz
Persona Q - Shadow of the Labyrinth - Side P3 v01 (2016) (digital) (aKraa).cbz
Persona Q - Shadow of the Labyrinth - Side P4 v01 (2016) (digital) (aKraa).cbz
Shows up as volume AND chapter (chapter is always the number in the series title).
ACCA - 13-Territory Inspection Department v01 (2017) (Digital) (Uasaha).cbz
ACCA - 13-Territory Inspection Department P.S. v01 (2020) (Digital) (LuCaZ).cbz
Brave 10 v01 (2013) (Digital) (Lovag-Empire).cbz
Black Panther and Sweet 16 v01 (2017) (Digital) (danke-Empire).cbz
Boys Over Flowers Season 2 v01 (2017) (Digital) (ripper).cbz
Chillin' in Another World with Level 2 Super Cheat Powers v01 (2021) (Digital) (danke-Empire + nao).cbz
Cyborg 009 v01 (2003) (Digital) (BlurPixel-Empire).cbr
Dementia 21 v01 (2018) (digital) (tunafan).cbr
Golgo 13 v01 - Supergun (2006) (Digital) (danke-Empire).cbz
GTO - 14 Days in Shonan v01 (2022) (Digital) (danke-Empire).cbz
Level 1 Demon Lord & One Room Hero v01 (2021) (Digital) (danke-Empire).cbz
My Unique Skill Makes Me OP Even at Level 1 v01 (2021) (Digital) (danke-Empire) (F).cbz
Otherworldly Munchkin - Let's Speedrun the Dungeon with Only 1 HP! v01 (2020) (Digital) (danke-Empire) (F).cbz
Rosario+Vampire Season 2 v01 (2010) (Digital) (LostNerevarine-Empire).cbz
Samurai 8 v01 (2020) (Digital) (danke-Empire).cbz
Samurai 8 - The Tale of Hachimaru v01 (2020) (Digital) (danke-Empire).cbz
Saving 80,000 Gold in Another World for my Retirement v01 (2019) (Digital) (danke-Empire).cbz
Sekirei v19 - 365 Days Without Her (2018) (Digital) (RedRain).cbz
Sex Ed 120% v01 (2021) (Digital) (danke-Empire).cbz
Higurashi When They Cry - Arc 1 - Abducted by Demons Arc v01 (2008) (Digital) (LuCaZ).cbz
Higurashi When They Cry - Arc 9 - Dice Killing Arc (2014) (Digital) (LuCaZ).cbz
Umineko When They Cry - Episode 1 - Legend of the Golden Witch v01 (2-in-1 Edition) (2012) (Digital SD) (Ushi) (Spreads Joined).cbz
Umineko When They Cry - Episode 8 - Twilight of the Golden Witch v01 (3-in-1 Edition) (2019) (Digital SD) (Ushi) (Spreads Joined).cbz
JoJo's Bizarre Adventure - Part 1 - Phantom Blood v01 (2014) (Digital) (BlackManta-Empire).cbz
JoJo's Bizarre Adventure - Part 5 - Golden Wind v01 (2021) (Digital) (1r0n) (f2).cbz
Ascendance of a Bookworm - Part 01 v01 (2021) (Digital) (Ushi).cbz
Ascendance of a Bookworm - Part 02 v01 (2021) (Digital) (Ushi).cbz
Rose Guns Days - Season 1 v01 (2015) (Digital) (LuCaZ).cbz
Rose Guns Days - Season 2 v01 (2016) (Digital) (LuCaZ).cbz
Showa - A History of Japan v01 - 1926-1939 (2013) (Digital) (XRA-Empire).cbz
Showa - A History of Japan v02 - 1939-1944 (2014) (Digital) (XRA-Empire).cbz
Showa - A History of Japan v03 - 1944-1953 (2014) (Digital) (XRA-Empire).cbz
Showa - A History of Japan v04 - 1953-1989 (2015) (Digital) (XRA-Empire).cbz
Path of the Assassin v09 - Battle for Power Part 1 (2008) (Digital) (Lovag-Empire).cbz
Path of the Assassin v10 - Battle for Power Part 2 (2008) (Digital) (Lovag-Empire).cbz
Extra Edge cases
Kaiju No. 8 v01 (2021) (Digital) (1r0n) (f).cbz -- volumes also show up as chapter 8
No. 5 v01 (2021) (Digital) (1r0n) (f2).cbz -- volumes also show up as chapter 5
No. 6 v1 (2015) (Digital) (ripper).cbz -- volumes also show up as chapter 6
ReZERO -Starting Life in Another World- Chapter 1 - A Day in the Capital v01 (2016) (Digital) (LuCaZ).cbz
ReZERO -Starting Life in Another World- Chapter 2 - A Week at the Mansion v01 (2017) (Digital) (LuCaZ).cbz
ReZERO -Starting Life in Another World- Chapter 3 - Truth of Zero v01 (2017) (Digital) (LuCaZ).cbz
Ripper name c1fi7
Any filename with (c1fi7)
or (Mr. Kimiko-c1fi7)
will always show up as the given volume but also as chapter 1, even though the ripper tag is inside parentheses. And yes, there are currently quite a few ripped manga with this ripper's tag.
Interestingly, with No. 6 v1 (2015) (Digital) (c1fi7).cbz
(an actual case) the volumes also show up as chapter 1; after removing this particular ripper's name, they show up as chapter 6 instead of 1.
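To illustrate how a tag like (c1fi7) could end up read as chapter 1, here is a minimal sketch. The `NAIVE_CHAPTER` pattern is a hypothetical stand-in, not Kavita's actual regex, and stripping parenthesised groups before matching is only one assumed mitigation.

```python
import re
from typing import Optional

# Hypothetical pattern, NOT Kavita's actual regex: a "c" followed by digits.
NAIVE_CHAPTER = re.compile(r"c(\d+)", re.IGNORECASE)

# Assumed mitigation: drop parenthesised groups (year, source, ripper tag) first.
PAREN_GROUP = re.compile(r"\([^)]*\)")


def naive_chapter(filename: str) -> Optional[str]:
    """Return the digits after the first 'c', wherever it occurs."""
    match = NAIVE_CHAPTER.search(filename)
    return match.group(1) if match else None


def chapter_ignoring_tags(filename: str) -> Optional[str]:
    """Same lookup, but with every (...) group removed beforehand."""
    return naive_chapter(PAREN_GROUP.sub("", filename))
```

With "A Bride's Story v02 (2011) (Digital) (c1fi7).cbz", `naive_chapter` picks up the "c1" inside the ripper tag, while `chapter_ignoring_tags` finds no chapter at all.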
Other parsing issues #1336
Ideally it would work something like this: if a volume marker (v | vol | volume) is found together with a loose (unmarked) number, and that number has no chapter marker (ch, chapter, c) and appears before the [first] volume marker, assume it is part of the series title. The convention would then be that unmarked chapter numbers must always appear after the volume marker, or else be marked.
Did some more tests with chapters only.
The following parse the chapters correctly without a chapter marker:
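The proposed rule could be sketched roughly as follows. All patterns and the `parse` helper are hypothetical and written only for illustration; Kavita's real parser is different, and chapter-only filenames are deliberately left out of scope.

```python
import re
from typing import Optional, Tuple

# Hypothetical patterns for illustration only; not taken from Kavita.
VOLUME = re.compile(r"\b(?:v|vol|volume)\.?\s*(\d+)\b", re.IGNORECASE)
MARKED_CHAPTER = re.compile(r"\b(?:c|ch|chapter)\.?\s*(\d+)\b", re.IGNORECASE)
LOOSE_NUMBER = re.compile(r"(?<![\w.])(\d+)(?![\w.])")
PAREN_GROUP = re.compile(r"\([^)]*\)")


def parse(filename: str) -> Tuple[str, Optional[str], Optional[str]]:
    """Return (series, volume, chapter) under the proposed rule."""
    # Drop the extension and every (...) group such as year or ripper tag.
    stem = PAREN_GROUP.sub("", filename.rsplit(".", 1)[0]).strip()
    vol = VOLUME.search(stem)
    if vol is None:
        return stem, None, None  # chapter-only names are out of scope here
    before, after = stem[: vol.start()], stem[vol.end():]
    # A marked chapter counts anywhere; a loose number only after the volume marker.
    chapter = MARKED_CHAPTER.search(stem) or LOOSE_NUMBER.search(after)
    # Everything before the first volume marker, loose numbers included, is the title.
    series = before.strip(" -")
    return series, vol.group(1), chapter.group(1) if chapter else None
```

Under this sketch, "Kaiju No. 8 v01 (2021) (Digital) (1r0n) (f).cbz" yields the series "Kaiju No. 8", volume "01" and no chapter. Note that numbers after the volume marker, as in the Showa or Path of the Assassin filenames, would still be read as chapters by this rule alone.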
4 Cut Hero 002 (Digital) (Cobalt001).cbz
4 Cut Hero 073 (Digital) (Cobalt001).cbz
4 Cut Hero 104 (Digital) (Cobalt001).cbz
Kaiju No. 8 030 (2021) (Digital) (1r0n).cbz
Kaiju No. 8 041 (2021) (Digital) (1r0n).cbz
Kaiju No. 8 070 (2022) (Digital) (anadius).cbz
12 Years Apart 009 (2021) (Digital) (1r0n).cbz
12 Years Apart 023 (2021) (Digital) (1r0n).cbz
12 Years Apart 031 (2021) (Digital) (1r0n).cbz
15 Minutes 002 (2018) (Digital) (repressedrage).cbz
15 Minutes 016 (2018) (Digital) (repressedrage).cbz
15 Minutes 025 (2018) (Digital) (repressedrage).cbz
However, the following does not, and shows up as chapter 20, all merged into one.
Twenty 20 005 (2021) (Digital) (YameteOnii-sama).cbz
Twenty 20 017 (2021) (Digital) (YameteOnii-sama).cbz
Twenty 20 034 (2022) (Digital) (YameteOnii-sama).cbz
The same goes for the following; all show up as chapter 5, merged into one.
Battle in 5 Seconds 047 (2021) (Digital).cbz
Battle in 5 Seconds 054 (2021) (Digital).cbz
Battle in 5 Seconds 059 (2021) (Digital).cbz
To resolve the latter two, I have started adding c (the chapter marker) to each chapter. So the general solution for a series with a number in its title and loose chapters is to always add a chapter marker, so the parser has no confusion.
Another tricky edge case: having 'Part x' in the sub-title of a volume.
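A small rename helper for this workaround might look like the sketch below. It assumes the chapter is the last standalone number before the first parenthesised group (or before the extension) and that the marker is written directly in front of the number, as in c005; both are assumptions, and `mark_chapter` is a hypothetical name.

```python
import re

# Assumption: the chapter is the standalone number directly followed by the
# first "(" group, or by the file extension when there are no groups.
LAST_NUMBER = re.compile(r"(?<![\w.])(\d+)(?=\s*(?:\(|\.\w+$))")


def mark_chapter(filename: str) -> str:
    """Prefix the loose chapter number with 'c'; marked names stay unchanged."""
    return LAST_NUMBER.sub(r"c\1", filename, count=1)
```

For example, "Twenty 20 005 (2021) (Digital) (YameteOnii-sama).cbz" becomes "Twenty 20 c005 (2021) (Digital) (YameteOnii-sama).cbz", while the 20 in the title is left alone.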
Path of the Assassin v09 - Battle for Power Part 1 (2008) (Digital) (Lovag-Empire).cbz
Path of the Assassin v10 - Battle for Power Part 2 (2008) (Digital) (Lovag-Empire).cbz
These show up as volume 9 and 10 but also as chapter 1 and 2. In this case, maybe a number preceded by 'part' should not be treated as a chapter. Usually it is part of the title or sub-title anyway and, afaik, not used as a chapter marker? An example of it being in the title:
Ascendance of a Bookworm - Part 01 v01 (2019) (Digital) (Ushi).cbz
...
Ascendance of a Bookworm - Part 02 v01 (2021) (Digital) (Ushi).cbz
Each volume of Part 1 also shows up as chapter 1, while each volume of Part 2 shows up as chapter 2.
In this case the part doesn't make sense as a chapter, but in webtoons I've seen parts used for chapter numbers. We'd need to investigate and prove it's not a common case, so we get the most benefit from filename parsing (especially since webtoons usually have even less metadata than manga).
Of all the manga I have, so far only Path of the Assassin has Part x in the sub-title. The issue is easily solved by just removing it from the filename; I just wanted to document that it can happen.
The only manga so far that have Part in the series title are Ascendance of a Bookworm, mimicking the light novel sub-titles, and JoJo's Bizarre Adventure.
I'll keep an eye on my webtoons, but so far I do not believe I have come across this; the officially translated rips have matched the naming scheme of ripped manga. Edit: I must say that it is a nyaa naming scheme, 'standard'-ish.
I did a quick test with ComicInfo. The first test was with
Kaiju No. 8 v01 (2021) (Digital) (1r0n) (f).cbz
<ComicInfo>
<Title>Kaiju No. 8 volume 1</Title>
<Series>Kaiju No. 8</Series>
<Count>2</Count>
<Volume>1</Volume>
</ComicInfo>
The result was that chapter 8 still shows up, in addition to volume 1. Count was only used to check that Kavita read the ComicInfo, by setting the series to completed.
A Bride's Story v02 (2011) (Digital) (c1fi7).cbz
<ComicInfo>
<Title>A Bride's Story volume 2</Title>
<Series>A Bride's Story</Series>
<Count>2</Count>
<Volume>2</Volume>
</ComicInfo>
It still showed up as chapter 1 due to the (c1fi7) in the filename, even with a ComicInfo.xml present. Removing (c1fi7) from the filename and rescanning the series removed the chapters as expected. This means that regardless of ComicInfo, the filename parser still tries to parse out chapters.
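One way to model the behaviour observed in these two tests is a per-field override: a ComicInfo tag replaces the parsed value only when that tag is present. The sketch below is not Kavita's code; the `merge` helper, the field names, and the mapping of Number onto the chapter are assumptions.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# Assumed mapping of ComicInfo tags onto the filename parser's fields.
TAG_TO_FIELD = (("Series", "series"), ("Volume", "volume"), ("Number", "chapter"))


def merge(parsed: Dict[str, Optional[str]], comicinfo_xml: str) -> Dict[str, Optional[str]]:
    """Overlay ComicInfo values on top of what the filename parser found."""
    root = ET.fromstring(comicinfo_xml)
    merged = dict(parsed)
    for tag, field in TAG_TO_FIELD:
        value = (root.findtext(tag) or "").strip()
        if value:  # only tags that are present and non-empty override
            merged[field] = value
    return merged
```

With the ComicInfo from the second test (Series, Count and Volume, but no Number), the volume is overridden while a chapter parsed from the filename survives, which matches what showed up in the library.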
It's not count, it's number. You used the wrong field.
I only used Count to have it show up as "completed" in Kavita, to make sure it read the ComicInfo.xml. I double-checked with the wiki: If you have at least one "Count" defined within any ComicInfo from the series, and it is not 0, then Kavita will assume the Series is Completed. Otherwise, it will be assumed Ongoing.
Or did you mean I should use Number instead of Volume? That would make less sense, as any perfectly normal series (without a number in its title) would then show up as both a volume and a chapter if Number is set in ComicInfo.
I misread your post. You're talking about adding ComicInfo without an issue number to check whether the issue number from the parser remains, which of course it does, since ComicInfo overrides what the filename parser finds.
Right, I thought that the filename parser was completely ignored if ComicInfo was present, and thus that using ComicInfo for [manga volumes] with a number in the series name would resolve the issue, which unfortunately it does not.