PlexAniSync
AniDB vs Anilist - add support for Movies and wo/o naming differences
I went through my library and synced everything. I use x-jat names from AniDB and I noticed two naming patterns that should be straightforward to cover, saving a lot of work on custom mappings.
First Pattern - 'Movie'
AniDB Name | Anilist Name |
---|---|
Gekijouban Blood-C: The Last Dark | BLOOD-C: The Last Dark |
Gekijouban Mahouka Koukou no Rettousei: Hoshi o Yobu Shoujo | Mahouka Koukou no Rettousei: Hoshi wo Yobu Shoujo |
Gekijouban xxxHOLiC: Manatsu no Yo no Yume | xxxHOLiC: Manatsu no Yoru no Yume |
Gekijouban Dungeon ni Deai o Motomeru no wa Machigatte Iru Darouka: Orion no Ya | Dungeon ni Deai o Motomeru no wa Machigatte Iru Darouka: Orion no Ya |
PlexAniSync could recognise this word and make an extra attempt to match the title after removing 'Gekijouban ' (including the trailing space) from the string.
Another similar example is 'Eiga':
AniDB Name | Anilist Name |
---|---|
Eiga Crayon Shin-chan: Mononoke Ninja Chinfuuden | Crayon Shin-chan: Mononoke Ninja Chinfuuden |
Eiga Doraemon: Nobita no Little Star Wars 2021 | Doraemon: Nobita no Little Star Wars 2021 |
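To make the idea concrete, here is a minimal sketch of the prefix stripping. It is not PlexAniSync's actual code; the function name `strip_movie_prefix` and the regex are my own assumptions, and it only handles the two prefixes discussed here, anchored at the start of the title.

```python
import re
from typing import Optional

# Assumption: only these two prefixes matter, and only at the very start of the title.
MOVIE_PREFIX = re.compile(r"^(?:Gekijouban|Eiga) ")


def strip_movie_prefix(title: str) -> Optional[str]:
    """Return the title without its leading movie prefix, or None if it has none."""
    stripped, count = MOVIE_PREFIX.subn("", title, count=1)
    return stripped if count else None


print(strip_movie_prefix("Gekijouban Blood-C: The Last Dark"))
print(strip_movie_prefix("Eiga Crayon Shin-chan: Mononoke Ninja Chinfuuden"))
```

Returning None when nothing was stripped lets the caller skip the extra match attempt for titles that don't carry a prefix.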
Second Pattern - wo vs o
AniDB Name | Anilist Name |
---|---|
Hige o Soru. Soshite Joshikousei o Hirou. | Hige wo Soru. Soshite Joshikousei wo Hirou. |
Sono Bisque Doll wa Koi o Suru | Sono Bisque Doll wa Koi wo Suru |
Seishun Buta Yarou wa Yumemiru Shoujo no Yume o Minai | Seishun Buta Yarou wa Yumemiru Shoujo no Yume wo Minai |
Nakitai Watashi wa Neko o Kaburu | Nakitai Watashi wa Neko wo Kaburu |
Fune o Amu | Fune wo Amu |
AniDB almost universally uses o, while Anilist uses wo in titles. I don't know Japanese well enough to understand why...
PlexAniSync could catch ' o ' (an o with a space on each side) in the string and make an extra attempt to match the title after converting o into wo. Note that the top example in the table even contains o twice.
While some titles might genuinely use o in the title, I don't expect them to match a completely different title even if PlexAniSync converts an innocent o into wo.
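A rough sketch of the conversion, again as an assumption rather than anything PlexAniSync does today (`o_to_wo` is a made-up name). It only touches a lowercase o that has whitespace on both sides, and the lookarounds let it handle several o's in one title.

```python
import re
from typing import Optional

# A lowercase "o" with whitespace on both sides; lookarounds keep the spaces intact.
O_PARTICLE = re.compile(r"(?<=\s)o(?=\s)")


def o_to_wo(title: str) -> Optional[str]:
    """Return the title with every standalone ' o ' turned into ' wo ', or None if unchanged."""
    converted = O_PARTICLE.sub("wo", title)
    return converted if converted != title else None


print(o_to_wo("Hige o Soru. Soshite Joshikousei o Hirou."))
print(o_to_wo("Fune o Amu"))
```

The match is case-sensitive on purpose, so a capital O standing on its own is left alone.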
I got my hands on the AniDB titles .xml.gz file and did some top-level counting. I discarded all lines from the xml except those with lang="x-jat" and type="main".
I was left with 593 titles:
- 211 titles with 'Gekijouban '
- 261 titles with ' o ' (266 instances, so a few titles contain more than one o)
- 142 titles with 'Eiga '
The numbers don't add up to 593 because some titles combine Eiga or Gekijouban with o.
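For anyone who wants to repeat the counting, this is roughly how it could be done. It is a sketch under assumptions: I assume each title is a `<title>` element carrying `xml:lang` and `type` attributes, and the inline `SAMPLE` is a tiny stand-in built from titles mentioned in this issue (the synonym entry is a made-up placeholder), not real dump data.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Illustrative stand-in for the dump; the assumed layout is <anime><title .../></anime>.
SAMPLE = """<animetitles>
  <anime aid="14061">
    <title xml:lang="x-jat" type="main">Gekijouban Shirobako</title>
    <title xml:lang="x-jat" type="synonym">Eiga Placeholder Synonym</title>
  </anime>
  <anime aid="1532"><title xml:lang="x-jat" type="main">Hit o Nerae!</title></anime>
  <anime aid="17176"><title xml:lang="x-jat" type="main">Eiga Delicious Party Precure</title></anime>
  <anime aid="7306"><title xml:lang="x-jat" type="main">Komadori Eiga Komaneko</title></anime>
</animetitles>"""


def count_patterns(xml_text: str) -> dict:
    """Count x-jat main titles that contain each pattern as a plain substring."""
    counts = {"Gekijouban ": 0, " o ": 0, "Eiga ": 0}
    for title in ET.fromstring(xml_text).iter("title"):
        if title.get(XML_LANG) != "x-jat" or title.get("type") != "main":
            continue  # skip synonyms, official titles and other languages
        text = title.text or ""
        for pattern in counts:
            if pattern in text:
                counts[pattern] += 1
    return counts


print(count_patterns(SAMPLE))
```

Because this is a plain substring count, a title with 'Eiga ' in the middle is counted too. For the real file, the text could be read with `gzip.open(path, "rt", encoding="utf-8")` before parsing.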
I did this to do more data checks and to confirm the logic won't be harmful. I spotted some odd cases, please read on.
The o->wo rule
The overwhelming majority of examples would match perfectly if o became wo.
Some oddities:
- Hit o Nerae!, anidb: 1532 is https://anilist.co/anime/964/Smash-Hit/, which has 'Hit o Nerae' as a synonym on anilist (not sure if PlexAniSync checks synonyms?), so this is a case where wo-ing won't solve the name problem.
- The Big O (2003), anidb: 8941 is not, in fact, 'The Big Wo! (2003)', it is https://anilist.co/anime/567/The-Big-O/, but nothing bad will happen.
- (NSFW warning) Ore ga Kanojo o *su Wake, anidb: 13763 is https://anilist.co/anime/101015/Ore-ga-Kanojo-wo-Okasu-Wake/. The rule is valid here, since it is 'wo' on anilist, but it won't help because AniDB censored the word okasu. That is not a problem with the logic itself, though...
Gekijouban rule
Some medium disappointment here: I have to go back on my initial assumption.
Here are examples where the gekijouban-less title will match the TV show of the same name:
- Gekijouban Violet Evergarden, anidb: 14013 is https://anilist.co/anime/103047/Violet-Evergarden-Movie/. Since they weren't too inventive with the name, the stripped title will still not match the movie.
- Gekijouban Wakaokami wa Shougakusei!, anidb: 14011 is https://anilist.co/anime/101478/Wakaokami-wa-Shougakusei-Movie/
- Gekijouban Shirobako, anidb: 14061 is https://anilist.co/anime/101574/SHIROBAKO-Movie/
- Gekijouban Argonavis from Bang Dream!, anidb: 15977 is https://anilist.co/anime/128344/ARGONAVIS-from-BanG-Dream-Movie/
Funny outlier: Gekijouban Idol Bu Show, anidb: 17230 is https://anilist.co/anime/145916/IDOL-bu-SHOW-Movie/ but there's no TV show with that name.
Eiga rule
Not as many as in the Gekijouban case, but I can find similar issues.
Here are examples where the eiga-less title will match the TV show of the same name:
- Eiga Zannen na Ikimono Jiten, anidb: 16252 is https://anilist.co/anime/132804/Zannen-na-Ikimono-Jiten-The-Movie/
- Eiga Delicious Party Precure, anidb: 17176 is https://anilist.co/anime/144687/Delicious-PartyPrecure-Movie/
- Eiga no Osomatsu-san, anidb: 14293 is https://anilist.co/anime/104213/Osomatsusan-The-Movie/
Other oddities: Komadori Eiga Komaneko, anidb: 7306 proves that Eiga needs to be matched from the beginning of the string.
Summary
Wo-ing the titles seems safe and desirable.
While all the earlier examples from my own library would match the correct anilist title (after de-gekijoubaning or de-eigaing), there seem to be too many cases where plain stripping would cause problems, because the stripped title matches the TV show instead of the movie.
Instead, I think it's safer to attempt the following treatment:
- de-gekijouban or de-eiga the title
- add Movie and (Movie)
- try to match
- give up if nothing found
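The steps above could be sketched like this. Everything here is an assumption for illustration: `movie_candidates` and `match_movie` are made-up names, the lookup dict stands in for however PlexAniSync actually compares titles, and the AniList titles are guessed from the URL slugs quoted earlier.

```python
import re
from typing import Dict, List, Optional

MOVIE_PREFIX = re.compile(r"^(?:Gekijouban|Eiga) ")


def movie_candidates(title: str) -> List[str]:
    """Strip the leading prefix, then append 'Movie' and '(Movie)'."""
    stripped, count = MOVIE_PREFIX.subn("", title, count=1)
    if not count:
        return []  # no prefix, nothing extra to try
    return [f"{stripped} Movie", f"{stripped} (Movie)"]


def match_movie(title: str, anilist_titles: Dict[str, int]) -> Optional[int]:
    """anilist_titles: hypothetical map of lower-cased AniList title -> AniList id."""
    for candidate in movie_candidates(title):
        anilist_id = anilist_titles.get(candidate.lower())
        if anilist_id is not None:
            return anilist_id
    return None  # give up if nothing found


# Titles inferred from the AniList URL slugs, ids taken from the same URLs.
known = {"violet evergarden movie": 103047, "shirobako movie": 101574}
print(match_movie("Gekijouban Violet Evergarden", known))
print(match_movie("Gekijouban Shirobako", known))
```

The bare stripped title is never tried on its own, which is what keeps the movie from being matched to the TV show of the same name.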
I attach the file with the cleaned titles I used for the above research: https://gist.github.com/karpik123/760774de1a0a90156567d794a704e71a