Support importing Yomichan style dictionaries.

Open xrishox opened this issue 3 years ago • 17 comments

Yomichan has a huge community of users that are making all sorts of interesting and useful dictionaries. They have stuff like frequency dictionaries, pitch accent dictionaries, common kanji reading heuristic dictionaries, monolingual dictionaries and plenty of other things. If you could support importing yomichan dictionaries you would be able to pick up all of this stuff basically for free without any other work on your side.

xrishox avatar Jan 02 '23 18:01 xrishox

Thanks for this idea! Sorry to take so long to respond.

Do you have details of the specific format you'd like to see supported? I can certainly see adding support for the frequency dictionaries being quite achievable.

At one point one of our Japanese engineers did a very thorough investigation into supporting the EPWING format but determined that all dictionary publishers stopped supporting it at least 10 years ago (some stopped supporting it as far back as 1998!). The currently available EPWING content is either generated from other sources or both old and illegal. We have a plan to support recently published dictionary content including monolingual dictionaries but it's still quite a way off unfortunately.

birtles avatar Jan 06 '23 08:01 birtles

Is there any update on monolingual support? Trying to transition to it currently.

raayu83 avatar Aug 26 '23 00:08 raayu83

Hi! I'm afraid it's still a while off but it's definitely on the roadmap. Thank you for your patience!

birtles avatar Aug 26 '23 00:08 birtles

Thanks for your answer! Definitely looking forward to this! Will this also be included in the iOS version?

raayu83 avatar Aug 26 '23 20:08 raayu83

Yes, everything we do will be for all platforms including iOS.

birtles avatar Aug 28 '23 00:08 birtles

I think that using EPWING dictionaries is the most straightforward way.

One way of "circumventing" the legality issue is to just allow users to sideload the dictionaries, as Yomichan did. Maybe via URL in order to prevent issues with iOS.

danpaldev avatar Sep 20 '23 19:09 danpaldev

+1, having this implemented might even make it possible for me to transition to Safari on desktop since yomichan is the only reason I really need a chromium browser now

AuroraWright avatar Oct 12 '23 04:10 AuroraWright

Are there any updates on this feature?

mbugti04 avatar Aug 08 '24 21:08 mbugti04

Honestly, Yomichan dictionaries are pretty simple. Since the content is HTML after parsing, it should be pretty easy to render; you might even be able to just fork that section of Yomichan without too much effort.
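To illustrate the rendering step, here is a minimal Python sketch that turns one parsed entry into an HTML fragment. The function name `render_entry_html` and the markup are hypothetical, not taken from Yomichan or 10ten. It assumes glossary items are plain strings; some dictionaries may use richer glossary items, which this sketch simply skips.

```python
from html import escape


def render_entry_html(expression, reading, glossary):
    """Render one dictionary entry as a small HTML fragment (illustrative markup)."""
    # Escape everything that came from the dictionary file before
    # placing it inside markup.
    head = f"<ruby>{escape(expression)}<rt>{escape(reading)}</rt></ruby>"
    # Assumption: only plain-string glossary items are rendered here.
    items = "".join(
        f"<li>{escape(item)}</li>" for item in glossary if isinstance(item, str)
    )
    return f'<div class="entry">{head}<ol>{items}</ol></div>'


if __name__ == "__main__":
    print(render_entry_html("猫", "ねこ", ["cat", "a <small> animal"]))
```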

aramrw avatar Aug 09 '24 11:08 aramrw

Sorry, no updates here yet. It's still very much on the roadmap—it's one of my three major goals for this year but unfortunately progress has slowed a lot recently since I had my first child earlier in the year. Thanks for your patience and sorry again.

birtles avatar Aug 13 '24 05:08 birtles

just curious what are the other 2 major goals for the year?

xrishox avatar Aug 20 '24 19:08 xrishox

  1. Lookup bar / omnibox support (maybe sidebar too)
  2. Flashcard generation (for use with Anki etc.)

Although currently I'm working on the kanji stroke animation feature.

birtles avatar Aug 21 '24 06:08 birtles

Hi @birtles

First of all congratulations on the birth of your first child, that's very exciting!

Regarding support for Yomichan dictionaries and exporting flash cards to Anki, I would like to provide you with some basic information to help you out. Neither of these features would be difficult to implement; you could knock them both out in an afternoon if you're already familiar with either of them. My hope in writing this comment is to give you a head start and save you some research time.

Regarding Yomichan dictionaries, they are very simple. The dictionaries come as a zip file which you upload into Yomichan. If you unpack that zip file, it contains a collection of JSON files, and inside of those are just arrays which contain the information on each word.

Here is an example:

[Image: IMG_2158]

*The dictionary I used for this example includes example sentences as part of the definition. That's not standard; it's just a feature of this particular dictionary.

Yomichan simply parses the JSON data and displays it in HTML format. There are many different dictionaries that you can use, but they all follow the same formatting, so it should be easy to implement.
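As a rough Python sketch of the parsing step: the code below reads every `term_bank_*.json` file out of a dictionary zip and keeps three fields per row. The file naming and the column order (expression, reading, definition tags, rules, score, glossary, sequence, term tags) are my understanding of the version 3 term bank layout and should be checked against a real dictionary. The helpers `load_terms` and `make_sample_zip` are hypothetical names, and the sample data is made up so the sketch runs without any download.

```python
import io
import json
import zipfile


def load_terms(zip_bytes):
    """Read every term_bank_*.json file from a dictionary zip."""
    terms = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in sorted(archive.namelist()):
            if not (name.startswith("term_bank_") and name.endswith(".json")):
                continue
            for row in json.loads(archive.read(name)):
                # Assumed column order: expression, reading, definition tags,
                # rules, score, glossary, sequence, term tags.
                terms.append({
                    "expression": row[0],
                    "reading": row[1],
                    "glossary": row[5],
                })
    return terms


def make_sample_zip():
    """Build a tiny in-memory dictionary (made-up data) for demonstration."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "index.json",
            json.dumps({"title": "Sample", "format": 3, "revision": "1"}),
        )
        archive.writestr("term_bank_1.json", json.dumps([
            ["猫", "ねこ", "n", "", 0, ["cat"], 1, ""],
            ["食べる", "たべる", "v1", "v1", 0, ["to eat"], 2, ""],
        ]))
    return buffer.getvalue()


if __name__ == "__main__":
    for term in load_terms(make_sample_zip()):
        print(term["expression"], term["reading"], term["glossary"])
```

A real importer would more likely store the rows in a local database than hold them in a list, but the reading loop would look much the same.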

Yomichan has some other dictionary formats, such as frequency dictionaries, which merely display a number to represent the frequency of the word. The specifics vary depending on the dictionary, but typically the lower a number is, the more frequent the word is (there are a handful where this is the opposite, but that has absolutely nothing to do with the code and is a matter of how the dictionary was compiled).
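A hedged Python sketch of handling frequency data: as far as I know, frequency rows live in `term_meta_bank_*.json` files as `[term, "freq", data]`, where `data` may be a bare number, an object with a `value` key, or an object that nests the number under `frequency` next to a reading. Treat those shapes as assumptions to verify against real dictionaries. The function names and the `lower_is_more_frequent` flag are hypothetical; the flag covers the handful of dictionaries that rank the other way round.

```python
def frequency_value(data):
    """Return a numeric frequency from a 'freq' payload, or None if absent."""
    # Assumed shape 1: {"reading": ..., "frequency": <number or object>}
    if isinstance(data, dict) and "frequency" in data:
        data = data["frequency"]
    # Assumed shape 2: {"value": <number>, "displayValue": <string>}
    if isinstance(data, dict):
        data = data.get("value")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(data, bool):
        return None
    return data if isinstance(data, (int, float)) else None


def sort_by_frequency(rows, lower_is_more_frequent=True):
    """Return (term, frequency) pairs, most frequent first."""
    pairs = [(row[0], frequency_value(row[2])) for row in rows if row[1] == "freq"]
    pairs = [(term, value) for term, value in pairs if value is not None]
    return sorted(pairs, key=lambda pair: pair[1], reverse=not lower_is_more_frequent)


if __name__ == "__main__":
    sample_rows = [  # made-up numbers
        ["猫", "freq", 1200],
        ["犬", "freq", {"value": 800, "displayValue": "800"}],
        ["食べる", "freq", {"reading": "たべる", "frequency": 150}],
    ]
    print(sort_by_frequency(sample_rows))
```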

I would recommend that you download the Yomichan edition of Jmdict, extract it and start poking around. I would imagine you could have a rudimentary version of this up and running within an hour.

There is a community known as The Moe Way (their website is LearnJapanese.moe) and users there have a large collection of dictionaries that they share. If you consult their collection you should be able to see all of the different formats such as frequency dictionaries, grammar dictionaries, kanji dictionaries, etc. Oh yeah, some dictionaries also have images!

--

As for Anki, that is also easy. There is an add-on for Anki called Anki Connect which exposes Anki on port 8765. Anki card creation (whether through Yomichan or other applications which use Anki Connect) simply sends the relevant data over that way. For the params, you need to pass in the deck name, the note type (users in Anki can define custom note types), and the fields along with the data that will be passed into those fields. These should all be defined by the user in settings before they try to make cards, as it is all dependent on their Anki deck settings.

Like Yomichan dictionaries, this is very simple. I would recommend having a look at the official documentation, it should give you a better idea.

https://git.foosoft.net/alex/anki-connect
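For a rough idea of what such a request looks like, here is a Python sketch built around Anki Connect's `addNote` action. The helper names `build_add_note_request` and `send_to_anki` are hypothetical, and the deck name, note type, and field names in the usage example are placeholders that would come from the user's settings. `send_to_anki` only works when Anki is running with the add-on installed.

```python
import json
import urllib.request


def build_add_note_request(deck_name, note_type, fields, tags=()):
    """Build the JSON body for Anki Connect's addNote action."""
    return {
        "action": "addNote",
        "version": 6,
        "params": {
            "note": {
                "deckName": deck_name,
                # Anki Connect refers to the note type as "modelName".
                "modelName": note_type,
                "fields": dict(fields),
                "tags": list(tags),
            }
        },
    }


def send_to_anki(request_body, url="http://127.0.0.1:8765"):
    """POST a request to Anki Connect and return its result."""
    payload = json.dumps(request_body).encode("utf-8")
    http_request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(http_request, timeout=5) as response:
        reply = json.load(response)
    # Anki Connect replies with {"result": ..., "error": ...}.
    if reply.get("error") is not None:
        raise RuntimeError(reply["error"])
    return reply["result"]


if __name__ == "__main__":
    body = build_add_note_request(
        "Mining", "Basic", {"Front": "猫", "Back": "cat"}, tags=["10ten"]
    )
    print(json.dumps(body, ensure_ascii=False, indent=2))
```

Keeping the request builder separate from the network call makes it easy to unit-test the payload without Anki running.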

Anki Connect will work for users on Windows and Mac, and there is even a version for Android phones, but you will need to figure out another solution for iPhone users as there isn't a version of Anki Connect on iOS (and I don't think it's possible to make one?). The way that the Immersion Reader app has gotten around this is that cards are simply exported as CSV files, which can be imported directly into the Anki app for iOS. Immersion Reader allows users to save their cards into a database and then export that database as a CSV whenever they're ready to add the words to Anki. This would probably be the easiest approach, but you wouldn't be able to include things like images, so if you want it to be as fully featured as Yomichan is on Android then you will either need to talk with the Anki dev and figure out a solution (he can extend functionality of the Anki app if possible) or come up with something else, like exporting as a proprietary file type and importing that via a custom Anki extension on desktop (the Immersion Reader dev was considering something similar).
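The CSV route could look something like the following Python sketch. It is not how Immersion Reader does it (I haven't seen its code); `cards_to_csv` and the field names are hypothetical, and it assumes one row per card with columns in the same order as the fields of the target note type, with no header row.

```python
import csv
import io


def cards_to_csv(cards, field_order):
    """Serialize saved cards to CSV text, one row per card."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for card in cards:
        # Missing fields become empty cells so the columns stay aligned.
        writer.writerow([card.get(name, "") for name in field_order])
    return buffer.getvalue()


if __name__ == "__main__":
    saved_cards = [
        {"Front": "猫", "Back": "cat, feline"},
        {"Front": "犬"},
    ]
    print(cards_to_csv(saved_cards, ["Front", "Back"]), end="")
```

The `csv` module quotes any cell containing a comma, so definitions with commas in them survive the round trip.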

Supporting Anki card creation on iOS is absolutely possible, but it's somewhat limited and will require some creative thinking. Any solution that you implement would be more than welcome by iOS users as Immersion Reader is currently the only tool available (and web browsing with 10Ten is a much better experience). Worst case scenario you're limited to text.

--

I hope this helps! There is enough documentation on both of these that even ChatGPT can write solutions for you. Earlier this week I had ChatGPT write a Python script for parsing through subtitle files and generating cards in Anki, so you could try consulting it if you need any more help.

itapun avatar Sep 20 '24 13:09 itapun

if ankiconnect was implemented in a way that you could set the ip/port, then on ios you could just connect to your desktop ankiconnect over the lan/vpn. i think both of these would be very good ideas. it lets you piggyback on all of the infrastructure that has already been built rather than needing to rebuild it yourself. it should also increase compatibility across tools.
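A small Python sketch of what a user-configurable endpoint might look like. `AnkiConnectSettings` is a hypothetical settings object, not something that exists in 10ten; the defaults mirror the local port 8765 mentioned above, and the LAN address in the usage example is a placeholder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnkiConnectSettings:
    """User-editable location of the Anki Connect server."""
    host: str = "127.0.0.1"
    port: int = 8765

    def url(self):
        # Reject obviously invalid ports before any request is attempted.
        if not 1 <= self.port <= 65535:
            raise ValueError(f"invalid port: {self.port}")
        return f"http://{self.host}:{self.port}"


if __name__ == "__main__":
    # Default: Anki on the same machine.
    print(AnkiConnectSettings().url())
    # Placeholder address for a desktop reached over the LAN or a VPN.
    print(AnkiConnectSettings(host="192.168.1.20").url())
```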

xrishox avatar Sep 21 '24 13:09 xrishox

@xrishox cc @birtles

Thank you for reminding me! I completely forgot, but Anki Connect CAN be exposed to the local network to do exactly what you just described. There is an iOS app called Kantan Manga which uses OCR to allow users to look up words from manga in Yomichan format dictionaries and a few years ago they implemented a mining function into their beta version which did exactly as you just described. Cards were created and stored locally, and then could be synced with the Desktop version of Anki later when you get home. It required adjusting one line in the Anki Connect settings on Desktop (extremely easy). This also allowed images to be passed into one of the card fields.
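The "store locally, sync later" flow could be sketched in Python roughly as follows. This is not Kantan Manga's implementation, which I haven't seen; `PendingNotes` is a hypothetical class, and `send` stands for whatever function delivers one note to the desktop. Network failures are assumed to surface as `OSError` (which covers `ConnectionError` and `urllib.error.URLError`).

```python
from collections import deque


class PendingNotes:
    """Hold notes on the device until the desktop can be reached."""

    def __init__(self):
        self._queue = deque()

    def add(self, note):
        self._queue.append(note)

    def __len__(self):
        return len(self._queue)

    def flush(self, send):
        """Send queued notes in order; stop at the first network failure.

        Returns the number of notes delivered. Undelivered notes stay
        queued for the next attempt.
        """
        sent = 0
        while self._queue:
            try:
                send(self._queue[0])
            except OSError:
                break  # desktop unreachable; try again later
            self._queue.popleft()
            sent += 1
        return sent


if __name__ == "__main__":
    pending = PendingNotes()
    pending.add({"Front": "猫", "Back": "cat"})
    delivered = []
    print(pending.flush(delivered.append), len(pending))
```

A note is only removed from the queue after `send` returns, so a dropped connection mid-sync does not lose cards.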

Considering that I used this function extensively, I can't believe I forgot that was a possibility! Unfortunately that dev became busy in his personal life and that feature never made it into the version on the app store.

itapun avatar Sep 21 '24 14:09 itapun

I guess this is a very low priority issue but I'm hoping to see more coming from this soon.

itapun avatar Nov 12 '24 06:11 itapun