Kanji characters that have been furiganized once may be furiganized more than once

Open inoueakimitsu opened this issue 2 years ago • 4 comments

Thanks for creating such a useful tool.

I ran Furiganaize on the following text:

作成します。作ります。

Then, I got the following DOM:

<ruby><rb><ruby><rb>作</rb><rt style="">つく</rt></ruby>成</rb><rt style="">さくせい</rt></ruby>します。<ruby><rb>作</rb><rt style="">つく</rt></ruby>ります。

The problem here is duplication of ruby tags.

I have implemented a simple way to fix this.

The modification changes the above output as follows:

<ruby><rb>作成</rb><rt style="">さくせい</rt></ruby>します。<ruby><rb>作</rb><rt style="">つく</rt></ruby>ります。
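For illustration only, here is a minimal sketch of how nested ruby of this kind could be flattened. It is not the actual patch: the helper name `flattenNestedRuby` is hypothetical, and it works on an HTML string with regular expressions, whereas the add-on operates on DOM nodes. It assumes ruby is nested at most one level deep inside an `<rb>`.

```javascript
// Hypothetical helper (not part of Furiganaize): collapse <ruby> elements
// nested inside an <rb>, keeping the inner base text and dropping the inner <rt>.

// An innermost ruby: its <rb> and <rt> contain plain text only.
const INNER_RUBY = /<ruby><rb>([^<]*)<\/rb><rt[^>]*>[^<]*<\/rt><\/ruby>/g;

// An <rb> whose content is a mix of plain text and innermost rubies.
const RB_WITH_CONTENT = new RegExp(
  "<rb>((?:[^<]|" + INNER_RUBY.source + ")*)</rb>",
  "g"
);

function flattenNestedRuby(html) {
  return html.replace(RB_WITH_CONTENT, (match, content) => {
    // Replace each nested ruby with just its base text.
    const flattened = content.replace(INNER_RUBY, "$1");
    return "<rb>" + flattened + "</rb>";
  });
}

// Usage with the DOM string from this report.
console.log(
  flattenNestedRuby(
    '<ruby><rb><ruby><rb>作</rb><rt style="">つく</rt></ruby>成</rb>' +
      '<rt style="">さくせい</rt></ruby>します。'
  )
);
```

A DOM-based version would walk `rb` elements and replace any child `ruby` with a text node instead of using regular expressions, which is more robust against attribute and whitespace variations.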

I am concerned that this modification will adversely affect other processes as I do not yet understand the overall processing flow of this program.

I would be happy if you could review it.

inoueakimitsu • Jun 16 '22 16:06

The solution above is not sufficient.

This cannot correctly furiganize the following sentence:

食べる。飲食店。

Output:

<ruby><rb>食</rb><rt style="">た</rt></ruby>べる。飲<ruby><rb>食</rb><rt style="">た</rt></ruby><ruby><rb>店</rb><rt style="">てん</rt></ruby>。

飲食店 should be いんしょくてん.

I think it may be possible to avoid this problem by processing long Kanji strings in preference to short Kanji strings.
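A minimal sketch of that "longer strings first" idea, under loud assumptions: the `READINGS` table and the function `annotateLongestFirst` are hypothetical, and a toy lookup table stands in for igo.js, which actually supplies the readings in the add-on. At each position the sketch tries the longest candidate substring first and only falls back to shorter ones.

```javascript
// Toy dictionary for illustration; the real add-on gets readings from igo.js.
const READINGS = new Map([
  ["食", "た"],
  ["店", "てん"],
  ["飲食店", "いんしょくてん"],
]);

// Greedy longest-match annotation: at each position, prefer the longest
// dictionary entry, so 飲食店 wins over the single kanji 食 and 店.
function annotateLongestFirst(text, readings) {
  const maxLen = Math.max(...[...readings.keys()].map((k) => k.length));
  let out = "";
  let i = 0;
  while (i < text.length) {
    let matched = false;
    for (let len = Math.min(maxLen, text.length - i); len > 0; len--) {
      const word = text.slice(i, i + len);
      const reading = readings.get(word);
      if (reading !== undefined) {
        out += `<ruby><rb>${word}</rb><rt>${reading}</rt></ruby>`;
        i += len;
        matched = true;
        break;
      }
    }
    if (!matched) {
      // No entry starts here: copy the character through unchanged.
      out += text[i];
      i += 1;
    }
  }
  return out;
}

console.log(annotateLongestFirst("食べる。飲食店。", READINGS));
```

With this toy table, 飲食店 is annotated as one unit with いんしょくてん, while 食 in 食べる still gets た. Whether longest-first is always the right choice for real text is a separate question that the sketch does not answer.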

inoueakimitsu • Jun 16 '22 16:06

Thanks for your detailed analysis of the behavior of this addon.

  1. I can't be 100% sure that the longer match (作成 (さくせい)) is always more correct than the shorter matches ((作(つく))(成(なる))), so I didn't try to fix this before. At least as a non-native speaker, I feel this "bug" lets me learn the 読み方 of more individual, separated kanji... www
  2. I just ran some experiments on 「飲食店 飲食店 飲食店 飲食店 飲食店 食べる。飲食店。 食べる。飲食店。 食べる。飲食店。食べる。飲食店。」, and found this: Screenshot_20220617_120332
  3. Actually, I don't clearly understand how igo.js works here, because this package is merely forked from ilyalissoboi's FuriganaInjectorPlusPlus. I have mainly done bugfixes and UI improvements and added support for dynamic pages, so I have barely changed how igo.js handles the sentences in DOM nodes... (汗) So, sorry, I guess I can't give any useful advice on this issue. (On top of that, I really have no free time to debug this at the moment; there are too many tasks to get done every day...)
  4. But if you're willing to dig into this bug, you're still very welcome to! (If you want to do this, maybe consider adding an option in options_ui for alpha-phase testing, so that existing users are not affected? Just a suggestion, if this issue really needs to be fixed.)

kuanyui • Jun 17 '22 04:06

Thank you for your reply.

  1. I see. I am not 100% sure that the longer-match approach always works either. It might be better to collect test sentences first. Also, your point that the readings of the individual decomposed kanji are instructive is an interesting perspective.

  2. Related to 1: are you saying that when multiple furigana are assigned to the same kanji, it is still a good way to study, even if some of the furigana candidates are incorrect?

  3. Yes. I greatly appreciate your time and input. I am currently finding the alignment between the original text in the DOM and the output of igo.js particularly difficult, but I will keep investigating the algorithm on my own for a while longer.

  4. Regarding your suggestion of adding an option to options_ui for alpha-phase testing, I would definitely like to try that. I will try to create a reference implementation that adds the longer-match approach mentioned above.

inoueakimitsu • Jun 19 '22 05:06

I've made a reference implementation for this issue: https://github.com/kuanyui/Furiganaize/pull/10#issuecomment-1159633625

The example sentence I was trying to fix now appears to be handled correctly.

When the option is off: image

When the option is on: image

The text on the options settings screen is as follows: image

Please let me know if I need to modify it.

inoueakimitsu • Jun 19 '22 07:06

After months of real-world testing, there has been no sign of any major problem.

Therefore, this option has been enabled by default since v0.7.0. Thanks again, @inoueakimitsu! Screenshot_20221108_000451

kuanyui • Nov 07 '22 16:11