Furiganaize
Kanji characters that have already been given furigana may be given furigana again
Thanks for creating such a useful tool.
I ran Furiganaize on the following text:
作成します。作ります。
Then, I got the following DOM:
<ruby><rb><ruby><rb>作</rb><rt style="">つく</rt></ruby>成</rb><rt style="">さくせい</rt></ruby>します。<ruby><rb>作</rb><rt style="">つく</rt></ruby>ります。
The problem here is duplication of ruby tags.
I have implemented a simple way to fix this.
The modification changes the above output as follows:
<ruby><rb>作成</rb><rt style="">さくせい</rt></ruby>します。<ruby><rb>作</rb><rt style="">つく</rt></ruby>ります。
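For illustration only, the intended effect can be sketched as follows. This is not the code from my modification and not how Furiganaize represents the DOM: the node model (a plain string for text, an object with base and reading for a ruby element) and the function names baseText, flattenRuby and toHtml are all hypothetical. The idea is simply that an outer ruby keeps its own reading and discards any readings nested inside its base text.

```javascript
// Hypothetical model: a string is plain text, and
// { base: [...nodes], reading: "..." } is one <ruby> element.

// Collect the base text of a node, ignoring any nested readings.
function baseText(node) {
  if (typeof node === "string") return node;
  return node.base.map(baseText).join("");
}

// Keep only the outermost annotation of each top-level ruby node.
function flattenRuby(nodes) {
  return nodes.map((node) =>
    typeof node === "string"
      ? node
      : { base: [baseText(node)], reading: node.reading }
  );
}

// Serialize the model in the same shape as the DOM shown above.
function toHtml(nodes) {
  return nodes
    .map((node) =>
      typeof node === "string"
        ? node
        : `<ruby><rb>${toHtml(node.base)}</rb><rt style="">${node.reading}</rt></ruby>`
    )
    .join("");
}

// The duplicated structure from this report, written in the model above.
const nested = [
  { base: [{ base: ["作"], reading: "つく" }, "成"], reading: "さくせい" },
  "します。",
  { base: ["作"], reading: "つく" },
  "ります。",
];

console.log(toHtml(flattenRuby(nested)));
```

In the real add-on the same idea would have to be applied to actual DOM nodes, or better, the nested annotation would be prevented from being generated in the first place.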
I am concerned that this modification will adversely affect other processes as I do not yet understand the overall processing flow of this program.
I would be happy if you could review it.
The solution above is not sufficient.
This cannot correctly furiganize the following sentence:
食べる。飲食店。
Output:
<ruby><rb>食</rb><rt style="">た</rt></ruby>べる。飲<ruby><rb>食</rb><rt style="">た</rt></ruby><ruby><rb>店</rb><rt style="">てん</rt></ruby>。
飲食店 should be いんしょくてん.
I think it may be possible to avoid this problem by processing long Kanji strings in preference to short Kanji strings.
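A minimal sketch of that idea, assuming readings are applied as simple surface-to-reading substitutions. The function annotateLongestFirst and the readings map are hypothetical; in the add-on the surfaces and readings would come from the igo.js output rather than a hand-written map. At each position the longest matching surface wins, and the scan then jumps past it, so a shorter entry such as 食 can never be applied inside 飲食店.

```javascript
// readings: Map of surface string -> reading (hypothetical input).
function annotateLongestFirst(text, readings) {
  // Try longer surfaces before shorter ones.
  const surfaces = [...readings.keys()].sort((a, b) => b.length - a.length);
  let out = "";
  let i = 0;
  while (i < text.length) {
    const match = surfaces.find((s) => text.startsWith(s, i));
    if (match) {
      out += `<ruby><rb>${match}</rb><rt style="">${readings.get(match)}</rt></ruby>`;
      i += match.length; // skip the whole match so it is never annotated twice
    } else {
      out += text[i];
      i += 1;
    }
  }
  return out;
}

const readings = new Map([
  ["食", "た"],
  ["店", "てん"],
  ["飲食店", "いんしょくてん"],
]);

console.log(annotateLongestFirst("食べる。飲食店。", readings));
```

This does not address whether the longest match is always the linguistically correct one; it only shows how preferring it avoids the partial annotation seen in the output above.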
Thanks for your detailed analysis of the behavior of this add-on.
- I'm not 100% sure that the longer match (作成(さくせい)) is always more correct than the shorter matches ((作(つく))(成(なる))), so... I didn't try to fix this before. At least as a non-native speaker, I feel this "bug" lets me learn the 読み方 of the individual, separated kanji... www
- I just ran some experiments on 「飲食店 飲食店 飲食店 飲食店 飲食店 食べる。飲食店。 食べる。飲食店。 食べる。飲食店。食べる。飲食店。」, and found this
- Actually, I don't clearly understand how to work with igo.js, because this package is merely forked from ilyalissoboi's FuriganaInjectorPlusPlus. I have mainly done bugfixes and UI improvements and added support for dynamic pages, so I have barely changed how igo.js handles the sentences in DOM nodes... (汗) So, sorry, I guess I cannot give any usable advice on this issue (on top of that, I really have no free time to debug this recently, too many tasks to get done every day...)...
- But if you're willing to dig into this bug, you're still very welcome! (If you want to do this, maybe consider adding an option in options_ui as an alpha-phase test, so that existing users are not affected? Just a suggestion, if this issue really needs fixing.)
Thank you for your reply.
- I see. I am actually not 100% sure that the longer-match approach always works either. It might be better to collect test sentences first. Also, your point that the readings of the individual decomposed kanji are instructive is an interesting perspective.
- Related to 1: do you mean that assigning multiple furigana to the same kanji would be a good way to study, even if some of the furigana candidates are incorrect?
- Yes. I greatly appreciate your time and input. I am currently finding the alignment between the original text in the DOM and the output of igo.js particularly difficult, but I will keep investigating the algorithm on my own for a while longer.
- Regarding your suggestion of adding an option for alpha-phase testing to options_ui, I will definitely try that. I will try to create a reference implementation that adds the longer-match approach mentioned above.
I've made a reference implementation for this issue: https://github.com/kuanyui/Furiganaize/pull/10#issuecomment-1159633625
The following example sentence, which I was trying to correct, appears to work well.
When the option is off:
When the option is on:
The text on the options settings screen is as follows:
Please let me know if I need to modify it.
After months of real-world testing, there has been no sign of any big problem.
Therefore this option has been enabled by default since v0.7.0, thanks again @inoueakimitsu!