
Support additional kanji sequences such as KKLD or RTK Lite

Open fabd opened this issue 8 years ago • 9 comments

Since RTK 6th edition support was added in Dec 2014 the website can, in theory, support any arbitrary sequence of kanji for the Study pages.

It just happens that RTK 5th and 6th editions are very similar, and the site has been focused on RTK. With a bit of work it's possible to remove some of the hard coded designs (like the Progress page including RTK 1 and RTK 3).

Related forum thread

Implementation (draft)

  • The RTK Edition page in Account > Settings would be renamed and moved to a more visible place.

  • The kanji sequence is now a goal, and corresponds to a progress bar displayed on the Home page. For example, RTK Lite: [ progress bar ..... ] 560 of 1000

  • The Study pages can be browsed in sequence order (already implemented).

  • The default goal would be RTK 6th edition for new accounts, or the user is presented with the choice on the Home page. So first thing when they sign up, on the Home page where the goal progress bar shows, it would say: Choose your goal: (RTK, RTK Lite, KKLD, etc.).

  • What's pretty awesome is that if a user completed, say, "RTK Lite", when that goal is 100% complete they can switch to "RTK" and that goal will now be 50% complete.
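A minimal sketch of how goal progress could be derived, assuming a goal is just a set of kanji and progress is the share of that set the user has already learned. The function name, the `GOALS` dictionary and the tiny sequences are made up for illustration; they are not the site's actual code or data.

```python
def goal_progress(goal_kanji, learned_kanji):
    """Return (done, total, percent) for one goal."""
    goal = set(goal_kanji)
    done = len(goal & set(learned_kanji))
    total = len(goal)
    percent = 100 * done // total if total else 0
    return done, total, percent


# Hypothetical, tiny sequences; real goals hold hundreds or thousands of kanji.
GOALS = {
    "RTK Lite": "一二三",
    "RTK": "一二三四五六",
}

learned = set("一二三")  # the user has finished the smaller goal

lite = goal_progress(GOALS["RTK Lite"], learned)
full = goal_progress(GOALS["RTK"], learned)
print("RTK Lite: %d of %d (%d%%)" % lite)
print("RTK: %d of %d (%d%%)" % full)
```

Because progress is computed from the learned kanji rather than stored per goal, switching goals needs no data migration in this sketch.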

Breakdown

Phase 1

First rough implementation: user can select the additional sequence, and then should be able to navigate Study in sequence order.

  • [x] Gather the data into a spreadsheet: SEQUENCE NR (starts at 1), KANJI (and/or UCS-2 code), KEYWORD (optional, won't be used in first implementation). Update: Google Spreadsheet, thanks to Katsuo.
    • [ ] Load that up into the kanjis table. Each index is a column named like idx_<label>, and SQL queries basically map to one of these columns in JOINs. As long as there are just 4-5 sequences built in, I think the indexes are fine.
  • [ ] Add the sequence to the RTK Edition page (a third radio button)
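The loading step could be sketched roughly as below, assuming the spreadsheet is exported as CSV with two columns (sequence number, kanji). The function names, the `rtklite` label and the `%s` placeholder style (as used by common MySQL drivers for Python) are assumptions for illustration only.

```python
import csv
import io


def read_sequence(csv_text):
    """Parse 'seq_nr,kanji' rows into (seq_nr, ucs_id) pairs, validating as we go."""
    pairs = []
    for expected_nr, row in enumerate(csv.reader(io.StringIO(csv_text)), start=1):
        seq_nr, kanji = int(row[0]), row[1]
        # The sequence must start at 1 and have no gaps.
        if seq_nr != expected_nr:
            raise ValueError(f"expected sequence nr {expected_nr}, got {seq_nr}")
        # ucs_id is a 16-bit column, so only single UCS-2 characters fit.
        if len(kanji) != 1 or ord(kanji) > 0xFFFF:
            raise ValueError(f"row {seq_nr}: not a single UCS-2 character")
        pairs.append((seq_nr, ord(kanji)))
    return pairs


def update_statement(label):
    """Build the UPDATE for a hypothetical idx_<label> column."""
    # Column names cannot be bound as parameters, so restrict the label strictly.
    if not (label.isascii() and label.isalnum()):
        raise ValueError("label must be ASCII alphanumeric")
    return f"UPDATE kanjis SET idx_{label} = %s WHERE ucs_id = %s"


sample = "1,一\n2,二\n3,三\n"
pairs = read_sequence(sample)
sql = update_statement("rtklite")
```

Each pair in `pairs` matches the placeholder order of `sql`, so the list could be handed to a driver's `executemany`.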

Phase 2

Involves changing the concept of RTK Edition to a more generic and useful concept of KANJI GOAL.

A controversial change here involves ditching the builtin / hardcoded RTK Lessons. This helps expand the site's usefulness by making it less RTK centric. Not all kanji goals will have lessons, and designing lessons is something I can't do. It could also cause potential copyright issues, since adding a lessons breakdown from a book is a step beyond just supporting a sequence of kanji.

  • [ ] Rename the "RTK Edition" page to something like "Kanji Goal". (Now RTK 5th / 6th editions are two builtin goals.)
  • [ ] Remove "Check Progress" page entirely (controversial, to be discussed)
  • [ ] Display the kanji goal (ie. "RTK Lite") + progress bar on Home screen.

Phase 3

Improve the flow for new users:

  • [ ] A new user will be presented with the choice of KANJI GOAL on the home page, right after their first sign in. This will take them to the Kanji Goal page (ie. currently "RTK Edition"), where they will likely pick between 5th / 6th RTK edition, but could also select RTK Lite, or KKLD, etc.

Next Steps

  • Discuss which sequence to start with: KKLD or RTK Lite? (vote below!)
  • Gather the data in a spreadsheet online

fabd avatar Feb 10 '17 11:02 fabd

Some thoughts:

  1. Would love to see a movie method index (though it's already too late for me to use that).
  2. A link to a page where each index is explained would be a plus for new users.
  3. Goals could have checkpoints (they could be hardcoded or reside in a table that's only queried when loading the "Check Progress" page)... but it'd still suffer from the copyright problem.

faneca avatar Feb 12 '17 07:02 faneca

A link to a page where each index is explained would be a plus for new users.

That will be handled by the equivalent of today's "RTK Edition" page. The page where you pick the sequence is where they will be explained.

Goals could have "checkpoints"

By copyright problem I assume you mean using "lessons" if there are any in other books / methods? Are there lessons in KKLD?

Actually I've been contemplating removing lessons altogether to simplify the Study page header for mobile, as well as make the site more flexible for the other sequences. Not much use in defaulting to a single lesson sequence of hundreds of characters when we don't have a built in lesson. (eg. RTK Vol. 3 is just lesson 57 on the website...)

But you gave me an idea... Why not just arbitrarily slice up the sequence into smaller chunks? Hence, checkpoints. While they are less meaningful than the ones from Heisig, which are based on introducing primitives, they would still work as motivation.

We could let the user pick their desired "checkpoint" threshold. For example: 10, 15, 20. If someone wants to try to study 10 a day, they could use a 10 kanji checkpoint. Or they can pick one based on their pace.

Those checkpoints only make sense when studying in sequence. But then again the point of adding more sequences is so you don't need to jump back and forth anymore (eg. RTK Lite).
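A rough sketch of the checkpoint idea, assuming checkpoints are simply fixed-size slices of the sequence numbers 1..N and the user picks the slice size. Both function names are hypothetical.

```python
def checkpoints(sequence_length, size):
    """Split positions 1..sequence_length into (first, last) ranges of `size` kanji."""
    if size < 1:
        raise ValueError("checkpoint size must be at least 1")
    return [
        (start, min(start + size - 1, sequence_length))
        for start in range(1, sequence_length + 1, size)
    ]


def current_checkpoint(position, size):
    """Return the 1-based checkpoint number that contains a sequence position."""
    return (position - 1) // size + 1


ranges = checkpoints(25, 10)          # last checkpoint is shorter
number = current_checkpoint(11, 10)   # position 11 opens the second checkpoint
```

Nothing needs to be stored per sequence here: the ranges follow from the sequence length and the user's chosen threshold.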

fabd avatar Feb 12 '17 15:02 fabd

Yes, I was talking about the same concept as "lessons", really, while having on my mind a broader one (for some of the methods don't have "lessons" as such). Sorry for the confusion (but glad that gave you an idea ¬_¬; )

faneca avatar Feb 13 '17 19:02 faneca

I'd suggest, as an alternative Kanji sequence, the WaniKani sequence: https://www.wanikani.com/api

WaniKani's service is reading-only: there is no way to be prompted with the keywords and practice writing. They have no current plans of adding a writing component, so I think many WK users would use koohii to complement their study, so that they have a writing SRS (koohii) and a reading SRS (WK).

jjannone avatar Mar 02 '17 13:03 jjannone

@jjannone Would I be authorized to use their sequence? I have no idea about it. Should check out the site. We would need a data sheet with the index > kanji (or UCS code).

fabd avatar Mar 02 '17 16:03 fabd

Hi Fabrice,

I can find out about authorization to use the sequence if you’d like; they do provide an API; I included a link to it in my initial post.

The API can download their sequence as JSON, chapter by chapter; below are the first few Kanji in their “chapter 10.”

John

{"user_information":{"username":"Jannone","gravatar":"8d530181ecfdabb5bf72869daa6d3231","level":4,"title":"Turtles","about":"","website":"http://jann.one","twitter":"J_J_A_J","topics_count":0,"posts_count":0,"creation_date":1480733262,"vacation_date":null},"requested_information":[{"character":"農","meaning":"farming, agriculture","onyomi":"のう","kunyomi":null,"important_reading":"onyomi","level":10,"nanori":null,"user_specific":null},{"character":"鳴","meaning":"chirp","onyomi":"めい","kunyomi":"な","important_reading":"kunyomi","level":10,"nanori":null,"user_specific":null},{"character":"集","meaning":"collect, gather","onyomi":"しゅう","kunyomi":"あつ.まる","important_reading":"onyomi","level":10,"nanori":null,"user_specific":null},{"character":"酒","meaning":"alcohol","onyomi":"しゅ","kunyomi":"さけ, さか","important_reading":"onyomi","level":10,"nanori":null,"user_specific":null},{"character":"速","meaning":"fast","onyomi":"そく","kunyomi":"はや.い","important_reading":"onyomi","level":10,"nanori":null,"user_specific":null},{"character":"業","meaning":"business","onyomi":"ぎょう","kunyomi":null,"important_reading":"onyomi","level":10,"nanori":null,"user_specific":null},{"character":"院","meaning":"institution","onyomi":"いん","kunyomi":null,"important_reading":"onyomi","level":10,"nanori":null,"user_specific":null}, ...


jjannone avatar Mar 02 '17 19:03 jjannone

Re: WaniKani sequence

Okay I had a quick look. So if I understand there are 60 "levels", equivalent to RTK "lessons"?

However I reviewed a few radicals and couldn't test the kanji, but I'm guessing the grid with all the purple boxes and characters in it is what the sequence is.

This makes me realize I shouldn't ditch the concept of lessons or "levels" since that can be a helpful marker for users to find their way around.

To add WaniKani sequence eventually I need the data in a sheet (csv/tabs) form: index_nr, kanji (or UCS-2 code), lesson. I otherwise have too many things on my plate atm, so I can't invest time figuring out their JSON data. But, there may already be a spreadsheet somewhere.
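For what it's worth, converting that JSON into the sheet form could look something like the sketch below. It relies only on the `requested_information`, `character` and `level` fields visible in the sample above; numbering the kanji in the order they arrive is an assumption, since the sample does not say whether that order is the intended study order. The function name is made up.

```python
import csv
import io
import json


def to_csv(payloads):
    """Flatten per-level JSON payloads into 'index_nr,kanji,lesson' CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    index_nr = 0
    for payload in payloads:
        for item in payload["requested_information"]:
            index_nr += 1  # assumed: running number in arrival order
            writer.writerow([index_nr, item["character"], item["level"]])
    return out.getvalue()


# Cut-down sample with just the two fields the sheet needs.
sample = json.loads(
    '{"requested_information": ['
    '{"character": "農", "level": 10}, '
    '{"character": "鳴", "level": 10}]}'
)
csv_text = to_csv([sample])
print(csv_text, end="")
```

Passing the payloads for all levels in ascending order would give one continuous index across the whole sequence.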

fabd avatar Mar 02 '17 22:03 fabd

So if I understand there are 60 "levels", equivalent to RTK "lessons"?

Correct.

grid with all the purple boxes and characters in it is what the sequence is.

Yes.

lessons or "levels" since that can be a helpful marker for users to find their way around.

Definitely — helps one stay in sync across multiple systems.

To add WaniKani sequence eventually I need the data in a sheet (csv/tabs) form: index_nr, kanji (or UCS-2 code), lesson. I otherwise have too many things on my plate atm, so I can't invest time figuring out their JSON data. But, there may already be a spreadsheet somewhere.

I’ll ask them about licensing, see if there is a spreadsheet, and, if need be, I can parse the JSON.

All the best,

John


jjannone avatar Mar 02 '17 22:03 jjannone

Sounds good.

This is relatively easy to implement at first and to test, although I am not sure my solution is great.

When the new RTK edition came out, I had to update all SQL queries. The solution I ended up using is to refer to a different column name:

CREATE TABLE `kanjis` (
  `ucs_id`       SMALLINT UNSIGNED NOT NULL,
  `keyword`      CHAR(32) NOT NULL DEFAULT '',
  `kanji`        CHAR(1) NOT NULL DEFAULT '',
  `onyomi`       VARCHAR(50) NOT NULL DEFAULT '',
  `idx_olded`    SMALLINT UNSIGNED NOT NULL,
  `idx_newed`    SMALLINT UNSIGNED NOT NULL,
  `lessonnum`    TINYINT UNSIGNED NOT NULL,
  `strokecount`  TINYINT UNSIGNED NOT NULL,
  PRIMARY KEY (`ucs_id`),
  UNIQUE KEY `idx_olded` (`idx_olded`),
  UNIQUE KEY `idx_newed` (`idx_newed`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

Either idx_olded or idx_newed is referenced in JOINs. The column itself is optimized as a SMALLINT. However it's not ideal to have an index that covers ~ 2000 of the 20000+ rows in this table.

Still, for the time being we could potentially add idx_kkld and idx_rtklite (for example) without impacting performance too much. Each such index will add 2 bytes x 20000 rows, i.e. about 40 KB of column data. Currently this table is ~700 KB ... compared to the 700 MB stories table it's quite small :blush:

What's neat though, is that a flashcard and story unique identifier is based on user id + UCS code. Hence, if the user switches from one index to another, only the displayed indices are affected. Internally, flashcard and story references are unaffected, and will be mapped to whatever index is in use.
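To illustrate the idea, here is a small self-contained sketch using Python's built-in sqlite3 as a stand-in for MySQL. The `flashcards` table, its `userid` column, the helper name and the index values are all hypothetical; only `kanjis`, `ucs_id`, `idx_olded` and `idx_newed` come from the schema above.

```python
import sqlite3

# Only known index columns may be interpolated into SQL.
ALLOWED_INDEXES = {"idx_olded", "idx_newed"}


def flashcards_in_order(db, userid, index_column):
    """List a user's flashcards as (displayed index, kanji) for the chosen sequence."""
    if index_column not in ALLOWED_INDEXES:
        raise ValueError(f"unknown index column: {index_column}")
    sql = (
        f"SELECT k.{index_column}, k.kanji FROM flashcards f "
        f"JOIN kanjis k ON k.ucs_id = f.ucs_id "
        f"WHERE f.userid = ? ORDER BY k.{index_column}"
    )
    return db.execute(sql, (userid,)).fetchall()


db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE kanjis (ucs_id INTEGER PRIMARY KEY, kanji TEXT,
                     idx_olded INTEGER, idx_newed INTEGER);
CREATE TABLE flashcards (userid INTEGER, ucs_id INTEGER,
                         PRIMARY KEY (userid, ucs_id));
""")
# Made-up index values, chosen so the two sequences visibly differ.
db.executemany("INSERT INTO kanjis VALUES (?, ?, ?, ?)",
               [(0x4E00, "一", 1, 2), (0x4E8C, "二", 2, 1)])
db.executemany("INSERT INTO flashcards VALUES (?, ?)",
               [(7, 0x4E00), (7, 0x4E8C)])

old_order = flashcards_in_order(db, 7, "idx_olded")
new_order = flashcards_in_order(db, 7, "idx_newed")
```

The flashcard rows are never touched when switching; only the column named in the JOIN changes, which is why checking the column name against a fixed set matters.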

fabd avatar Mar 03 '17 16:03 fabd