wana_kana_rust

Accept char iterator as input

Open tmfink opened this issue 5 years ago • 4 comments

Work-in-progress to handle #3.

Please let me know your thoughts on this style. So far I have only worked on katakana_to_hiragana(). Once we figure out the best way to do this, I will work on the other modules.

Add iterator input APIs to:

  • is_hiragana
    • [ ] is_hiragana()
  • is_japanese
    • [ ] is_japanese()
  • is_kana
    • [ ] is_kana()
  • is_kanji
    • [ ] contains_kanji()
    • [ ] is_kanji()
  • is_katakana
    • [ ] is_katakana()
  • is_mixed
    • [ ] is_mixed()
    • [ ] is_mixed_pass_kanji()
  • is_romaji
    • [ ] is_romaji()
  • to_hiragana
    • [ ] to_hiragana()
    • [ ] to_hiragana_with_opt()
  • to_kana
    • [ ] to_kana()
    • [ ] to_kana_with_opt()
  • to_katakana
    • [ ] to_katakana()
    • [ ] to_katakana_with_opt()
  • to_romaji
    • [ ] to_romaji()
    • [ ] to_romaji_with_opt()
  • tokenize
    • [ ] tokenize()
    • [ ] tokenize_detailed()
    • [ ] tokenize_with_opt()
  • trim_okurigana
    • [ ] is_invalid_matcher()
    • [ ] is_leading_without_initial_kana()
    • [ ] is_trailing_without_final_kana()
    • [ ] trim_okurigana()
    • [ ] trim_okurigana_with_opt()
  • utils
    • [ ] hiragana_to_katakana()
    • [X] katakana_to_hiragana()
    • [ ] romaji_to_hiragana()
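
For the checker functions in the list above, an iterator-input variant might look roughly like the sketch below. The names is_hiragana_iter and is_hiragana_str are hypothetical, and the range check is a simplification that only tests the U+3041..=U+3096 hiragana range; it is not the crate's actual rule.

```rust
/// Hypothetical iterator-based variant (name and rule are assumptions).
/// Returns true if the input is non-empty and every char falls in the
/// hiragana range U+3041..=U+3096.
pub fn is_hiragana_iter<I>(chars: I) -> bool
where
    I: IntoIterator<Item = char>,
{
    let mut saw_any = false;
    for c in chars {
        if !('\u{3041}'..='\u{3096}').contains(&c) {
            return false;
        }
        saw_any = true;
    }
    saw_any
}

/// A `&str` entry point can stay as a thin wrapper over the iterator
/// version, so existing callers would not need to change.
pub fn is_hiragana_str(input: &str) -> bool {
    is_hiragana_iter(input.chars())
}
```

Taking IntoIterator<Item = char> rather than Iterator lets callers pass input.chars(), a Vec<char>, or any other char source without extra adapters.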

tmfink avatar Sep 16 '20 15:09 tmfink

There is a place where the next char is peeked: https://github.com/PSeitz/wana_kana_rust/blob/c45ebcdf3e8acc6421e967e792374ac390947df8/src/to_hiragana.rs#L29-L31

So I think it would require a peekable iterator: https://doc.rust-lang.org/std/iter/struct.Peekable.html

There may be other similar cases.
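
A minimal sketch of the lookahead pattern with std::iter::Peekable, for reference. It does not reproduce the linked code in to_hiragana.rs; with_lookahead is a hypothetical helper that just pairs each char with the one after it.

```rust
use std::iter::Peekable;

/// Hypothetical helper: pairs each char with the char that follows it,
/// or `None` for the last one.
pub fn with_lookahead<I>(chars: I) -> Vec<(char, Option<char>)>
where
    I: Iterator<Item = char>,
{
    let mut iter: Peekable<I> = chars.peekable();
    let mut out = Vec::new();
    while let Some(current) = iter.next() {
        // `peek` returns a reference to the next item without consuming it,
        // so the following loop iteration still sees that item.
        let next = iter.peek().copied();
        out.push((current, next));
    }
    out
}
```

Calling with_lookahead("abc".chars()) yields ('a', Some('b')), ('b', Some('c')), ('c', None).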

PSeitz avatar Sep 16 '20 15:09 PSeitz

On second thought, we could always just use the minimal api instead of a uniform one, which would not require Peekable here. What do you think?

PSeitz avatar Sep 30 '20 15:09 PSeitz

> On second thought, we could always just use the minimal api instead of a uniform one, which would not require Peekable here. What do you think?

@PSeitz What do you mean by minimal vs. uniform API?

In some cases, we can get away with avoiding the "direct indexing" approach. For katakana_iter_to_hiragana_with_opt(), I added a previous_char variable to track the previous character (instead of indexing back to the previous position).
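
Roughly, the previous_char idea looks like the sketch below. This is a simplified illustration, not the code in the PR: kata_to_hira_sketch and shift_to_hiragana are hypothetical names, and the long-vowel rule (repeat the vowel when 'ー' follows a bare ア/イ/ウ/エ/オ) is a deliberately tiny stand-in for the real conversion rules.

```rust
/// Hypothetical, simplified katakana-to-hiragana conversion over a char
/// iterator. Not the crate's real rules.
pub fn kata_to_hira_sketch<I>(chars: I) -> String
where
    I: IntoIterator<Item = char>,
{
    let mut out = String::new();
    // State carried between iterations, replacing a lookup of `input[i - 1]`.
    let mut previous_char: Option<char> = None;

    for c in chars {
        let converted = match (c, previous_char) {
            // Assumed toy rule: a long vowel mark right after a bare
            // katakana vowel repeats that vowel in hiragana.
            ('ー', Some(prev)) if "アイウエオ".contains(prev) => shift_to_hiragana(prev),
            // Katakana U+30A1..=U+30F6 sits 0x60 above the matching hiragana.
            ('\u{30A1}'..='\u{30F6}', _) => shift_to_hiragana(c),
            // Everything else passes through unchanged.
            _ => c,
        };
        out.push(converted);
        // Remember the original (unconverted) char for the next iteration.
        previous_char = Some(c);
    }
    out
}

fn shift_to_hiragana(c: char) -> char {
    char::from_u32(c as u32 - 0x60).unwrap_or(c)
}
```

Because the only backward-looking state is one Option<char>, the function needs nothing beyond a plain forward iterator.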

tmfink avatar Oct 01 '20 04:10 tmfink

> @PSeitz What do you mean by minimal vs. uniform API?

I mean defining a single iterator API that all methods use, vs. a minimal API for each method.
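
One way to read that distinction, as a sketch only: all function names here are hypothetical and the katakana range check is rough. In the uniform style every function takes the same Peekable input even when it never peeks; in the minimal style each function asks only for the bound it actually needs.

```rust
use std::iter::Peekable;

// Rough katakana check for illustration only (assumed range).
fn is_katakana_char(c: char) -> bool {
    ('\u{30A1}'..='\u{30FC}').contains(&c)
}

/// Uniform style: the same `Peekable` input type everywhere, even though
/// this function never calls `peek`.
pub fn all_katakana_uniform<I>(mut chars: Peekable<I>) -> bool
where
    I: Iterator<Item = char>,
{
    chars.all(is_katakana_char)
}

/// Minimal style: this function only needs to walk the chars once, so it
/// accepts anything that can be turned into a char iterator.
pub fn all_katakana_minimal<I>(chars: I) -> bool
where
    I: IntoIterator<Item = char>,
{
    chars.into_iter().all(is_katakana_char)
}
```

With the uniform version the caller writes all_katakana_uniform(s.chars().peekable()); with the minimal version all_katakana_minimal(s.chars()) is enough, and only the functions that really look ahead would ask for Peekable.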

PSeitz avatar Oct 01 '20 06:10 PSeitz