Dmitry Shpika

19 comments by Dmitry Shpika

Also, I thought it would be nice to add an optional comment to each romaji+kana record. For example, when a user enters a phrase starting with 「私は…」 when it's not necessary,...

A video is the best option for less technical people. I've read through the [AKASHA Knowledge Base](http://akasha.helpscoutdocs.com/) and still have some questions: 1. How do I get ETH? Do I...

I have the same problem all the time. When I open the IPFS log, I see lots of these: 15:45:17.681 ERROR  floodsub: error reading rpc from : connection reset...

Hello @rbleuse. Yes, that's possible, but it's quite a lot of work because of the sheer size of the multilingual JMdict file. (Languages other than English don't have separate files of their own.)...

Problems: - :warning: Existing libs for JSON Schema don't work with large files - [everit-org/json-schema](https://github.com/everit-org/json-schema) provides unhelpful validation messages because no source file location (line number) is given in error...

@fasiha Thank you! The last few months were a bit tough, but I'm doing okay. I'll get back to the project soon, when I have some spare weekends and energy.

Thank you for the link, it's very interesting. There are biases in both methods, and probably in every method. Specifically, if we use the proposed formula, we ignore the size...

Your example with the "哈" character makes a lot of sense. And actually, it reminds me of one more known issue: in the Twitter dataset we see characters used...

You can see an example of a processing method I used here: https://github.com/scriptin/twitter-kanji-frequency/blob/master/collect-data.js Basically, I did `text.replace(/[^\u4e00-\u9fff]+/g, '')` to get rid of everything except for desired characters. (Note the RegExp...
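
To make the filtering step concrete, here is a minimal sketch built around the same `text.replace(/[^\u4e00-\u9fff]+/g, '')` call. The function names `keepCjkUnified` and `countChars` and the sample string are made up for illustration; they are not taken from collect-data.js, and the counting step is only an assumption about how the filtered text might be used afterwards.

```javascript
// Hypothetical helper: the regular expression is the one quoted above,
// the function name is invented for this sketch.
function keepCjkUnified(text) {
  // Delete every run of characters outside U+4E00..U+9FFF
  // (the CJK Unified Ideographs block).
  return text.replace(/[^\u4e00-\u9fff]+/g, '');
}

// Hypothetical follow-up step: count occurrences of each remaining character.
function countChars(text) {
  const counts = new Map();
  for (const ch of keepCjkUnified(text)) {
    counts.set(ch, (counts.get(ch) || 0) + 1);
  }
  return counts;
}

// Made-up sample input mixing kanji, kana, Latin letters, and punctuation.
const sample = '私は日本語を勉強します! Hello 日本';
const kept = keepCjkUnified(sample);
const counts = countChars(sample);

console.log(kept);             // only the kanji survive
console.log(counts.get('日')); // number of times 日 appears in the sample
```

Kana, Latin letters, digits, whitespace, and punctuation all fall outside the range, so they are dropped along with everything else that is not in that one block.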

I will include the following Unicode blocks in the next version: **Basic** dataset versions: - [CJK Unified Ideographs](https://en.wikipedia.org/wiki/CJK_Unified_Ideographs_(Unicode_block)) - in the current version, that is the only Unicode block included...