HIGUCHI Koichi / 樋口耕一 comments

Results: 22 comments of HIGUCHI Koichi / 樋口耕一

Number of labeled training samples needed to decide whether a model update is required when running Bayesian learning in production

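Hello, this is Higuchi, the developer. Thank you for posting. I think this is not simply a matter of numbers: if the new data has the same content as the earlier data, the old training data can classify it as is, but if the content has changed, the training data has to be updated as well. If that is the case, it may be better to check whether the data has changed by using analysis methods other than Bayesian learning, too. I am not well versed in how much training data to create, whether as an absolute amount or as a proportion, so I think you will have to look for papers and other literature. If you find something good, please share it here.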

Treemap for cumulatively sentence/paragraph/(sub)chapter counting

Hi, KH Coder does not have the treemap functionality. I think you can output tables from KH Coder, then use other apps to create the treemap with the tables. Actually,...

Treemap for cumulatively sentence/paragraph/(sub)chapter counting

Nice! Here are some SQL examples:

sentences per chapter:

```SQL
SELECT h1_id, COUNT(*) as sentences FROM bun GROUP BY h1_id
```

sentences per sub chapter:

```SQL
SELECT h1_id, h2_id, COUNT(*)...
```
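To try this kind of per-chapter count without a KH Coder project at hand, here is a minimal Python sketch. It rests on several assumptions: the `bun` table is a toy stand-in holding only `h1_id` and `h2_id` (the real table has more columns), the rows are made up, SQLite is used only so the sketch runs on its own, and the sub-chapter query is a guess at a natural grouping, not the query from the truncated comment above.

```python
import sqlite3

# Toy stand-in for the sentence table: one row per sentence.
# Only the columns needed for the counts are modeled (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bun (id INTEGER PRIMARY KEY, h1_id INTEGER, h2_id INTEGER)"
)
rows = [(1, 1), (1, 1), (1, 2), (2, 1), (2, 1), (2, 1)]  # made-up data
conn.executemany("INSERT INTO bun (h1_id, h2_id) VALUES (?, ?)", rows)

# Sentences per chapter (same shape as the query quoted above).
per_chapter = conn.execute(
    "SELECT h1_id, COUNT(*) AS sentences FROM bun "
    "GROUP BY h1_id ORDER BY h1_id"
).fetchall()

# Sentences per sub-chapter (hypothetical grouping by chapter and sub-chapter).
per_subchapter = conn.execute(
    "SELECT h1_id, h2_id, COUNT(*) AS sentences FROM bun "
    "GROUP BY h1_id, h2_id ORDER BY h1_id, h2_id"
).fetchall()

for h1_id, h2_id, sentences in per_subchapter:
    print(f"chapter {h1_id}, sub-chapter {h2_id}: {sentences} sentences")
```

The resulting rows are already in the hierarchical shape (parent, child, size) that treemap tools usually expect, so they can be exported as a table and charted elsewhere.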

Treemap for cumulatively sentence/paragraph/(sub)chapter counting

Very nice! Many thanks for sharing with us. (Didn't know Excel can create treemaps)

new manual

Yes, we have to update the manual. But right now, I can't say when. Sorry for the inconvenience. Please post questions here when you find the outdated section of the...

new manual

There is no built-in function to calculate the TF-IDF. So you have to manually calculate it. Go to "Project", "Export", "Word Frequency List (for Excel)" in the menu. Then you...
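As an illustration of what a manual calculation might look like, here is a small Python sketch. It assumes you already have, for each word, a term frequency (TF) and a document frequency (DF) plus the total number of documents; whether the exported list provides exactly these columns is an assumption, and the counts below are invented. The formula used, `tf * log(N / df)`, is one common TF-IDF variant and not necessarily the one described in the truncated comment above.

```python
import math


def tf_idf(tf, df, n_docs):
    """Return tf * log(n_docs / df) per word (one common TF-IDF variant)."""
    return {word: tf[word] * math.log(n_docs / df[word]) for word in tf}


# Hypothetical counts, e.g. copied from an exported frequency list.
tf = {"data": 10, "model": 4, "the": 50}   # occurrences of each word
df = {"data": 5, "model": 2, "the": 10}    # documents containing each word
n_docs = 10                                # total number of documents

scores = tf_idf(tf, df, n_docs)

# Words that appear in every document get a score of zero.
for word, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{word}\t{score:.3f}")
```

The same arithmetic can be done in a spreadsheet with one formula per row, which may be simpler if the list is already open in Excel.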

new manual

It uses the `LDA` function of the ["topicmodels" package](https://cran.r-project.org/web/packages/topicmodels/index.html). The [vignette](https://cran.r-project.org/web/packages/topicmodels/vignettes/topicmodels.pdf) and the [reference manual](https://cran.r-project.org/web/packages/topicmodels/topicmodels.pdf) of the package may help. Most options remain default except for using Gibbs sampling and setting the seed...

Spanish translation of interface messages

Thank you for providing Spanish translations! If you make a "pull request", your account name will be shown on the [Contributors](https://github.com/ko-ichi-h/khcoder/graphs/contributors) page. If I just copy your translations into the source...

Spanish translation of interface messages

No, unfortunately there is no Spanish manual.

Correspondence analysis, external variables, and color coding

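Hello, this is Higuchi, the developer. Thank you for posting.

> When I use an external variable in correspondence analysis and plot by "writer", I believe the current behavior is that the plot is split into two groups.

I am not sure what you mean by "split into two groups". I would expect some words to be plotted around the middle between the two groups as well; is that not sufficient? If you do not mind, could you paste the actual analysis results here?

> However, some essays in the NN group are close to native writing, so splitting the plot into two groups prevents me from seeing how the individual data points are distributed. I think it would be good to have an option that plots as in correspondence analysis without an external variable while color-coding by the external variable, so that the distribution of each essay becomes visible. What do you think?

KH Coder currently plots words. Do you mean that you would like each individual essay to be plotted as its own dot? Individual essays can indeed be color-coded easily by an external variable. Is your idea that, once the essays are color-coded, the results would show an area where native speakers' essays cluster, an area where non-native speakers' essays cluster, and an area where both are mixed?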