
Text Analysis

Open Waterpine opened this issue 4 years ago • 2 comments

Is your feature request related to a problem? Please describe.

The goal of this issue is to enrich the text analysis of dataprep.eda.

Describe the solution you'd like

  1. plot(df, x): method="tfidf" shows the top-k keywords in column x. Time: 2021/01/19 - 2021/02/28
  2. plot(df, x): method="ngram" shows the top-k n-grams in column x. Time: 2021/01/19 - 2021/02/28
  3. plot(df, x, y): method="pca" reduces the dimension of the data using PCA and shows a scatter plot. Time: 2021/03/01 - 2021/03/19

All of the above functions will work if the x column contains text data. @jinglinpeng @dovahcrow
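For discussion, here is a minimal sketch of what items 1 and 2 could compute for a text column. It is not dataprep's implementation: the helper names `tokenize`, `top_k_ngrams`, and `top_k_tfidf` are hypothetical, the tokenizer is a simple regex, and the smoothed IDF formula is just one common variant. It uses only the Python standard library; the PCA item is not sketched here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def top_k_ngrams(texts, n=2, k=10):
    """Count word n-grams over all rows and return the k most frequent."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        # zip over n shifted copies of the token list yields each n-gram
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return [(" ".join(gram), count) for gram, count in counts.most_common(k)]


def top_k_tfidf(texts, k=10):
    """Sum each word's TF-IDF score over all rows and return the top k."""
    docs = [tokenize(t) for t in texts]
    n_docs = len(docs)
    # document frequency: number of rows that contain each word
    doc_freq = Counter(word for doc in docs for word in set(doc))
    scores = Counter()
    for doc in docs:
        if not doc:
            continue  # skip rows with no tokens
        for word, count in Counter(doc).items():
            # smoothed IDF (an assumption; other variants exist)
            idf = math.log((1 + n_docs) / (1 + doc_freq[word])) + 1
            scores[word] += (count / len(doc)) * idf
    return scores.most_common(k)


if __name__ == "__main__":
    column = ["the cat sat", "the cat ran", "a dog ran"]
    print(top_k_ngrams(column, n=2, k=3))
    print(top_k_tfidf(column, k=3))
```

The returned (term, score) pairs could then be passed to whatever bar chart or word cloud rendering the plot function uses.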

Describe alternatives you've considered

N/A

Additional context

N/A

Waterpine, Jan 18 '21 14:01

A reference from a previous issue: https://docs.google.com/document/d/1EQUBEgU_khNl51Z2FPv_4rUGziiZz8fYzXKvqsVXGFU/edit?usp=sharing

jinglinpeng, Jan 20 '21 00:01

@jinglinpeng n-gram frequency [image: Screenshot 2021-02-08 11.05.22 PM]

Waterpine, Feb 08 '21 15:02