scan icon indicating copy to clipboard operation
scan copied to clipboard

questions from vectorizer.py

Open hustmonk opened this issue 10 years ago • 1 comments

It's a good job.

When I read the code, I find some questions as follows:

(1) self.initial_vectorizer = CountVectorizer(ngram_range=(1, 2), min_df = 3 / len(input_text), max_df=.4) need to change 3 to 3., or min_df = 0

(2) For every term, you compute fisher_exact, and choose the terms with the higher pval. maybe you should choose the lower pval.

hustmonk avatar Jul 02 '14 09:07 hustmonk

Sorry for taking so long to reply! Must have missed this issue.

You're right on both counts -- thanks for noticing. I'll fix soon.

VikParuchuri avatar Apr 10 '15 23:04 VikParuchuri