Sean Massung

10 issues by Sean Massung

This is the first issue. I've labeled it as a comment by selecting the "Labels" button on the right side of this text box. Issues are cool because you can...

Comment

Re: discussion in #150 -- right now feature selection is classification-centric.

enhancement
feature-request

Create a filter that replaces all numbers with the same token. This is not an alpha filter; we want to represent that numbers occurred, while collapsing them all into one...
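One way to picture the requested filter is the minimal Python sketch below. It is not the project's filter API: the placeholder token `<num>`, the function name `number_filter`, and the rule for what counts as a number are all assumptions made for illustration.

```python
import re

# Hypothetical placeholder token; the issue does not name one.
NUM_TOKEN = "<num>"

# Assumed definition of "number": optionally signed integers, decimals,
# and comma-grouped values such as 1,234.5
_NUMBER = re.compile(r"^[+-]?(\d+(,\d{3})*(\.\d+)?|\.\d+)$")


def number_filter(tokens):
    """Yield tokens unchanged, except numeric tokens become NUM_TOKEN."""
    for tok in tokens:
        yield NUM_TOKEN if _NUMBER.match(tok) else tok
```

Unlike an alpha filter, which would drop `2014` entirely, this keeps a marker in the stream, so downstream features still record that a number occurred. Mixed tokens such as `v2` pass through untouched.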

Parse a list of TREC files with multiple tags per file, etc. Could support .gz TREC files.

(as referenced in #107)

enhancement
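A rough Python sketch of the idea follows, assuming the common TREC layout in which each document sits inside `<DOC>...</DOC>` and carries uppercase tags such as `<DOCNO>` and `<TEXT>`. The function names are hypothetical, and regex-based parsing is a simplification chosen for brevity, not the project's parser.

```python
import gzip
import re

# Assumed layout: <DOC> ... </DOC> blocks containing uppercase tags.
_DOC = re.compile(r"<DOC>(.*?)</DOC>", re.DOTALL)
_TAG = re.compile(r"<([A-Z0-9]+)>(.*?)</\1>", re.DOTALL)


def open_maybe_gzip(path):
    """Open a TREC file as text, decompressing it when the name ends in .gz."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "r", encoding="utf-8", errors="replace")


def parse_trec(text):
    """Return one dict per document, mapping tag name -> list of contents."""
    docs = []
    for body in _DOC.findall(text):
        fields = {}
        for tag, content in _TAG.findall(body):
            # A tag may repeat within a document, so collect every occurrence.
            fields.setdefault(tag, []).append(content.strip())
        docs.append(fields)
    return docs


def parse_trec_files(paths):
    """Parse a list of plain or gzipped TREC files, yielding documents."""
    for path in paths:
        with open_maybe_gzip(path) as handle:
            yield from parse_trec(handle.read())
```

Storing a list per tag keeps repeated tags (for example several `<TEXT>` sections) instead of overwriting them. Reading each file fully into memory is acceptable for a sketch; a streaming parser would be a better fit for large collections.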

`language_model` needs the ability to estimate from a corpus instead of requiring a `.arpa` file.

enhancement
feature-request
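To make "estimate from a corpus" concrete, here is a minimal Python sketch that counts bigrams and applies add-k smoothing. This is a deliberately simple stand-in: the issue does not say which estimator is wanted, and `train_bigram_lm`, its parameters, and the `<s>`/`</s>` padding symbols are assumptions for illustration.

```python
from collections import Counter


def train_bigram_lm(sentences, k=1.0):
    """Estimate P(word | prev) from tokenized sentences with add-k smoothing."""
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = {"</s>"}  # every symbol that can be predicted
    for sent in sentences:
        padded = ["<s>"] + list(sent) + ["</s>"]
        vocab.update(sent)
        for prev, word in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    v = len(vocab)

    def prob(word, prev):
        # Unseen bigrams still receive mass k / (count(prev) + k * v).
        return (bigram_counts[(prev, word)] + k) / (context_counts[prev] + k * v)

    return prob
```

For example, `prob = train_bigram_lm([["a", "b"], ["a", "c"]])` followed by `prob("b", "a")` gives the smoothed estimate for "b" following "a". A production estimator would more likely use a backoff scheme over higher-order n-grams.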

Use the output of a chunker to create features based on strings of words. This will be particularly useful when combined with the topic modeling algorithms to create phrase-based models.

enhancement
feature-request
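As an illustration of phrase features, the Python sketch below assumes the chunker emits IOB tags (`B-NP`, `I-NP`, `O`), a common chunking output format. The issue does not specify the chunker's output, and the function names and the `LABEL:phrase` feature naming are hypothetical.

```python
from collections import Counter


def phrases_from_iob(tagged):
    """Group (word, iob_tag) pairs into (label, phrase) tuples."""
    phrases = []
    current, label = [], None
    for word, tag in tagged:
        # An I- tag with the same label continues the open chunk.
        if tag.startswith("I-") and current and tag[2:] == label:
            current.append(word)
            continue
        if current:
            phrases.append((label, " ".join(current)))
            current, label = [], None
        if tag != "O":
            # A B- tag (or a stray I- tag) starts a new chunk.
            current, label = [word], tag[2:]
    if current:
        phrases.append((label, " ".join(current)))
    return phrases


def phrase_features(tagged):
    """Count lowercased phrase features such as 'NP:the quick fox'."""
    return Counter(
        f"{label}:{phrase.lower()}" for label, phrase in phrases_from_iob(tagged)
    )
```

The resulting counts can stand in for single-word term counts, which is what a phrase-based topic model would consume in place of unigrams.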