Find the source for the creation of the SMART stopword list

Open EmilHvitfeldt opened this issue 4 years ago • 2 comments

The word list appears to be machine-generated, and I would like confirmation of that.

EmilHvitfeldt, Oct 13 '19 18:10

Currently using this citation:

@article{Lewis2014,
 author = {Lewis, David D. and Yang, Yiming and Rose, Tony G. and Li, Fan},
 title = {RCV1: A New Benchmark Collection for Text Categorization Research},
 journal = {J. Mach. Learn. Res.},
 issue_date = {12/1/2004},
 volume = {5},
 month = dec,
 year = {2004},
 issn = {1532-4435},
 pages = {361--397},
 numpages = {37},
 url = {http://dl.acm.org/citation.cfm?id=1005332.1005345},
 acmid = {1005345},
 publisher = {JMLR.org},
}

juliasilge, Jul 16 '20 21:07

Labeling this issue as "Stage 0: Now", but I have a feeling this piece of information may be lost to history or buried in internal documents.

I put out a public call for help at https://twitter.com/Emil_Hvitfeldt/status/1365466442863308801

EmilHvitfeldt, Feb 27 '21 01:02