smltar
Find the source for the creation of the SMART stopword list
It appears that the word list is machine-generated, and I would like confirmation of that.
Currently using:
@article{Lewis2014,
author = {Lewis, David D. and Yang, Yiming and Rose, Tony G. and Li, Fan},
title = {RCV1: A New Benchmark Collection for Text Categorization Research},
journal = {J. Mach. Learn. Res.},
issue_date = {12/1/2004},
volume = {5},
month = dec,
year = {2004},
issn = {1532-4435},
pages = {361--397},
numpages = {37},
url = {http://dl.acm.org/citation.cfm?id=1005332.1005345},
acmid = {1005345},
publisher = {JMLR.org},
}
I am labeling this issue as "Stage 0: Now", but I have a feeling this piece of information might be lost to history or buried in internal documents.
I put out a public call for help at https://twitter.com/Emil_Hvitfeldt/status/1365466442863308801