suggestomatic
Anonymized real world data set
The test data is great, but it makes it hard to test performance at the scale Suggestomatic was intended for. Internally at Causes we have a test set of about 900m records; we should obfuscate the group and user ids and release that large-scale data set.
Anonymizing data is hard. It's likely easier to just generate a larger test set with similar distributions.
The Comment & Close button is far too clickable.
Generating a large test set with similar distributions is also hard :)
Looking forward to chewing on it!
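For anyone who wants a starting point, here is a rough sketch of the "generate a larger test set" idea. It assumes each record is a (user_id, group_id) membership pair and that group popularity is heavy-tailed (Zipf-like); neither assumption is confirmed against the real Causes data. The function name, parameters, and the `skew` knob are all made up for illustration. In practice the weights would probably be replaced with a rank/frequency histogram measured from the internal set.

```python
import random
from itertools import accumulate


def generate_memberships(num_users, num_groups, num_records, skew=1.0, seed=0):
    """Yield synthetic (user_id, group_id) pairs.

    Hypothetical helper: group popularity follows an assumed Zipf-like
    shape, users are drawn uniformly. Duplicate pairs can occur.
    """
    rng = random.Random(seed)  # seeded so a generated set is reproducible
    group_ids = list(range(1, num_groups + 1))
    # The group at popularity rank r gets weight 1 / r**skew.
    cum_weights = list(accumulate(1.0 / (rank ** skew) for rank in group_ids))
    for _ in range(num_records):
        user_id = rng.randint(1, num_users)
        group_id = rng.choices(group_ids, cum_weights=cum_weights, k=1)[0]
        yield user_id, group_id


if __name__ == "__main__":
    # Small demo run; scale the three sizes up for a real benchmark.
    for user_id, group_id in generate_memberships(1000, 200, 10):
        print(user_id, group_id)
```

Because it is a generator, it can stream records straight to a file without holding the whole set in memory. Drawing users uniformly is almost certainly too simple; per-user membership counts are likely skewed as well.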