Code to Create `presidential_speech` data set

Open michaelweylandt opened this issue 7 years ago • 3 comments

Add code to (re)create the `presidential_speech` data set. This will allow the data set to be regenerated as more speeches are recorded. (If we update the "official" version in the package, we may need to version it somehow.)

michaelweylandt • Sep 15 '18 17:09

Do we have a preference for where this should be in the project directory structure?

The code has an R script (used for text processing), but also uses Python for web scraping and some Bash scripts for moving things around. In total, it's really a small set of directories.

Do we want to add it as a .zip, for example, or copy over the whole directory?

jjn13 • Sep 17 '18 02:09

By default, I'd put it in `inst` (i.e., random files that get installed but not used by the package), but I think that the `data-raw` directory is also a common choice. See http://r-pkgs.had.co.nz/data.html#data-extdata

michaelweylandt • Sep 17 '18 02:09

Sounds good, I've submitted it as https://github.com/DataSlingers/clustRviz/pull/40

jjn13 • Sep 17 '18 03:09