mark padgham comments

Results: 619 comments of mark padgham

overlap with sbmR package?

Coincidence: https://twitter.com/bikesRdata/status/1232383846689247232. And yeah, the portability for web assembly is a very strong argument for your way of doing things.

Text search complexity vs dependencies

Yeah, that is definitely something I'm aware of, and that would be one workable solution. The big advantage of "proper" text processing is (in this context) really just the stemming,...

Text search complexity vs dependencies

I've been pondering the scale of this issue. In short: It needs another package because there's been so much amazing development on [text analysis in R](https://www.tidytextmining.com/tidytext.html) that has ultimately led...

Text search complexity vs dependencies

See also rOpenSci's own [`tokenizers` package](https://github.com/ropensci/tokenizers), which uses the [`SnowballC` package](https://cran.r-project.org/web/packages/SnowballC/index.html) for the hard work.

Text search complexity vs dependencies

@jonocarroll thoughts here please. This code tokenizes the Description texts of all R packages using 3 different packages for the task: ...

Text search complexity vs dependencies

Yeah, I agree, and actually realised that `flipper` could simply pre-store/cache the tokenized versions of package descriptions anyway, entirely avoiding the speed issue. I'll sketch out a `tokenizers` solution here...
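
A minimal sketch of what that caching idea might look like, assuming the `tokenizers` package is installed. The `descriptions` vector, the `cache_file` path, and the helpers `get_tokens()` and `search_packages()` are all hypothetical names made up for illustration, not anything from `flipper` itself:

```r
library(tokenizers)  # assumed installed; relies on SnowballC for the stemming

# Hypothetical stand-in for the package descriptions; a real run might pull
# them from tools::CRAN_package_db() instead.
descriptions <- c(
  pkgA = "Tools for tokenizing and stemming text",
  pkgB = "Stochastic block models for network data"
)

# Hypothetical cache location, used only for this sketch.
cache_file <- file.path(tempdir(), "description-tokens.rds")

# Tokenize and stem once, then reuse the stored result on later calls.
get_tokens <- function(descriptions, cache_file) {
  if (file.exists(cache_file)) {
    return(readRDS(cache_file))
  }
  tokens <- tokenize_word_stems(descriptions)
  names(tokens) <- names(descriptions)
  saveRDS(tokens, cache_file)
  tokens
}

# Stem the query the same way, then keep packages containing every query stem.
search_packages <- function(query, tokens) {
  q <- tokenize_word_stems(query)[[1]]
  hits <- vapply(tokens, function(tk) all(q %in% tk), logical(1))
  names(tokens)[hits]
}

tokens <- get_tokens(descriptions, cache_file)
search_packages("stemmed texts", tokens)
```

Only the short query gets tokenized at search time, so the per-search cost no longer depends on which tokenizer built the cache.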

Custom class (?)

yeah, that's a solid idea, especially for the ease of a `print` method

it works!

The only extra info that is directly available via GH API v4 that is likely useful here would be [`labels`](https://developer.github.com/v4/object/labelconnection/), but it would of course also be easy to trawl...

it works!

Oh no, whoops, the `labels` I flagged are actually just the issue labels. What I meant was, of course, [`repositoryTopics`](https://developer.github.com/v4/object/repositorytopic/).
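
For reference, a rough sketch of how `repositoryTopics` could be requested from the v4 API, assuming the `httr` package is installed and a token is stored in the `GITHUB_GRAPHQL_TOKEN` environment variable. The helper names `topics_query()` and `repo_topics()` are hypothetical, and the example repository is only an illustration:

```r
# Build a GraphQL query for the topics of one repository.
# `n` caps the number of topics requested.
topics_query <- function(owner, name, n = 10L) {
  sprintf(
    'query { repository(owner: "%s", name: "%s") {
       repositoryTopics(first: %d) { nodes { topic { name } } } } }',
    owner, name, as.integer(n)
  )
}

# Send the query and return the topic names as a character vector.
# The token variable name is an assumption carried over from this thread.
repo_topics <- function(owner, name,
                        token = Sys.getenv("GITHUB_GRAPHQL_TOKEN")) {
  resp <- httr::POST(
    "https://api.github.com/graphql",
    httr::add_headers(Authorization = paste("bearer", token)),
    body = list(query = topics_query(owner, name)),
    encode = "json"
  )
  httr::stop_for_status(resp)
  nodes <- httr::content(resp, as = "parsed")$data$repository$repositoryTopics$nodes
  vapply(nodes, function(x) x$topic$name, character(1))
}

# Only hit the network when a token is actually available.
if (nzchar(Sys.getenv("GITHUB_GRAPHQL_TOKEN"))) {
  print(repo_topics("ropensci", "tokenizers"))
}
```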

PAT details

Thanks for checking all that out - that naming as `GITHUB_GRAPHQL_TOKEN` was just copied directly from [`ghrecipes`](https://github.com/ropenscilabs/ghrecipes/blob/master/R/zzz.R#L24). I'm not (yet) sure of the details of what is necessary in terms...