Capture reprex directly from GitHub issue or SO post
Intended usage: reprex_invert("rstats-db/DBI#153")
Slightly related: I wouldn't mind if RStudio opened a new file with the results as a side effect.
That's a nice idea!
Possible name: reprex_scrape()
Do you agree you are most likely to just dump the browser URL in as input? E.g.
reprex_scrape("https://github.com/hadley/readxl/issues/102#issue-78459154")
URL should be fine. I can imagine a use case where I get a GH issue number out of thin air, e.g. a test comment, but then I could also call 102 %>% gh_issue_url %>% reprex_scrape
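For concreteness, a minimal sketch of that fictional gh_issue_url() helper (nothing like it exists in reprex or gh today; the default repo is hard-coded purely for illustration):
# Hypothetical helper: turn a bare issue number into a full issue URL
gh_issue_url <- function(number, repo = "tidyverse/reprex") {
  paste0("https://github.com/", repo, "/issues/", number)
}

library(magrittr)
102 %>% gh_issue_url()
#> [1] "https://github.com/tidyverse/reprex/issues/102"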
+1 for opening a new file with the results
I'd be happy to take a stab at this issue. I have two questions though:
- Are you OK with reprex having a dependency on rvest? If not, I'll include it as an optional dependency using the approach described by Hadley here.
- Regarding the API, I was thinking that reprex_scrape() should have two params: url, which specifies where to get the reprex from, and file, specifying where the scraped reprex should be written to. What do you think? Also, I'm assuming that reprex_scrape() should just write the reprex to a .R file and not actually run it?
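For reference, a sketch of the signature the second question seems to imply (purely illustrative; reprex_scrape() is not implemented anywhere at this point, so every name here is provisional):
reprex_scrape <- function(url,           # where to get the reprex from
                          file = NULL) { # where to write the scraped code
  # fetch the post, invert it back to runnable code, write it to `file`;
  # per the question above, stop there rather than rendering it
}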
I would probably do something like this:
- Tackle GitHub first.
- Parse the URL and get relevant identifiers. This would use the completely fictional ghurl package or, in the meantime, borrow useful regex-y code from one of the packages listed in the issues. We have implemented this no fewer than 6 times in other packages 😐
- Use gh to retrieve the issue comment, i.e. use the API vs actual scraping. Consider putting gh in Suggests. (This step and the previous one are sketched after this list.)
- Use as much existing reprex logic and interface as possible to dump it to the usual places (clipboard, outfile, return value).
- Think about whether it is feasible to auto-apply existing "undo" functions. Or let the user do that next, maybe with a reminder of which one to use?
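A rough sketch of those two steps, assuming the gh package is available; parse_issue_url() is a stand-in for the fictional ghurl package, not real code:
# Stand-in for the fictional ghurl package: pull identifiers out of an
# issue URL with a regex
parse_issue_url <- function(url) {
  m <- regmatches(
    url, regexec("github\\.com/([^/]+)/([^/]+)/issues/(\\d+)", url)
  )[[1]]
  list(owner = m[2], repo = m[3], number = as.integer(m[4]))
}

x <- parse_issue_url("https://github.com/hadley/readxl/issues/102")
# Retrieve the issue via the API rather than scraping HTML
issue <- gh::gh(
  "/repos/:owner/:repo/issues/:number",
  owner = x$owner, repo = x$repo, number = x$number
)
body_lines <- strsplit(issue$body, "\r\n")[[1]]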
> Use as much existing reprex logic and interface as possible to dump it to the usual places (clipboard, outfile, return value).
So just to be clear, reprex_scrape() would be pulling the reprex code from GitHub/SO and running that code as a reprex straight away (i.e., it wouldn't simply write the reprex code to a file)? If that's the case, then I think it would be most natural to just consider GitHub issue/SO post URLs to be a special type of input for reprex() (e.g., reprex(input = "https://github.com/tidyverse/reprex/issues/67")), instead of having a separate function to render reprexes from these sources.
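A sketch of what that special-casing could look like; resolve_input() is a made-up name, and the fetching half reuses the gh-based approach sketched above (GitHub only, for simplicity):
# Made-up dispatcher: treat URL-shaped input as "fetch this issue first",
# anything else as ordinary reprex input
resolve_input <- function(input) {
  is_url <- length(input) == 1 && grepl("^https?://github\\.com/", input)
  if (!is_url) {
    return(input)
  }
  path <- sub("^https?://github\\.com", "/repos", input)
  issue <- gh::gh(path)
  strsplit(issue$body, "\r\n")[[1]]
}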
I like the idea of using gh instead of scraping.
I suspect it might be tricky to isolate the reprex from, say, the entire issue comment and "just" run it. That's why I assumed the first draft of this would dump contents into a file and open it. Seems unavoidable that you'll need reprex_invert() or similar.
That's going to be an intermediate step anyway, so I guess you'll find out 😄
Agree that your suggestion of accepting a URL as input is very slick. I just don't know if it can be made to work.
#199 has a first pass at implementing reprex_scrape() for GitHub issues. A few comments/questions:
- I noticed a teeny issue where reprex_undo() strips sequences of 4 leading spaces from marked-up code regardless of the venue, when it should probably only be doing that when the venue is SO (a guess at the fix is sketched after this list):
https://github.com/tidyverse/reprex/blob/0eec4b6a17acd65ff28daa8f64dd749ac6b4050a/R/reprex-undo.R#L180
This means that some formatting (e.g., the indentation of head() below) will be lost when inverting reprexes from non-SO venues:
library(reprex)
reprexed <- reprex({
  library(magrittr)
  mtcars[, 1:2] %>%
    head(n = 1)
})
#> Rendering reprex...
#> Rendered reprex is on the clipboard.
print(reprex_invert(reprexed))
#> Clean code is on the clipboard.
#> [1] "library(magrittr)"
#> [2] "mtcars[, 1:2] %>%"
#> [3] "head(n = 1)"
#> [4] "#' Created on 2018-08-31 by the [reprex"
#> [5] "#' package](http://reprex.tidyverse.org) (v0.2.0)."
Created on 2018-08-31 by the reprex package (v0.2.0).
Do you want me to open a separate issue for this or just include the fix in #199?
- > That's why I assumed the first draft of this would dump contents into a file and open it.

  I think if we're going to offer to open the file for the user once reprex_scrape() exits, we should have that be an option for reprex_rescue() and reprex_invert() as well. More generally, it may make for a more consistent feel across the package if we use the same logic in these "undo" functions that is used at the end of reprex():
https://github.com/tidyverse/reprex/blob/0eec4b6a17acd65ff28daa8f64dd749ac6b4050a/R/reprex.R#L362-L377
For example, I was wondering why reprex_rescue() didn't offer me the option to open an output file while reprex() did.
- I do plan to add docs for reprex_scrape(); I'd just like to wait until it's reviewed before I do so.
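Re: the first item, a guess at the shape of the venue-aware fix; strip_md_indent() and its venue argument are assumptions about reprex's internals, not the actual code at the permalink above:
# Guess at the fix: only SO marks code up with a 4-space indent, so only
# strip it there and leave fenced venues untouched
strip_md_indent <- function(lines, venue = c("gh", "so", "ds", "r")) {
  venue <- match.arg(venue)
  if (venue == "so") {
    sub("^    ", "", lines)
  } else {
    lines
  }
}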
I'm about to leave on a big trip, so I can't review quickly. Feel free to nudge me in a week or so.
I have cleanup I want to do re: the undo functions anyway. I'm not surprised you found something that seems a bit odd.
I have done the cleanup in the undo functions and it did fix the indentation-stripping problem you pointed out (a85849d20298e8524cdca77c26d4f3a56e48d786).
Re: opening a file with the output. Maybe it does make sense to do this (or offer to do this) more often. Agree the behaviour should be as consistent as possible across all the functions.
In reprex(), I think it makes sense to offer when the clipboard is not available (current behaviour) and, perhaps, whenever the outfile argument is non-NULL (would be new).
In the undo functions, yes, perhaps it should just be the default. It is different from reprex(). When you're un-reprexing, you're trying to capture code and get it in front of your eyeballs and fingers. When reprexing, the preview lets you know all is well and it's usually OK to have the result waiting invisibly on the clipboard (if available).
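Pulling those rules together, a condensed sketch; expose_result() and its arguments are made-up names, not reprex's internals:
# Made-up condensation of the hand-off behaviour discussed above
expose_result <- function(lines, outfile = NULL, undo = FALSE) {
  if (clipr::clipr_available()) {
    clipr::write_clip(lines)
    message("Result is on the clipboard.")
  }
  # undo functions: open the file by default, to get the recovered code in
  # front of the user; reprex(): only when the clipboard is unavailable or
  # an outfile was explicitly requested
  if (!is.null(outfile) && (undo || !clipr::clipr_available())) {
    if (interactive()) utils::file.edit(outfile)
  }
  invisible(lines)
}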
OK, I'll incorporate the file-opening behavior you mentioned into #199 once #204 is addressed.
@DavisVaughan basically just 👍 this in weekly meeting
I am guessing the idea is to do something like my reprex below? (This could be included in a GitHub Action (or something similar) on issues to perform automatic triage of reprexes that actually work or not, and put a label on them.)
url <- "https://github.com/tidyverse/dplyr/issues/5849"
issue_json <- gh::gh(sub("https*://github.com", "/repos", url))
issue_text <- unlist(strsplit(issue_json$body, "\r\n"))
(reprex::reprex_invert(input = issue_text))
#> CLIPR_ALLOW has not been set, so clipr will not run interactively
#> [1] "#' This works and gives the expected result:"
#> [2] ""
#> [3] "my_summarise5 <- function(data, mean_var ) {"
#> [4] " data %>% "
#> [5] " mutate(\"mean_{{mean_var}}\" := mean({{ mean_var }}),"
#> [6] " across(last_col(), ~.+1, .names = \"{col}_plusone\"))"
#> [7] "}"
#> [8] ""
#> [9] "mtcars %>% my_summarise5(cyl) %>% head"
#> [10] ""
#> [11] "#' giving:"
#> [12] ""
#> [13] " mpg cyl disp hp drat wt qsec vs am gear carb mean_cyl mean_cyl_plusone"
#> [14] "Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 6.1875 7.1875"
#> [15] "Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 6.1875 7.1875"
#> [16] "Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 6.1875 7.1875"
#> [17] "Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 6.1875 7.1875"
#> [18] "Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 6.1875 7.1875"
#> [19] "Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 6.1875 "
#> [20] ""
#> [21] "#' but it would have been nicer to be able to be able to refer to the columns directly like this:"
#> [22] ""
#> [23] "my_summarise5 <- function(data, mean_var ) {"
#> [24] " data %>% "
#> [25] " mutate(\"mean_{{mean_var}}\" := mean({{ mean_var }}),"
#> [26] " across(\"mean_\"{{mean_var}}\", ~.+1, .names = \"mean_{{mean_var}}_plusone\"))"
#> [27] "}"
#> [28] ""
#> [29] "#' or "
#> [30] ""
#> [31] "my_summarise5 <- function(data, mean_var ) {"
#> [32] " data %>% "
#> [33] " mutate(\"mean_{{mean_var}}\" := mean({{ mean_var }}),"
#> [34] " \"mean_{{mean_var}}_plusone\" := across(\"mean_{{mean_var}}) + 1)"
#> [35] "}"
#> [36] ""
#> [37] "#' neither of which work."
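A sketch of the triage idea from that comment, building on the reprex above; the label names, the crude "did it error?" check, and the exact labels-endpoint call are all assumptions, not tested code:
url <- "https://github.com/tidyverse/dplyr/issues/5849"
path <- sub("https?://github\\.com", "/repos", url)
issue <- gh::gh(path)
code <- reprex::reprex_invert(input = strsplit(issue$body, "\r\n")[[1]])

# Re-render the recovered code; crude pass/fail check based on whether
# any error lines show up in the rendered output
out <- reprex::reprex(input = code, venue = "gh", html_preview = FALSE)
works <- !any(grepl("^#> Error", out))

# Hypothetical label names; POST to the issue's labels endpoint
gh::gh(
  paste0(path, "/labels"),
  .method = "POST",
  labels = list(if (works) "reprex-works" else "reprex-fails")
)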