reprex icon indicating copy to clipboard operation
reprex copied to clipboard

Capture reprex directly from GitHub issue or SO post

Open krlmlr opened this issue 8 years ago • 15 comments

Intended usage: reprex_invert("rstats-db/DBI#153") .

Slightly related: I wouldn't mind if RStudio opened a new file with the results as a side effect.

krlmlr avatar Jan 19 '17 10:01 krlmlr

That's a nice idea!

jennybc avatar Jan 19 '17 13:01 jennybc

Possible name: reprex_scrape(). Do you agree you are most likely to jump dump the browser URL in as input? E.g.

reprex_scrape("https://github.com/hadley/readxl/issues/102#issue-78459154")

jennybc avatar Jan 20 '17 23:01 jennybc

URL should be fine. I can imagine a use case where I get a GH issue number out of thin air, e.g. a test comment, but then I could also call 102 %>% gh_issue_url %>% reprex_scrape .

krlmlr avatar Jan 23 '17 20:01 krlmlr

+1 for opening a new file with the results

harrismcgehee avatar Jan 04 '18 15:01 harrismcgehee

I'd be happy to take a stab at this issue. I have two questions though:

  • Are you OK with reprex having a dependency on rvest? If not, I'll include it as an optional dependency using the approach described by Hadley here.

  • Regarding the API, I was thinking that reprex_scrape() should have two params - url that specifies where to get the reprex from and file specifying where the scraped reprex should be written to. What do you think? Also, I'm assuming that reprex_scrape() should just write the reprex to a .R file and not actually run it?

crew102 avatar Aug 30 '18 17:08 crew102

I would probably do something like this:

  • Tackle GitHub first.
  • Parse the URL and get relevant identifiers. This would use the completely fictional ghurl package or, in the meantime, borrow useful regex-y code from one of the packages listed in the issues. We have implemented this no fewer than 6 times in other packages 😐
  • Use gh to retrieve the issue comment. I.e. use the API vs actual scraping. Consider putting gh in Suggests.
  • Use as much existing reprex logic and interface as possible to dump it to the usual places (clipboard, outfile, return value).
  • Think about whether it is feasible to auto-apply existing "undo" functions. Or let the user to do that next, maybe with a reminder of which one to use?

jennybc avatar Aug 30 '18 22:08 jennybc

Use as much existing reprex logic and interface as possible to dump it to the usual places (clipboard, outfile, return value).

So just to be clear, reprex_scrape() would be pulling the reprex code from GitHub/SO and running that code as a reprex straight away (i.e., it wouldn't simply write the reprex code to a file)? If that's the case, then I think it would be most natural to just consider GitHub issue/SO post URLs to be a special type of input for reprex() (e.g., reprex(input = "https://github.com/tidyverse/reprex/issues/67")), instead of having a seperate function to render reprexes from these sources.

I like the idea of using gh instead of scraping.

crew102 avatar Aug 31 '18 15:08 crew102

I suspect it might be tricky to isolate the reprex from, say, the entire issue comment and "just" run it. That's why I assumed the first draft of this would dump contents into a file and open it. Seems unavoidable that you'll need reprex_invert() or similar.

That's going to be an intermediate step anyway, so I guess you'll find out 😄

Agree that your suggestion of accepting a URL as input is very slick. I just don't know if it can be made to work.

jennybc avatar Aug 31 '18 15:08 jennybc

#199 has a first pass at implementing reprex_scrap() for GitHub issues. A few comments/questions:

  1. I noticed a teeny issue where reprex_undo() strips sequences of 4 leading spaces from marked up code regardless of the venue, when it should probably only be doing that when the venue is SO:

https://github.com/tidyverse/reprex/blob/0eec4b6a17acd65ff28daa8f64dd749ac6b4050a/R/reprex-undo.R#L180

…This means that some formatting (e.g., indentation of head() below) will be lost when inverting reprexes from non-SO venues:

library(reprex)

reprexed <- reprex({
  library(magrittr)
  mtcars[, 1:2] %>%
      head(n = 1)
})
#> Rendering reprex...
#> Rendered reprex is on the clipboard.
print(reprex_invert(reprexed))
#> Clean code is on the clipboard.
#> [1] "library(magrittr)"                                 
#> [2] "mtcars[, 1:2] %>%"                                 
#> [3] "head(n = 1)"                                       
#> [4] "#' Created on 2018-08-31 by the [reprex"           
#> [5] "#' package](http://reprex.tidyverse.org) (v0.2.0)."

Created on 2018-08-31 by the reprex package (v0.2.0).

Do you want me to open a separate issue for this or just include the fix in #199 ?

  1. That's why I assumed the first draft of this would dump contents into a file and open it.

I think if we’re going to offer to open the file for the user once reprex_scrape() exits, we should have that be an option for reprex_rescue() and reprex_invert() as well. More generally, it may make for a more consistent feel across the package if we use the same logic in these "undo" functions that is used at the end of reprex():

https://github.com/tidyverse/reprex/blob/0eec4b6a17acd65ff28daa8f64dd749ac6b4050a/R/reprex.R#L362-L377

For example, I was wondering why reprex_rescue() didn’t offer me the option to open an output file while reprex() did.

  1. I do plan to add docs for reprex_scrape(), I'd just like to wait until it's reviewed before I do so.

crew102 avatar Aug 31 '18 22:08 crew102

I'm about to leave on a big trip, so I can't review quickly. Feel free to nudge me in a week or so.

I have clean up I want to do re: the undo functions anyway. I'm not surprised you found something that seems a bit odd.

jennybc avatar Aug 31 '18 23:08 jennybc

I have done the cleanup in the undo functions and it did fix the indentation-stripping problem you pointed out (a85849d20298e8524cdca77c26d4f3a56e48d786).

jennybc avatar Sep 11 '18 22:09 jennybc

Re: opening a file with the output. Maybe it does make sense to do this (or offer to do this) more often. Agree the behaviour should be as consistent as possible across all the functions.

In reprex(), I think makes sense to offer when clipboard not available (current behaviour) and, perhaps, whenever the outfile argument is non-NULL (would be new).

In the undo functions, yes perhaps it should just be the default. It is different from reprex(). When you're un-reprexing, you're trying to capture code and get it in front of your eyeballs and fingers. When reprexing, the preview lets you know all is well and it's usually OK to have the result waiting invisibly on the clipboard (if available).

jennybc avatar Sep 11 '18 23:09 jennybc

OK, I'll incorporate the file opening behavior you mentioned into #199 once #204 is addressed

crew102 avatar Sep 13 '18 13:09 crew102

@DavisVaughan basically just 👍 this in weekly meeting

jennybc avatar Feb 08 '21 19:02 jennybc

I am guessing the idea is to do something like my reprex below? (This could be included in a github action (or something similar) on issues to perform automatic triage of reprexes that actual work or not, to put a label on them)

url <- "https://github.com/tidyverse/dplyr/issues/5849"

issue_json <- gh::gh(sub("https*://github.com", "/repos", url))
  
issue_text <- unlist(strsplit(issue_json$body, "\r\n"))
  
(reprex::reprex_invert(input = issue_text))
#> CLIPR_ALLOW has not been set, so clipr will not run interactively
#>  [1] "#' This works and gives the expected result:"                                                     
#>  [2] ""                                                                                                 
#>  [3] "my_summarise5 <- function(data, mean_var ) {"                                                     
#>  [4] "  data %>% "                                                                                      
#>  [5] "    mutate(\"mean_{{mean_var}}\" := mean({{ mean_var }}),"                                        
#>  [6] "    across(last_col(), ~.+1, .names = \"{col}_plusone\"))"                                        
#>  [7] "}"                                                                                                
#>  [8] ""                                                                                                 
#>  [9] "mtcars %>% my_summarise5(cyl) %>% head"                                                           
#> [10] ""                                                                                                 
#> [11] "#' giving:"                                                                                       
#> [12] ""                                                                                                 
#> [13] "                   mpg cyl disp  hp drat    wt  qsec vs am gear carb mean_cyl mean_cyl_plusone"   
#> [14] "Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4   6.1875           7.1875"   
#> [15] "Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4   6.1875           7.1875"   
#> [16] "Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1   6.1875           7.1875"   
#> [17] "Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1   6.1875           7.1875"   
#> [18] "Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2   6.1875           7.1875"   
#> [19] "Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1   6.1875       "             
#> [20] ""                                                                                                 
#> [21] "#' but it would have been nicer to be able to be able to refer to the columns directly like this:"
#> [22] ""                                                                                                 
#> [23] "my_summarise5 <- function(data, mean_var ) {"                                                     
#> [24] "  data %>% "                                                                                      
#> [25] "    mutate(\"mean_{{mean_var}}\" := mean({{ mean_var }}),"                                        
#> [26] "    across(\"mean_\"{{mean_var}}\", ~.+1, .names = \"mean_{{mean_var}}_plusone\"))"               
#> [27] "}"                                                                                                
#> [28] ""                                                                                                 
#> [29] "#' or "                                                                                           
#> [30] ""                                                                                                 
#> [31] "my_summarise5 <- function(data, mean_var ) {"                                                     
#> [32] "  data %>% "                                                                                      
#> [33] "    mutate(\"mean_{{mean_var}}\" := mean({{ mean_var }}),"                                        
#> [34] "     \"mean_{{mean_var}}_plusone\" :=  across(\"mean_{{mean_var}}) + 1)"                          
#> [35] "}"                                                                                                
#> [36] ""                                                                                                 
#> [37] "#' neither of which work."

mcanouil avatar Apr 11 '21 14:04 mcanouil