knitr icon indicating copy to clipboard operation
knitr copied to clipboard

Make stringr / stringi a soft dependency

Open HughParsonage opened this issue 7 years ago • 5 comments
trafficstars

knitr's dependency graph includes evaluate and stringr, and thus stringi. Package stringi is a heavy dependency: it takes around a minute to install from source (or several on OSX or Windows) -- a typical use-case for things like Travis-CI. Since knitr (to its immense credit) is required for testing by an enormous number of packages, shaving a minute from each test could have a large payoff. (I also use Travis-CI to deploy knitr+LaTeX documents and installing stringi is a significant component of the build time.)

I'd propose to move stringr to Suggests both here and in evaluate. I'd keep the same verbs as stringr uses, but switch to base R if stringi is not installed. See https://github.com/HughParsonage/TeXCheckR/blob/master/R/suggested_speedups.R for an example of softening the dependencies for readr and stringi.

A glance at the parts of the source currently importing stringr makes me think this would not be too hard. I'd be happy to do the work, but your contribution guidelines say to raise an issue before starting work on something like this.

HughParsonage avatar May 24 '18 00:05 HughParsonage

Thanks for asking first! I'd love to get rid of the stringr/stringi dependencies if possible. Installing stringi also bothers myself a bit. Please feel free to work on it and I'll be happy to review the PR. Thanks!

yihui avatar May 24 '18 00:05 yihui

Hi guys, any progress on that issue? knitr is so crucial package that it should be as lightweight as possible. stringr is definitely first candidate to get rid off from mandatory deps.

jangorecki avatar Dec 01 '20 08:12 jangorecki

@jangorecki Sorry, but no yet. As I said above, I'd love to get rid of the stringr dependency. @HughParsonage has kindly contributed a big PR #1549 that I haven't had time to digest yet. I feel it may be easier to achieve the goal if we replace one stringr function at one step (i.e., replace each function in a separate PR). We could start with the easier ones, and decide what to do with the hard bones later.

yihui avatar Jan 22 '21 20:01 yihui

For later reference, knitr is currently using 13 functions for stringr that would need to be rewrite without stringr.

library(dplyr)
itdepends::dep_usage_pkg("knitr") %>% filter(pkg == "stringr") %>% distinct()
#> Warning: `as.tibble()` is deprecated as of tibble 2.0.0.
#> Please use `as_tibble()` instead.
#> The signature and semantics have changed, see `?as_tibble`.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_warnings()` to see where this warning was generated.
#> # A tibble: 13 x 2
#>    pkg     fun            
#>    <chr>   <chr>          
#>  1 stringr str_detect     
#>  2 stringr str_count      
#>  3 stringr str_locate     
#>  4 stringr str_sub        
#>  5 stringr str_locate_all 
#>  6 stringr str_extract_all
#>  7 stringr str_wrap       
#>  8 stringr str_trim       
#>  9 stringr str_pad        
#> 10 stringr str_dup        
#> 11 stringr str_split      
#> 12 stringr str_match_all  
#> 13 stringr str_replace_all

One thing to watch out for is the performance if we remove stringi as we don't want to lose performance.

cderv avatar Jan 26 '21 15:01 cderv

For reference, this could be of interest https://github.com/hadley/stringb

cderv avatar Apr 20 '21 22:04 cderv

We have finally finished removing stringr from the dependency. We will do more testing and also benchmarking later. Everyone is welcome to help us test and benchmark the dev version! Thanks!

P.S. The benchmark should be done against knitr v1.40 but we did the stringr removal in several commits since September (9c92eff1f22b1e6c3bf5dc538dd8c0a2edc8ad48 8fa7d1710bae72f9636ef64fa9286669329f1a9c 1ce82862b689338ce275269056ebfcc96eb975ca 67b973dc87c9d479cd5fbd76806e3a91f5e7d18c 5a2cb72f63aa64b757418df99fc2ad3ef5292ac0 1a0f2cc269c6a7f86f10b77d12229a80c14d6ee6 cc3b92a9f1c591a3424383e691349fdb4588d328 b1ce0c7ae38f2539980ae72adabe7cdba4393b3a), so benchmarking can be tricky.

yihui avatar Dec 22 '22 23:12 yihui

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue by following the issue guide (https://yihui.org/issue/), and link to this old issue if necessary.

github-actions[bot] avatar Jul 05 '23 05:07 github-actions[bot]