Yuvi
@ryanlovett it pulls in all imports, transitive or otherwise. @balajialg Are you talking about 'installation' which happens only once in the image when we build it, or 'use' which is...
A pruning process would look like:

1. Look at https://github.com/berkeley-dsep-infra/datahub/blob/staging/deployments/datahub/images/default/requirements.txt
2. Consider bunches of packages installed for specific classes
3. Investigate if they have been used *at all*. non-zero use...
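A minimal sketch of that pruning pass, assuming the usage data is already available as a plain list of package names that saw any use. The `used_names` input and both function names are hypothetical, and mapping import names back to distribution names (e.g. `sklearn` to `scikit-learn`) is left out.

```python
import re


def normalize(name):
    # Compare names case-insensitively, treating "-", "_" and "." alike.
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_requirement_names(text):
    """Return normalized package names from requirements.txt-style text."""
    names = []
    for raw in text.splitlines():
        # Skip URL/VCS requirements; this sketch does not try to name them.
        if "://" in raw:
            continue
        line = raw.split("#", 1)[0].strip()  # drop comments
        # Skip blanks and option lines such as "-r other.txt".
        if not line or line.startswith("-"):
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(normalize(match.group(0)))
    return names


def unused_candidates(requirements_text, used_names):
    """Packages listed in the requirements with no recorded use at all."""
    used = {normalize(n) for n in used_names}
    return [n for n in parse_requirement_names(requirements_text) if n not in used]
```

Anything this returns is only a candidate for removal: it still needs a human check against the class it was installed for.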
> Given the context that almost all the installed packages were used at least once in the past 6 months across all the hubs

I don't think this is true...
Basically, Python packages should be installed with conda via environment.yml if they exist on conda-forge, and with pip otherwise.
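As a rough sketch of that rule: split a package list into conda and pip entries and render an environment.yml. The `on_conda_forge` set is an assumption supplied by the caller (this does not query conda-forge), and the function name is hypothetical.

```python
def render_environment_yml(packages, on_conda_forge):
    """Render environment.yml text: conda-forge where available, pip otherwise."""
    conda = [p for p in packages if p in on_conda_forge]
    pip_only = [p for p in packages if p not in on_conda_forge]

    lines = ["channels:", "  - conda-forge", "dependencies:"]
    lines += [f"  - {p}" for p in conda]
    if pip_only:
        # pip itself must be a conda dependency before the nested pip: list.
        lines.append("  - pip")
        lines.append("  - pip:")
        lines += [f"      - {p}" for p in pip_only]
    return "\n".join(lines) + "\n"
```

For example, `render_environment_yml(["numpy", "foo"], {"numpy"})` puts `numpy` under `dependencies:` and `foo` under the nested `pip:` list.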
@agoose77 Most of the R community I know of would like to use `install.packages` or `devtools` to install packages and manage them from CRAN, and I don't want to redirect...
@agoose77 most of our R users use R via RStudio, so conda and Jupyter kernels are completely uninvolved there.
> Both Python and R are installed in the same environment

Ah, so they're installed in the same Docker image, but R doesn't know anything about conda at all, so...
@agoose77 ah, ok - I'll consider that :) I'm somewhat reluctant to use conda for R, as I feel the general R community is much more focused on CRAN...
I agree that removing libraries would be killer. I just discovered for example that tensorflow doesn't even import on latest datahub, and could've been removed as nobody has so far...
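A cheap way to catch cases like this is an import smoke test run inside the image. This is only a sketch: the function name is hypothetical, and the list of module names to check is assumed to be maintained by hand.

```python
import importlib


def failing_imports(module_names):
    """Try importing each module; return {name: error summary} for failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # broad on purpose: any import-time error counts
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures
```

Calling `failing_imports(["tensorflow", "numpy"])` in the built image would list any module that no longer imports, along with the error it raised.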
@balajialg great! For a February launch, I think the work should go into modifying https://github.com/berkeley-dsep-infra/datahub-homepage to fit the use case (there are docs on how to test it locally there)....