Scoop
Scoop copied to clipboard
Discussion: Persistent data, update improvements
I want to start discussion about enhancing the update functionality solving multiple issues
Persistent data
- Folders per Version #126
- PHP configuration directory along versions leads to problems #1185
- I think npm global packages should be installed in it's own directory #1220
Multiple solutions were suggested for this. Travelling directories moves data between versions, already partially implemented (not working).
A backpack
/data
/conf
directory alongside the apps
directory.
Travelling directories should be easy to implement but has very big problem. When an update fails you are required to uninstall the app and install it again, if that happens you will lose your data. (See last section)
Using a custom data directory may not be possible for all apps, the app must support changing the default directories. Could that work with conjunctions/symlinks?
We may have to implement both. Any comments or ideas?
TODO
- [x] Implement
persist
- [x] Update
schema.json
- [x] Update manifest documentation
- [x] Make data directory available as variable (
post_install
,env_set
etc.)- [x] Installation
- [x] Update, does not share the same variables. Everything has to be done in two places, I try to merge them (install vs. update)
- [x] Wait for feedback on f2559938024e514489aec10f495b651b34b96c81
- [x] Test install/updates/global
- [x] Implement flag on uninstall to purge all persisted data TODO: documentation
- [x] Remove
travel_dir
partial implementation
To consider
- [ ] Persist data from old version on update with adding persist data
- [x] Migrate old php config directory
Affected apps: I would like to create a list of affected apps in this issue
- [x] mysql, database content
- [x] mariadb, database content
- [x] nodejs, globally installed packages
- [x] nginx, configuration, html files and logs
- [x] ruby, gems
- [x] python, pip packages
- [x] php, configuration
- [ ] rethinkdb, database content
- [ ] couchdb, database content
- this list is incomplete
Feature branch https://github.com/rrelmy/scoop/tree/persist
Development/testing problematic
Testing those update paths is currently not easy.
Installing updated manifests with scoop install bucket/app.json
does not take the same code path as scoop update app
.
Using scoop update bucket/app.json
does not work.
Prevent such updates through scoop install
and make them work with scoop update
should be pretty straightforward.
What do you think about that?
Resolved
Just us ./bin/scoop.sh1 update <app>
Update failuers
I never liked the fact that failures during updates force you to uninstall the app and reinstall it completely. While generally looking at the update procedure I would like to check if there are some possible improvements.
For example try to download the update before uninstalling the current one would catch some basic errors. Or if possible remove the shims just before the new ones are set.
Something for the List
- nginx, configuration
Testing updates is quite easy, if they are in the main bucket.
Just make your changes and run .\bin\scoop.ps1 install|update <updatedappname>
from the repo directory.
If the changes are made in the extras bucket, you can add your local repository as a bucket with scoop bucket add X:\scoop-extras extras-testing
Some programs provide a default configuration (php, nginx) which should be copied to the conf
directory (or whatever it's called later) on installation. My idea was to copy these configuration files to the conf
directory and ask if they should be overwritten, left like they are, or create new files with a suffix like dpkg
does it on linux (.dpkg-dist
, .dpkg-old
).
I have to clarifya bit more, implementing travelling dirs would take place in the scoop update command. But testing modified manifests would take the install path, ignoring the already installed version. Or am I missing something?
Directory junctions would work well for persistent data directories. This way apps don't have to be configured to use a special directory, they can just use their default.
If we have a JSON manifest for NodeJS, e.g.
{
...
"persist": "nodejs/node_modules"
...
}
Then on install, we could copy anything in $installdir\nodejs\node_modules
to $scoopdir\persist\nodejs\7.7.1\nodejs\node_modules
, and replace $installdir\nodejs\node_modules
with a directory junction to $scoopdir\persist\nodejs\7.7.1\nodejs\node_modules
.
So the update steps would be:
- Create the new "persist" directory at
$scoopdir\persist\nodejs\$version\nodejs\node_modules
. - If the installed directory (
$installdir\nodejs\node_modules
) isn't empty (does NodeJS come with npm in node_modules?), copy it to the new persist directory. - If there is a "persist" directory from the previous version with the same name, move the contents of the old "persist" directory to the new "persist" directory, excluding any files or directories that exist in the new directory. It looks like
robocopy
might be able to do this with/MOVE /XN
, which would be much quicker than doing it by examining directory contents with PowerShell. We would just need to be sure that it stops at a directory that already exists, and doesn't copy inner files that don't exist. - Create the directory junction
So the old "persist" directory would end up with only the files that have been replaced in the new "persist" directory, and the new "persist" directory would have all the old contents, plus anything that came with the new install.
To make it resilient to failure, we could allow for reversing this process by doing the reverse robocopy
, deleting the new "persist" directory, and updating the directory junction.
For testing updates, does scoop update path\to\app.json
work? There is a call to locate in install_app
that should get work out that this is updating from the local path.
.\bin\scoop.ps1 update <app>
from the repo also works.
@lukesampson I like your persist idea. It would be nice if it could be a file as well as a directory. So something like this would work:
{
// ...
"persist": "php.ini"
// ...
}
And it should also support multiple files/directories through an array
Agreed @deevus 👍
You could hash persist
files to see if they've been changed. So for example:
- I run
scoop install php
-
php.ini
is copied topath/to/persist/php.ini
- We also store an md5 of the file in
path/to/persist/.hashes/php.ini.md5
Then if I run scoop update php
- We generate a hash of the new
php.ini
file - Comparing it to
path/to/persist/.hashes/php.ini.md5
, we can determine if- The user has changed the persisted file and the new file has been updated (show a warning)
- The user has not changed the persisted file and the new file has been updated (overwrite the file)
- The new file has not changed (do nothing)
Going even further
You could even go as far as storing the original file so that if/when the user updates a persisted file, we can compare the original vs new and show a diff.
That would be very nice to have, @deevus. Could we use git diff --no-index path/to/persist/php.ini path/to/install/php.ini
to diff?
Thanks for the inputs
Persistent data
Using conjunctions sounds great. I tested it with the mariadb database content and it works great!
There is a config option NO_JUNCTIONS
, it was implemented because of #1202
I am not sure if this is a general problem we will run into with more apps.
What should if someone activated this option? Should we ignore it?
As persist
reference I would suggest following possibilities
{
"persist": "nodejs/node_modules", # just one directory
"persist": "some_file.ini", # a files
"persist": ["some_dir", "some_other"], # multiple things
"persist": [ # custom name (like with `bin`)
["nodejs/node_modules", "customname"],
"other"
]
}
We have already some code for travelling directories, can I remove it?
Updating copied files
Nice idea, I will keep that in mind but will not focus on that for the first version. This should be configurable per directory/file. We could ask the user if he wants to overwrite his config file if he has modifications in it (like apt-get on Debian/Ubuntu).
php.ini
should not be edited by the user. I documented how to customise the php configuration.
For some other apps it would be very useful to provide a custom configuration through scoop. As example nginx, the default configuration is not very useful without modifications. Replacing the parts of it with includes pointing at the presistent data store would be a easy solution Many Linux distributions do that, example debian for nginx original / debian version.
Unfortunately conjunctions are causing problems, not every program likes them 😞
As example nodejs breaks, fails at npm update -g
scoop persist *$% = .\bin\scoop.ps1 install nodejs
Loading node-v7.7.3-x64.msi from cache
Checking hash of node-v7.7.3-x64.msi... ok.
Extracting... done.
Linking ~\scoop\apps\nodejs\current => ~\scoop\apps\nodejs\7.7.3
Persisting data...
npm ERR! Windows_NT 10.0.14393
npm ERR! argv "C:\\Users\\remy\\scoop\\apps\\nodejs\\current\\nodejs\\node.exe" "C:\\Users\\remy\\scoop\\apps\\nodejs\\current\\nodejs\\node_modules\\npm\\bin\\npm-cli.js" "update" "-g"
npm ERR! node v7.7.3
npm ERR! npm v4.1.2
npm ERR! path C:\Users\remy\scoop\apps\nodejs\current\nodejs\node_modules\.staging\abbrev-35f81b38
npm ERR! code ENOENT
npm ERR! errno -4058
npm ERR! syscall rename
npm ERR! enoent ENOENT: no such file or directory, rename 'C:\Users\remy\scoop\apps\nodejs\current\nodejs\node_modules\.staging\abbrev-35f81b38' -> 'C:\Users\remy\scoop\apps\nodejs\current\nodejs\node_modules\npm\node_modules\abbrev'
npm ERR! enoent ENOENT: no such file or directory, rename 'C:\Users\remy\scoop\apps\nodejs\current\nodejs\node_modules\.staging\abbrev-35f81b38' -> 'C:\Users\remy\scoop\apps\nodejs\current\nodejs\node_modules\npm\node_modules\abbrev'
npm ERR! enoent This is most likely not a problem with npm itself
npm ERR! enoent and is related to npm not being able to find a file.
npm ERR! enoent
npm ERR! Please include the following file with any support request:
npm ERR! C:\projects\scoop\npm-debug.log
That's great that it works for MariaDB.
I wonder if this error is not so much because of the directory junction, but because there are already files in the node_modules directory when npm update
runs?
There seem to be similar rename
issues reported in this NPM issue thread, where people are suggesting running npm cache clean
, disabling anti-virus, closing other programs that might have file locks, disabling sharing (and then running npm cache clean
).
That's fine to delete the travelling dir code too.
That seems to be a different problem. npm cache clean
does not affect the problems.
As soon I use a conjunction for the node_modules directory strange things start to happen. Tested with fresh installs of nodejs.
If I uninstall a global module (npm uninstall -g glup-cli
, has to be installed first) the node_modules conjunction link gets deleted but not the gulp-cli package inside the persistet location.
If I update the global npm package npm update -g npm
the node_modules directory gets replaced with an empty node_modules/npm/node_modules directory.
I never had such problems with nodejs/npm without a conjunction. No idea what happens 😢
Oh okay. That's interesting. And disappointing.
It would be a shame if we have to give up on junctions and go back to something like the old travelling directories.
I don't know enough about junctions to give up yet. I'll do some investigation over the weekend.
Yes, that would be sad. I didn't test, but I think we will have the similar problems with elasticsearch as of #1202
Me neither, but I heard on Linux some programs have problems too when you run them inside symlinks, but it can be fixed with hardlinks or directory mounts.
I tested symlinks with admin permissions too, same errors. There is a very similar error but for macOS #8165
To reproduce you can either run npm update -g
, post_install
does it for you.
Or if you remove that npm install -g gulp-cli
and then npm uninstall -g gulp-cli
I think I might have stumbled on a possible way to make junctions behave with NPM.
Can you have a look at 4382206cfe5a83f82bce51ccdfb480e4774c4a34 and tell me if I'm tricking myself? With this change, NodeJS seems to install properly including the npm update -g
in post_install
, and you can run npm install -g gulp-cli
without errors.
I changed a few lines to get it to work for me, but the key line is
attrib $source +R /L
This sets the read-only attribute on the directory junction and seems to make it behave better with NPM. I'm not really sure why it works. I only found this by accident by unticking the Read-only checkbox in the folder (directory junction) properties with Explorer, which seemed to remove all the file attributes. And then I found this page which pointed me towards attrib [...] +R /L
.
Does this work for you?
It works! Thanks 👍
I had to changed the npm prefix to install the binary links into the data directory else they would not survive an update. 912fbe896761e3955df3b9d1a69bb6d2c2c22c46
Better ideas on how to add paths inside the data directory to PATH?
Has anyone experience with python and pip? I can't figure out what needs to be persistent.
@lukesampson I merged your commit and rebased my persist branch upon the current master branch. You are not longer able to merge your persist branch. If you want to make changes use my new persist branch.
Thanks, I've removed my persist branch now.
I don't know if this is a better alternative to what you have, but would it work to add ~/scoop/apps/nodejs/current/nodejs/node_modules
to the path? That would be the same path across updates, and might simplify things (if it works). I suppose the current
junction isn't created at that point in the install, but you can assume it will be as long as the NO_JUNCTIONS
config option isn't set.
If that works, it is probably still a good idea to use variable replacement in the env_add_path
property of the manifest to enable this, e.g. "env_add_path": "$currentdir\\nodejs\\node_modules"
would use the current
path and "env_add_path": "nodejs\\node_modules"
would use the actual version-specific install directory.
Sorry to nitpick while you're still working, but I noticed varying terms for the persisted directories and the variables and functions that reference them, like persist, data and appdata. Could these be consistent, e.g. persist
for the manifest and the root ~/scoop/persist
directory, $persist_link
and $persist_target
for the directory junction and actual directory? I think persist
is nice because it's generic enough to cover data and configuration and it reminds people of the intent and behaviour of these directories as well.
That won't work. The scripts are outside of the node_modules directory. With the default config they would be at the same level as the node.exe
.
I just realised we do not need the node_modules junction, we just need the npm module. 🤔
It would be enough to move node_modules
, npm
and npm.cmd
over and add the data directory to the PATH
. npm determines the install path from the npmrc
file which we change inside post_install
.
I think it would be useful if persist
can not only deal with junctions but also easy move and copy actions.
What do you think?
env_add_path
is currently relative to the $currentdir
I worked around that by checking if there is a $
in the value and threat it absolute and replace the variables.
No problem, I will change that.
I changed data
to persist
36e61b54f920e18868eaee1145ce6e11ee16e8b5
I don't know what you mean with $persist_link
, the $persist_dir
variable is the only introduced variable outside the persist_*
functions.
Thanks. Looks good! Yes, ignore the $persist_link/$persist_target
comment, I thought there might be multiple $persist
variables but you have it sorted.
Regarding env_add_path
, would it be worthwhile and possible to make ~/scoop/persist/[app]/[persisted]
fairly invisible to users so they just see environment variables pointing to ~/scoop/[app]/current/[persisted]
?
What sort of move and copy actions did you have in mind for persist
?
I am not sure if I understand you at the env_add_path
point.
Currently I am using a workaround/hack to add paths to the persist dir for nodejs and ruby. If there is a variable inside the path I treat it absolute and resolve the variables.
Maybe using traversal like ../../persist/$app/bin
might work too but seems more awkward.
I think if this one is sorted out we could merge the first version.
As example in the case of nodejs we initialize the global modules inside the persist directory and set a config variable to use that directory instead of a location inside the version directory. That makes the files inside the version directory obsolete (npm, npm.cmd and node_modules).
I think it would be cleaner and simpler to just move those files in that case.
Inside the manifest we could use a more future proof definition like "persist": [{"source": "nodejs\npm.cmd", "target": "npm.cmd", "action": "move"}]
. action
could be link
(default) or move
. I can not think of a use case for copy
.
With the optional object notation we could implement more complex stuff if needed
Hi, as persistent data now merged, how about adding run
or start
command?
so if we want to run redis or mysql server with a persistent configuration just hit
scoop start redis
# or
scoop run mysql
@nueko I think that might be outside Scoop's scope (it's an installer, not a service manager). But if you want to start a separate issue we could discuss what might be involved.
Currently, a file that doesn't exist in the apps directory after installation can't be made persistent. It will create a directory named like the nonexistent file and link it to the apps directory. Some programs create their config file on the first startup. Like smplayer (See: https://github.com/lukesampson/scoop-extras/pull/429)
How should we deal with these cases?
Ideally we would not try to persist the file on the first install if the file doesn't exist, but detect this situation on update
, copy the file from the previous version of the app and persist it then.
That's fairly complicated. If no one wants to implement it, I wouldn't mind leaving this situation as unsupported for now until we discover more examples. Maybe we could show a warning if the manifest tries to persist a file that doesn't exist, and stop the directory from being created incorrectly.
Running into this same issue with keepassxc
- keepassxc.ini
is not created until first run - failing to persist.
Reference issue: https://github.com/lukesampson/scoop-extras/issues/614
persist per architecture would be great too