syncthing
syncthing copied to clipboard
Display local ignored files feature.
Purpose
As a user I wish to see which local files are ignored by Syncthing in my receive-only folder.
A broader view of the matter will be clarified by following this draft PR: https://github.com/syncthing/syncthing/pull/9130
Testing
Edit folder ignore patterns and make sure items appears in the folder details. Other folders than receive-only should not be affected.
Screenshots
Documentation
https://github.com/syncthing/docs/pull/824
Great, very useful addition! Why limit it to receive-only folders though? The information is definitely useful for any folder type except receive-encrypted I suppose.
This looks great! As @acolomb has already mentioned, is there any particular reason for limiting this to a specific folder type? This would be useful in all cases, I think.
I would like to suggest using a different icon though, as the exclamation mark indicates that something is wrong, while this isn't really the case here.
fa-ban ?
Why limit it to receive-only folders though?
There is no specific reason to limit the feature for receive-only folders apart of more potential bugs and more testing.
Feature opened for send-only, send&receive and receive-only folders now.
Can we have a discussion about what this is supposed to do and what use case it fulfils? It looks to me like it implements a surprising definition of "locally ignored" and I'm not certain if I understand or misunderstand the purpose.
Can we have a discussion about what this is supposed to do and what use case it fulfils? It looks to me like it implements a surprising definition of "locally ignored" and I'm not certain if I understand or misunderstand the purpose.
In context of this PR it means "Locally existing files which are ignored".
Purpose: This PR is supposed to expose number and list of files which physically exist on local filesystem and which are ignored by patterns. Originally I aimed to receive-only folders, but I was asked to extend features for send folders as well if possible.
Why I need this feature ? This PR is intermediate step to achieve https://github.com/syncthing/syncthing/pull/9139 which extends this branch/PR by "Remove Ignored" button allowing user to delete all ignored files from local filesystem without affecting global state of the file and other devices.
Ultimate target is summarized here: https://github.com/syncthing/syncthing/pull/9130 (just note the code of https://github.com/syncthing/syncthing/pull/9130 is PoC quality)
In context of this PR it means "Locally existing files which are ignored".
That's not really what you're doing here, though. You're basing this on files in the database with the ignored flag, and there are roughly two ways a file ends up in the database with an ignored flag:
- The file was previously not ignored, so we've scanned and processed it and it's already in the database and announced to other devices. We can't undo that, so we update it with an ignore flag and send an update to that effect.
- Some other device announces a file that matches our ignore patterns. We tag it with the ignored bit and send an update that we've seen it.
Notably not present in the above list are all the other, "normal" ignored files, so as designed this will return a small arbitrary subset of actually on-disk ignored files.
I can see the utility in being able to ask for which files are present and ignored, it's come up before. However, I think the proper way to do that is to walk the filesystem, running the ignore matcher, and noting which files match the patterns.
(It's not obvious to me that this needs to be part of Syncthing proper as opposed to a separate small utility, either. With the stated end goal of deleting all the files, which is also something I'm not sure we should do, it makes even more sense to me to have this be a separate utility. stfindignored --delete-all --force ~/some/folder)
That's not really what you're doing here, though. You're basing this on files in the database with the
ignoredflag,
See protocol.go and follow protocol.FlagLocalIgnoredExisting please. I mean I'm not using ignore flag, but own.
Yes, you've added a flag to indicate whether the ignored file exists on disk locally. This doesn't change any of my reasoning above, since that is about whether the file exists in the database to begin with, regardless of flags. (It's also incorrect and/or unnecessary since the "exists on disk" part is better figured out by looking on disk.)
Yes, you've added a flag to indicate whether the ignored file exists on disk locally. This doesn't change any of my reasoning above, since that is about whether the file exists in the database to begin with, regardless of flags. (It's also incorrect and/or unnecessary since the "exists on disk" part is better figured out by looking on disk.)
Ok, I understand point is that files added locally matching ignore patterns will be missing in my list. Then it makes sense to extend scan (walkAndHashFiles) related functions to add these file to local tree with appropriate flag(s) and without hashing to reduce performance impact.
The file was previously not ignored, so we've scanned and processed it and it's already in the database and announced to other devices. We can't undo that, so we update it with an ignore flag and send an update to that effect.
If the file is ignored, and added just locally, it should not be announced to other devices of course, but still can be listed in local database without being hashed. If such file would be removed from the filesystem later, it can be simply removed from local database since it is not published anywhere, instead marking it deleted or ignored. Does this make sense ?
I can see the utility in being able to ask for which files are present and ignored, it's come up before. However, I think the proper way to do that is to walk the filesystem, running the ignore matcher, and noting which files match the patterns.
(It's not obvious to me that this needs to be part of Syncthing proper as opposed to a separate small utility, either. With the stated end goal of deleting all the files, which is also something I'm not sure we should do, it makes even more sense to me to have this be a separate utility. stfindignored --delete-all --force ~/some/folder)
I'm trying to avoid external tool since it diverts from my target "selective downloads for receive-only folder". Also doesn't make sense to scan the filesystem once by syncthing and secondly by another tool. As "advanced" user I would like to see which files will be removed before I click it. At least it would be more safe when user enters the ignore patterns by hand and can easily make a typo or other stupid mistake.
My main question before I invest more time in this is:
Is there a real chance that solution described above will be accepted ?
Even ultimate target would be new type of folder "receiveonly-selective" adding automatic file removal according to patterns. I believe such feature, allowing comfortably download music or pictures from cloud to a device, will satisfy many users asking for selective sync.
btw. I'm sorry if I'm slower in understanding. I'm just getting familiar with the codebase and syncthing complexity. :-)
Is there a real chance that solution described above will be accepted ?
Thank you for this summary. I do think the feature would be extremely valuable and useful. Just have to agree with @calmh that a simple "scan the filesystem, build a list of names matching the configured ignores, in memory" would be a more realistic approach. The database can get quite big (1 - space) and busy (2 - I/O bandwith), and looking up items there probably gets slower (3 - speed) with more records stored. That's three rather sensitive areas where a user who explicitly stated that some (possibly large) portion of the folder tree should be ignored (thus, not processed at all).
Whether the list is built and kept in memory during regular scanning, or produced on demand when the API endpoint is requested, is a basic decision to make. Or maybe have both, with a threshold size: Cache the list during regular scans, but discard it again if it exceeds e.g. 10 MB. Then the API relies on the cache when available, otherwise triggers a new scan just for this purpose, without the size limit.
I do like the idea of having it integrated right there in the GUI, however in practice and with a growing number of ignored paths, a CLI tool will actually be more useful for piping to other processes (e.g. for removal). That could very well be a thin wrapper (or curl ... | jq ...) around the API endpoint though.
Even ultimate target would be new type of folder "receiveonly-selective" adding automatic file removal according to patterns. I believe such feature, allowing comfortably download music or pictures from cloud to a device, will satisfy many users asking for selective sync.
I see more problems than possible benefits with adding a new folder type. And in general, automatically deleting anything from within Syncthing, especially when it matches an ignore pattern, is a function I'd rather not see implemented. There's a large potential for catastrophic bugs here (think about our external globbing library for example - BOOM your data is gone after an update). If only made available via an explicit user interaction, it's still a giant footgun, but at least won't eat your data without a directly related, explicit user choice.
We could put a higher hurdle for accidental damage here by providing it as a separate command that is however easily set up with a trigger after a scan-finished event.
extend scan (walkAndHashFiles) related functions to add these file to local tree with appropriate flag(s) and without hashing to reduce performance impact.
I don't think this is a good idea. There are many reasons for ignoring file trees, including that scanning it may cause errors or be expensive (there might be millions of files there).
automatic file removal according to patterns
This has been suggested several times, and it is rejected based on philosophical reasons. We will not do it.
Whether the list is built and kept in memory during regular scanning, or produced on demand when the API endpoint is requested, is a basic decision to make. Or maybe have both, with a threshold size: Cache the list during regular scans, but discard it again if it exceeds e.g. 10 MB. Then the API relies on the cache when available, otherwise triggers a new scan just for this purpose, without the size limit.
I don't think this is a good idea. There are many reasons for ignoring file trees, including that scanning it may cause errors or be expensive (there might be millions of files there).
I think, user can decide himself whether he wishes to enable the feature accepting the performance impact or not. So another checkbox in folder settings or patterns tab should solve huge file trees performance concern.
Another option might be extending the ignore markup by new prefix telling that the ignored item should be scanned and listed for removal. This would add more flexibility to advanced users who wishes ignore some files but not remove all. Typically config files or thumbnails folder. And I should be still able to achieve my goal without affecting current behavior and performance.
There's a large potential for catastrophic bugs here (think about our external globbing library for example - BOOM your data is gone after an update)
This has been suggested several times, and it is rejected based on philosophical reasons. We will not do it.
Makes sense... Delete confirmation overlay should show all affected files so user will see and confirm files/folders to be removed.
I think this PR together with https://github.com/syncthing/syncthing/pull/9139 and https://github.com/syncthing/syncthing/pull/9132 would give a closure to as good as all Selective Sync requests.
As a movie production company that is using Syncthing between all our artists, being able to quickly select what folders to sync and the posibility to remove the non-syncing files/folders that is located in the synced-folder would be a game-changer for us in production speed.
After trying to folow the discussion here with showing all non-syncing local files and the abbility to delete these, this is my thoughts.
What about when viewing all locally ignored files they would be shown in a tree-view like the last image in https://github.com/syncthing/syncthing/pull/9132 but with a delete-button next to each folder and item. That way the user have to manually select what to delete, so no auto-deletion going on, but you can also quickly delete everything within a branch of the tree-view. If you would delete the root branch you would delete all the locally ignored files.
@p0l0us What's your plan here? As pointed out currently you aren't doing what it sounds like you want to do: Track all ignored files that exist in the db. You'd have to scan them to do that. However as pointed out, that's definitely not desirable by default. And I personally doubt it's desirable in general: Besides what calmh said, it also adds complexity to track all this data just for a potential use of showing and deleting ignored files. This isn't information continuously and often used. It could be generated when needed directly from the filesystem.
@imsodin
I really need to know what files are exactly not being syncthed by some kind of list or report. This is because I particularly
do a cross-platform send&receive sync between Android-Linux-Windows-MacOS and some OSs have naming forbbiden characters which are uncompatible, ie, an a:a.txt file on Android/Linux can't be syncthed to Windows as : is an illegal character.
In relation to this, recently I noticed this behaviour of syncthing windows client: all files wich contains the illegal characters: | \ / * " are not being syncronized by default regardless the .stignore rules, this not happens to files with the illegal characters : ? so it´s needed to add the *[:^?]* ignore pattern rule to avoid them. In this case at Windows I'm currently completely blind to know what are the source of the ignored files/directories on the remote shared folders in order to try to fix them manually by my own.
You said this information about the remote ignored files could be generated directly from the filesystem.
So, how could I exactly generate that currently ignored file list? I really need this feature. Thank you very much in advance.