client removes file on server if it could not create VFS placeholder file
How to use GitHub
- Please use the 👍 reaction to show that you are affected by the same issue.
- Please don't comment if you have no relevant information to add. It's just extra noise for everyone subscribed to this issue.
- Subscribe to receive notifications on status change and new comments.
I found some other issues where the client deleted files (like #1433), but my issue looks like a specific VFS problem.
Expected behaviour
client should not remove the server file if creation of the local placeholder file fails (maybe mark the local file as "dirty" or fall back to a full file download - a rough sketch of this idea follows below the log excerpt)
#=#=#=#=# Propagation starts 2021-06-14T13:01:05Z (last step: 368 msec, total: 368 msec)
||InstantUpload/Camera/20200306_234227.jpg|64|2|1583530946|638ef5870efff25b01a8063e9cfc17b7|4046796|00748461occ5uwmhix0f|2|Couldn't create placeholder info|0|4046796|1583530946||
||InstantUpload/Camera/20200314_120134.jpg|64|2|1584180094|9e934f897c5641ba2c4bc21139f7e5ee|9056787|00748459occ5uwmhix0f|2|Couldn't create placeholder
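To make this concrete, here is a minimal sketch of the idea in plain Python (made-up names, obviously not the actual client code): a failed placeholder creation is remembered as "dirty", and only files that were once healthy locally and are really gone from disk ever become server-side deletes.

```python
# Illustrative sketch only (made-up names, not the actual client code): remember
# a failed placeholder creation instead of treating the file as locally deleted.
from dataclasses import dataclass, field

@dataclass
class LocalState:
    synced: set = field(default_factory=set)   # files known to be healthy locally
    dirty: set = field(default_factory=set)    # placeholder creation failed, retry later

def apply_remote_file(path: str, state: LocalState, create_placeholder) -> None:
    """Try to materialise a server file locally; never treat a failure as 'gone'."""
    try:
        create_placeholder(path)
        state.synced.add(path)
        state.dirty.discard(path)
    except OSError:
        # Remember the failure so the next run retries (or falls back to a full
        # download) instead of concluding the file was deleted locally.
        state.dirty.add(path)

def deletions_to_propagate(server_files, state: LocalState, exists_locally):
    """Only delete on the server what was healthy locally before and is really gone now."""
    return [p for p in server_files
            if p in state.synced            # we held it successfully at some point
            and p not in state.dirty        # no pending placeholder problem
            and not exists_locally(p)]      # and it is actually gone from disk
```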
Actual behaviour
For some reason the client could not create placeholder files. I have no idea why it even tried to create placeholder files - the majority of affected files existed and had synced successfully months before (at least the client reported a successful sync). As a result the client folder had no files anymore and a subsequent sync removed all the files on the server side. Fortunately this happened to only a few hundred files and I could recover them from the server trashbin.
But the problem still existed: the client complained it could not create placeholder files, and removed the files again!
At the same time, access to existing placeholder files (with the blue cloud icon) was not possible - the error was "0x8007016A The cloud file provider is not running." This error is often reported for OneDrive, where a computer restart is the recommended solution. After a client restart I can access placeholder files again. I have no clue how to find out if some process or service crashed - the event logs don't say anything.
Steps to reproduce
no idea.
Maybe this is related: shortly before, I added a huge folder with my photo archive (600 GB, >50k files). From my feeling it was synced successfully (with VFS), but it might have introduced some performance problems with the client. The client feels totally overloaded - clicking on settings or the properties of any folder results in minutes of unresponsive UI and the client using a lot of CPU.
The files were initially removed by the client and I recovered them from the server trashbin. The removed files didn't reside in this huge folder but in other ones - InstantUpload and 2-3 others (completely random in my eyes). After this happened 2 times I switched the specific folders to "Always available locally" and the client successfully downloaded all the files.
Client configuration
Client version: Version 3.2.2stable-Win64 (build 20210527)
Operating system: Win 10 1909
OS language: EN
Installation path of client:
Server configuration
Nextcloud version: 21.0.2 (docker/apache)
Storage backend (external storage): mysql
Logs
I have a debug archive and Nextcloud_sync.log from the problematic period, but I'm not willing to upload the complete files due to privacy reasons (debug archives are up to 25 MB with an extracted log of 300 MB). Please advise how to find and extract the relevant parts.
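In the meantime I would probably filter the log myself with something like the following (plain Python; file name, time prefixes and search patterns are just placeholders to adjust):

```python
# Hypothetical helper (file name, time window and patterns are placeholders) to
# trim a large Nextcloud_sync.log down to the interesting lines before sharing.
from pathlib import Path

LOG = Path("Nextcloud_sync.log")
TIME_PREFIXES = ("2021-06-14T10:0", "2021-06-14T13:", "2021-06-14T14:")
PATTERNS = ("Couldn't create placeholder", "Propagation starts")

with open("Nextcloud_sync_excerpt.log", "w", encoding="utf-8") as out:
    for line in LOG.read_text(encoding="utf-8", errors="replace").splitlines():
        if any(p in line for p in PATTERNS) or any(t in line for t in TIME_PREFIXES):
            out.write(line + "\n")
```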
- Client logfile: Since 3.1: Under the "General" settings, you can click on "Create Debug Archive ..." to pick the location where the desktop client will export the logs and the database to a zip file. On previous releases, via the command line: nextcloud --logdebug --logwindow or nextcloud --logdebug --logfile log.txt (See also https://docs.nextcloud.com/desktop/3.0/troubleshooting.html#log-files)
- Web server error log:
- Server logfile: nextcloud log (data/nextcloud.log):
@isdnfan can you upload logs here https://cloud.nextcloud.com/s/botn4JSYfMR83xt
Hi @allexzander, thank you for your time. I uploaded the debug archive nc-vfs-camera.zip and Nextcloud_sync.log. From the latter it looks like the sync cycle at 10:07 failed to create a lot of placeholder files:
#=#=#=#=# Propagation starts 2021-06-14T10:07:20Z (last step: 43256 msec, total: 43256 msec)
||InstantUpload/Camera/20200403_105701.mp4|64|2|1585904238|82ed7969538b101de46ad76200aae59f|28445625|00748486occ5uwmhix0f|2|Couldn't create placeholder info|0|28445625|1585904238||
||InstantUpload/Camera/20200408_175417.jpg|64|2|1586361256|3da2601dcb91e63ed48917809c7a86f1|8062342|00748496occ5uwmhix0f|2|Couldn't create placeholder info|0|8062342|1586361256||
||InstantUpload/Camera/20200408_175418.jpg|64|2|1586361258|efdcc469711e24e715053592fe059068|7272802|00748497occ5uwmhix0f|2|Couldn't create placeholder info|0|7272802|1586361258||
and then something happened at 2021-06-14T10:08:22 - most likely the deletion, but I can't see it easily from this log
10:08:22||InstantUpload/Camera/20200408_175418.jpg|2|1|1586361258|efdcc469711e24e715053592fe059068|7272802|00748497occ5uwmhix0f|4||204|7272802|1586361258|3f7140fa-4854-4ea7-adc6-bf047d1b4c8f|
10:08:22||InstantUpload/Camera/20200408_175417.jpg|2|1|1586361256|3da2601dcb91e63ed48917809c7a86f1|8062342|00748496occ5uwmhix0f|4||204|8062342|1586361256|b7a0e688-eccb-44f2-9f84-7224ce7b57cd|
10:08:22||InstantUpload/Camera/20200403_105701.mp4|2|1|1585904238|82ed7969538b101de46ad76200aae59f|28445625|00748486occ5uwmhix0f|4||204|28445625|1585904238|edb92e1b-5181-4ef7-989d-4cef229d748f|
In total the problem happened 2 times on this day: at 10:xx, and again about an hour after I restored the files around 13:00, at ~14:xx.
@isdnfan Thank you. We have received your logs.
Are you able to create files manually in the location mentioned in the logs? Can those files be synced if you disable the VFS mode and use the same location for your local sync folder?
As I stated above:
After this happened 2 times I switched the specific folders to "Always available locally" and the client successfully downloaded all the files.
But I think there was some general problem with VFS - I was unable to access non-hydrated files at this time (error "0x8007016A The cloud file provider is not running."). After a reboot, access to the non-hydrated files worked again.
Now I switched the folder back to "free up local space" and the files were replaced by placeholders within seconds.
UPDATE: #3452 and #3447 look related to me. I could imagine that I killed VFS by sending the client to sleep, or maybe I even killed the client when it was unresponsive for a long time.
I had a similar but not exactly the same error today:
- I had my whole Nextcloud synced to the hard drive
- I enabled virtual files for two folders containing a few thousand files
- Nextcloud Desktop gave me many (many many!) warnings:
- To stop the obviously not working process, I deactivated virtual files
- Instead of now downloading the original files again, Nextcloud Desktop started deleting about 3,500 files from these folders on the server
- I had to undelete the files from trash, then Nextcloud Desktop synced them to the hard drive again
I'd be willing to share my debug file, but not in public.
This bug report did not receive an update in the last 4 weeks. Please take a look again and update the issue with new details, otherwise the issue will be automatically closed in 2 weeks. Thank you!
For me the issue did not happen again, but as far as I can see the root cause was never found, so the problem could reoccur.
@Discostu36 If you happen to have this debug log still around, would be nice if you upload it to https://cloud.nextcloud.com/s/ozbSCx5wGDrtRGQ
@isdnfan It could've been improved by fixing numerous other bugs with VFS between 3.2.2 and 3.3.0.
@Discostu36 If you happen to have this debug log still around, would be nice if you upload it to https://cloud.nextcloud.com/s/ozbSCx5wGDrtRGQ
I deleted it some days ago, but might still be in trash, will have a look this evening.
@Discostu36 Sorry for not paying attention to it earlier. Somehow, I've missed your reply. If the file is not there, you may also want to try the latest 3.3.0 version to see if the issue is still there. https://nextcloud.com/install/#install-clients
It could've been improved by fixing numerous other bugs with VFS between 3.2.2 and 3.3.0.
@allexzander it's true, I have the feeling the new version works better in terms of performance and stability. But I didn't see any change addressing the three underlying problems we see here:
- the client always considers the local file state as the reference: if a file doesn't exist locally, this file is removed from the cloud
- if there is a problem with VFS (the client fails to create a placeholder file), the client doesn't remember the problem
- if there is a changed/new file on the server, the client touches/syncs the whole directory (which in my case made other placeholder files fail as well, so the problem of one file was multiplied to all files in the directory) and as a result it removes all the contents of the directory where this file lives, including previously synced files untouched for ages
The first issue is hard: I don't know if there is a good solution. It is easy to monitor file changes while the client is running, but what should happen if the user removes local files while the client is stopped? Prefer server/client or raise a conflict? I would prefer manual problem resolution in this case. Additional interaction is bad in terms of user experience, but it gives the user a chance to avoid data loss (depending on the server trashbin settings a file may be completely removed). As reference: MS OneDrive asks for additional confirmation to remove files in the cloud if the user removes a lot of files locally.
The second issue is easier to handle in my eyes (a rough sketch follows after this list):
- if there is a problem with VFS, mark/remember that this file is broken locally and try to sync it from the server next time (repeat until successful)
- if there are multiple issues with VFS, completely stop syncing to avoid a situation with a different state on the next run
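Roughly what I have in mind, as a minimal sketch (plain Python, made-up names, not the client's real architecture):

```python
# Rough sketch (made-up names, plain Python, not the client's real architecture):
# remember files whose placeholder creation failed, and abort the run when too
# many VFS errors pile up, so no deletes get computed from an incomplete tree.
MAX_VFS_ERRORS = 5

class VfsSyncRun:
    def __init__(self, persisted_broken: set):
        self.broken = set(persisted_broken)   # survives client restarts
        self.errors_this_run = 0

    def on_placeholder_failed(self, path: str) -> None:
        self.broken.add(path)                 # retry from the server next time
        self.errors_this_run += 1
        if self.errors_this_run >= MAX_VFS_ERRORS:
            # Hard stop: better an interrupted sync than deletes computed from
            # a local tree that is known to be incomplete.
            raise RuntimeError("Too many VFS placeholder errors, aborting sync")

    def may_propagate_delete(self, path: str) -> bool:
        # Never turn a known-broken local file into a server-side delete.
        return path not in self.broken
```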
I have no idea about the third one - in my eyes there is no reason to sync/touch the whole directory if only one file is changed. Maybe there is a reason, but even then the client should become more fail-safe - if there is some local problem (VFS problem, local hardware issue, exhausted local storage, permissions problem) the client should stop syncing until the problem is resolved (or even better, only upload new files to the cloud).
I really appreciate your feedback (at least as documentation about how it is expected to work). Feel free to close the issue if you feel all these problems are addressed already or not relevant at all.
Do you have any information that could help understand why the files could not be created?
The idea is that to improve reliability we can always make it try again, but that would only partially solve your problem. After all, you want your files?
@mgallien I don't get your point. I have no idea what caused the problem. I remember that at this time the client was unstable - it was eating CPU, created huge local DB files, and actions in the UI lagged. I'm sure I killed the client multiple times; additionally I might have interrupted some action when I suspended the PC while it was in the middle of an operation. As a result VFS was broken (placeholder file creation) - the rest was fine, the client successfully downloaded all the files from the affected folder after I changed to "make available locally". New client versions look more stable, so the issue might not happen anymore (or less frequently). But the issue uncovers some facts about the sync process which could be improved to make the client safer - especially the fact that the client doesn't remember that the local state is unhealthy, and for this reason removes files from the server, is really bad and worth attention (and the question why the whole folder is touched when only one file changes).
I'm happy to discuss the logic of the sync process - maybe I don't understand something. Described more generically, the issue happened here like this:
- the file exists on the server and doesn't exist on the client
- the client removes the file from the server
In my eyes the logic must be different - the client must not delete server files at least when:
- the client performs the first sync after a restart
- the previous sync was not successful/had errors
In other words, a client must not delete files on the server until it is confident the local state is healthy and holds a full copy of the user's files. Otherwise it should replicate the server state, because the server is the only instance which knows what happened in the time the client didn't run/didn't properly sync.
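As a rough sketch of this gating (plain Python, made-up names, not the actual sync engine):

```python
# Illustrative sketch (not the actual sync engine): gate delete propagation on
# the client being confident that its local state is complete and healthy.
from dataclasses import dataclass

@dataclass
class SyncContext:
    first_run_after_start: bool    # client was restarted since the last cycle
    last_run_had_errors: bool      # previous cycle finished with errors
    local_state_complete: bool     # e.g. no dirty/broken placeholder entries

def may_delete_on_server(ctx: SyncContext) -> bool:
    """Only propagate local deletions upward when the local copy can be trusted."""
    if ctx.first_run_after_start or ctx.last_run_had_errors:
        return False               # replicate the server state instead
    return ctx.local_state_complete

# Example: right after a restart, nothing gets deleted on the server.
assert may_delete_on_server(SyncContext(True, False, True)) is False
```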
I'm currently experiencing the same problem. Two days ago, the client (3.3.2 on Windows 10, "wincfapi" used for virtual files) deleted 12,000 files, ~55 GB in total. They are still in the trashbin, so they are not completely lost, but restoring them via the web frontend is 1. not feasible and 2. didn't work because of file locking exceptions (I'm using Redis for file locks in case that's relevant). Is there a way to restore them via the CLI? Otherwise I'm already working on locally restoring a server backup and uploading the files through the client (on a different machine than the client that deleted the files). I guess I'll have to resolve the file locks prior to that, though.
Support for virtual files is now disabled in the client. I have created a debug archive, but the logs don't cover the relevant time frame; they start today.
The client (that deleted the files) is installed on a new laptop. At first, the sync with virtual files seemed to work fine (no DELETE calls in the web server log), so it was not an issue with the initial setup of the client. First call in the web server logs: 2021-08-28 12:48:20 +0200. First DELETE call: 2021-09-03 23:28:35 +0200. The client was running fine for 8 days and then started to delete all files.
The log excerpt is from today, prior to disabling virtual files. The filenames were the ones I tried to recover in the web frontend.
Thank you @ImanuelBertrand for showing that the issue exists on new versions as well. Do you have any idea what the root cause might be for the placeholder files failing to create? Did you recognize any atypical pattern, any issues (maybe with other programs)? Anything interesting in the Windows event logs?
because the server is the only instance which knows what happened in the time the client didn't run/didn't properly sync.
This is not true if a server was migrated; then the client knows better what was added/removed while the server was down. However, I agree that avoiding data loss is the main objective. Thus files should be kept on the server if one cannot be 100% sure deleting them was triggered by the user.
Adding files or keeping them by mistake is a lot less of a problem compared to silently deleting files.
I remember nothing atypical.
2021-09-03 23:25:15 +0200: EL: System booted
2021-09-03 23:25:16 +0200: EL: Error related to Microsoft Wi-Fi Direct Virtual Adapter
2021-09-03 23:25:25 +0200: EL: Windows complained about an incorrect license key (fixed in the meantime)
2021-09-03 23:25:25 +0200: EL: Many entries of "Die Anmeldeinformationen in der Anmeldeinformationsverwaltung wurden gelesen." - something like "The login credentials in the credential manager have been read"
2021-09-03 23:25:37 +0200: EL: Windows Defender status successfully set to SECURITY_PRODUCT_STATE_ON
2021-09-03 23:25:58 +0200: ManicTime started tracking
2021-09-03 23:27:09 +0200: Started Dell Mobile Connect
2021-09-03 23:28:44 +0200: Opened Settings to uninstall Dell Mobile Connect
2021-09-03 23:30:08 +0200: EL: System powered down
The entries prefixed with EL are from the Windows event log. The difference between the system clocks of the laptop and the server (regarding the server logs above) is within one second. The system is a Dell XPS, if that's somehow similar to your system.
@mgallien I don't get your point
There are two points in my reply:
- Rest assured that we share your concerns about reliability; many of the changes done in versions 3.2 and 3.3 were done for that, and unfortunately we are not yet done.
- I would also like to understand how to trigger this, so that the trigger condition can be removed.
My point is that any error in placeholder file creation will block the sync, and I guess that is not what you want. We then need to understand why this is happening, no matter how much time we spend on improving sync reliability.
The way I see it, blocking the sync would be preferable to deleting files. Of course that distinction is only relevant as long as the issue is not found and fixed. I would have been grateful for a message like this: "Couldn't create placeholder info. Retry or proceed (proceeding will delete affected files on all devices)". I mean, "grateful" would probably be an overstatement since it is still an error message, but I'd rather have an annoying error message than data loss.
@mgallien
My point is that any error in placeholder file creation will block the sync
This is exactly what I suggest (at least as long as the client works as it does now), because not stopping the sync results in files being deleted on the subsequent run.
@ImanuelBertrand
I would have been grateful for a message like this: "Couldn't create placeholder info..."
Exactly: if you stop syncing and give the user a good hint where/how to provide a bug report, the chances are higher that you get the reports and data you are looking for.
Silently going forward and removing files results in
- a high risk of data loss (because users may not recognize that some data was lost - for me it was pure luck that I saw this in the activity feed)
- completely uncoordinated user reactions (reports through different channels and with different wording)
I completely agree with @ImanuelBertrand - a hard fail is better than continuing somehow and causing data loss (or at least a lot of restore work). I think we all agree this is a really bad situation which should never happen - but it happens. Three users managed to identify the problem and report it to the same bug report within a short time since VFS was released. I suggested some mitigations - I have no idea if these are suitable or complete nonsense - I didn't receive any response.
svenb1234
This is not true if a server was migrated, then the client knows better what was added/removed while the server was down.
I only partially agree. In general we must consider the server as the most stable part. The scenario you show only works as long as only one client is involved. What if you have 5 clients? Should each client move/add/change files only because the local data is different from the server? What if you restore the server for some reason?
There are a lot of moving parts in the system, but in general the sync is built around the server - it must be the "root of trust". If the server has crashed or has been restored, the admin should perform some action (maybe there is a way to automate it) to inform the clients that a new full sync is required. The other way round, the client should never take priority over the server. It must only perform actions when it knows the action is intended, e.g. remove a file only after a successful sync cycle (full sync on start).
Definitely there are ways to improve the sync, like keeping a journal (like a database transaction log) so one doesn't have to rewind all the history - but in general every endpoint must ensure it doesn't work on an invalid/incomplete data set.
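Purely for illustration, the journal idea could look roughly like this (plain Python, made-up file name and helpers):

```python
# Purely illustrative sketch of the journal idea (made-up file name and helpers):
# destructive actions are only committed after a sync cycle finished cleanly.
import json
from pathlib import Path

JOURNAL = Path("sync_journal.json")

def record_pending_deletes(paths) -> None:
    """Write planned server-side deletes to a journal instead of executing them."""
    JOURNAL.write_text(json.dumps({"pending_deletes": list(paths)}))

def commit_if_cycle_clean(cycle_had_errors: bool, delete_on_server) -> None:
    """Execute journalled deletes only if the whole cycle finished without errors."""
    if cycle_had_errors or not JOURNAL.exists():
        return                     # keep the journal; nothing is lost, retry later
    for path in json.loads(JOURNAL.read_text())["pending_deletes"]:
        delete_on_server(path)
    JOURNAL.unlink()
```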
This just bit me too. Deleted close to 40k files from my server before I noticed.
I believe a similar issue exists with VFS disabled, possibly caused by long file paths with more than 260 characters. Either way, when I rename specific folders on one client, the other client subsequently deletes all contents of the newly renamed folder.
There are also reports of the client deleting files from the server if the client's disk runs out of free space.
I think the underlying issue is that the client doesn't remember if it fails to create a file locally. Ignoring the issue and continuing as if the file was successfully created is just asking for trouble.
I recently noticed #3731 - the client stops the sync cycle if a file is blocked by an AV program - all good (despite the fact that it crashes). Something similar must happen for other problems - if the sync fails for some reason, inform the user and stop until the problem is fixed.
Happened to me as well after enabling VFS. We went back and disabled VFS again since this never happened before.
Nextcloud: 21.0.5, client: 3.3.5
Another problem, #4016, which could be avoided by the measures I suggested before:
the client always considers the local file state as the reference:
which must not be the case!
In general we must consider the server as the most stable part
Client 3.4.1/Windows, server 22.2.0: this problem is still happening with VFS.
Had the same problem within my organisation: 6,800 files deleted. Prior to that, the Windows client had a sync error and then deleted the files. Virtual files were enabled.
@all: we have integrated a fix that should solve this issue: https://github.com/nextcloud/desktop/pull/4191. Please test it and provide feedback if you can; that would be highly appreciated.