dokany
Parallel calls to CreateFileMapping cause deadlocks
Environment
- Windows 8
- Intel Celeron N2840
- Dokan v.120, driver version 0x190
- Dokany
Check List
- [x] I checked my issue doesn't exist yet
- [x] My issue is valid with mirror default sample and not specific to my user-mode driver implementation
- [x] I can always reproduce the issue with the provided description below.
- [x] I have updated Dokany to the latest version and have rebooted my computer after.
- [ ] I tested one of the last snapshot from appveyor CI
Description
I'm developing a desktop application that is supposed to keep its configuration and temporary data in a custom dokan filesystem. Recently I finished the filesystem part. First I tested it against winfstest and some basic programs including explorer, cmd and notepad. After fixing a few bugs it worked perfectly, so I proceeded to testing against my desktop app.
The application works with the stored data both directly and via an SQLite database. When the data directory is on a regular filesystem, there are no problems. But when run with my dokan filesystem, the app hangs at random moments, sometimes for 15-20 sec, sometimes permanently.
I checked against mirror.exe and the problem exists there too, so the next experiments were done only against the mirror mount.
ProcMon showed that the hangs happen in the calls to SQLite. I checked some other programs known to use SQLite, namely Firefox and Chrome, and they experience the very same problem. Then I experimented with SQLite alone: I wrote a program with a simple request and ran it in a few parallel loops. The deadlocks were still there.
To isolate the problem even further, I made a standalone program that mimics the operations of SQLite and still suffers from deadlocks. After a few trials I found the minimal set of operations that results in deadlocks.
Conclusions:
- The sequence of the following 3 operations: CreateFile, SetEndOfFile, CreateFileMapping causes deadlocks if it is run in parallel against the same mirrored file
- The deadlocked calls for the most part return after 15-20 sec with status INSUFFICIENT_RESOURCES
- With high probability, the following rule holds: deadlocks occur when the number of app threads is greater than the number of Dokan threads
How to reproduce.
I provided the source and binary for the aforementioned program (named mmove.c) and two helper scripts. loop.bat just runs mmove.exe 1000 times over the specified file. run.bat starts the specified number of parallel sessions running loop.bat. The usage is described inside the bat files. Link: deadlocks.zip
- Download and unzip the attached archive to the test dir
- Compile mmove.c to mmove.exe or use the provided binary (I don't guarantee its safety). With MinGW it's simply "gcc mmove.c -o mmove.exe"
- Start the mirror with e.g. 3 threads: "mirror.exe /r d:\some\real\path /l h: /t 3"
- Start monitoring with ProcMon with autoscroll enabled and the following filter added: "Process name - is - mmove.exe - Include"
- In the test dir run the command: "run 5 h:\tmp" (this starts 5 sessions over h:\tmp). The deadlocks happen almost immediately and are seen as a stopped event flow in ProcMon.
- Repeat steps 3 and 5 (starting the mirror and running the sessions) with various thread count combinations and see the difference.
Now I'd like to see if someone else can reproduce the problem. And, of course, to get it fixed. That is, there should be no deadlocks at any combination of dokan/app threads, am I right?
Logs
I've attached the mirror.exe and ProcMon logs for a run with 2 mirror threads and 3 user threads. The deadlocks may be seen with "Duration - more than - 0.1 - Include" filter, or just search for "INSUFFICIENT".
Hi @samokhodkin ,
Thank you for the report! I will try to look at it quickly and give you feedback on this 👍
I believe I've encountered the same or a similar problem, with apps such as XnViewMP hanging or crashing when launched in the container, as reported here: https://github.com/bailey27/cppcryptfs/issues/42. Increasing the thread count seems to help a bit, but hangs happen regardless.
Hi @nevubm ,
Regarding the increased thread count helping the situation: would that not mean the deadlock is happening in the mirror? Have you tried attaching a debugger to the mirror to see what is happening in it at that moment?
I now suspect that my issue may have been related to Kaspersky Antivirus, since the issue goes away after uninstall. Simply disabling it doesn't help.
So for others as a workaround, perhaps try uninstalling any antivirus or other drivers that can affect disk I/O.
@nevubm From the behavior you described before, yes, antivirus software can be a source of noise for the dokan communication.
Otherwise this issue can be reproduced on a clean environment without anti-virus.