dokany
dokany copied to clipboard
Rewrite user-mode threading to work with overlapped IO/IOCP
The current threading model used by Dokan is a major performance bottleneck in any user-mode driver that can make asynchronous IO calls - for example a driver that provides access to an FTP server. The OS can potentially make thousands of IO calls a second but with a limit of 15 threads and no async callback mechanism those threads will spend most of their time blocking on IO operations (in the FTP example they will block on socket read/writes).
To solve this I propose the following:
-
DeviceIoControl()
in dokan.c needs to use an IO completion port for dispatching IO events to the user-mode driver. IO events will automatically be handled on the system thread pool and neither Dokan nor the user-mode driver are required to manage their own threads nor worry about the number of threads that are allocated. - A flag should be added to Dokan context information provided to the event handler that allows it to specify whether or not the event will be handled asynchronously. This is because the end of an asynchronous call requires special handling - it must explicitly tell Dokan to send a response to the kernel.
- The Dokan user-mode
Dispatch*()
functions will need to be split in two: one half (BeginDispatch*()
) for event setup and the call into the driver event handler and one half (EndDispatch*()
) for validation, cleanup, and sending the result to the kernel. Finalization for an asynchronous event (EndDispatch*()
) can be initiated by aDokanFinishAsyncEvent()
function called from the driver that takes a pointer to the context information as an argument. - Currently information is passed to the user-mode driver and back to the kernel via function arguments. This would need to be changed to a pointer to a structure that's internally held by the Dokan event context info so that
DokanFinishAsyncEvent()
knows where to find the information it needs to pass back to the kernel. - Dokan context objects should be pooled and push/popped from a global stack to prevent the driver from constantly hitting the memory allocator. A robust implementation should also include time stamps so that a number of context objects in the global pool beyond some threshold can be cleaned up after some amount of time passes to account for large temporary spikes in activity.
Unfortunately this both a non-trivial undertaking and would require breaking API changes however my driver is already running into problems with throughput due to restrictions imposed by the current Dokan threading model.
Great propsition @Corillian ! I totally agree with you :heart:
Removing the threads limitation would enhance a lot the dokan speed also !
If I understood well, most of the changes are going to be in the Dokan Library and we are keeping the DeviceIoControl
for communication Library/Kernel, are you we gonna break API compatibility in this communication ?
About point 5, is it not related to https://github.com/dokan-dev/dokany/issues/148 ?
It seems that you have a lot investigate around how to make it, do that mean it is something that you are already planning to do ?
About point 5, is it not related to #148 ?
Yes point 5 is identical to #148 except I don't believe the lookaside list API is available in user mode.
If I understood well, most of the changes are going to be in the Dokan Library and we are keeping the DeviceIoControl for communication Library/Kernel, are you we gonna break API compatibility in this communication ?
All of the changes would be in the user mode dokan.dll. It will keep DeviceIoControl()
and it will involve breaking API changes as all of the event handler functions will need to have their arguments packed into (pooled) dynamically allocated structures.
I have no problem doing this change myself however I don't have a lot of time at the moment. There's a good chance I wouldn't be able to get to it until sometime in June at the earliest but that could change (for better or worse).
Oups yes lookaside does not exist for user mode ofc :smile: sorry
Ok I see the breaking changes you are talking about. It would be nice if we can just add a proxy fonction for each event that would convert the dynamically allocated structures into current function parameters and the other way back at the user function return. If we really cannot avoid it, we will deal with it. It is just that I really would not like to force people to update all their implementation every version :)
Don't worry about time, we also currently have couple of things that we need to do to push for pushing this 1.0.0 out :disappointed: (some stability issue)
I have also been happily and successfully using Dokany from the beginning and have also run into the bottleneck of waiting on IO.
I think a much easier solution that does not break our API, would be for the driver to handle the return value STATUS_PENDING from (at least) user mode function dokanOperations->ReadFile.
All existing Dokany projects would continue to operate, and developers that want concurrent IO could do their own that worked by returning STATUS_PENDING in conjunction with IO handler thread(s).
Driver can only communicate to the user mode if the user mode tells the driver "here you have a worker". How would that happen using the approach suggested by @Corillian? Need to understand the full flow.
The approach suggested by @JohnOberschelp seems to me much simpler. But in this case one thread will still be blocked. If driver gets STATUS_PENDING it will make the same call again using maybe even the same thread, until it gets another STATUS. Here we are limited in the amount of threads resp. workers provided to the driver. Possible solution here is to create a new additional thread resp. worker in user mode, if user mode returns STATUS_PENDING. STATUS_PENDING means this thread is still working resp. thread is blocked. We need to keep the old thread running with setting a special async flag in the context like suggested by @Corillian. If once this special threads have been handled properly those can be terminated. The thread amount specified in dokan options is just the minimum amount of threads which will be handled in parallel.
In generally we can increase the amount of threads in the dokan options to 64. This can be done already yet.
You are right, @marinkobabic, one thread will be blocked, but not the rest of the threads.
Just by implementing handling of STATUS_PENDING on reads, and implementing lazy writes in userland, all Dokany threads would be able to go as far as they can till they are I/O bound. For read in particular, all the driver may need to do (as @marinkobabic said) when it receives a STATUS_PENDING, is yield to other threads, and try again; the read request is still completely valid. @Corillian, wouldn't this be all you need to get your FTP requests running in parallel?
DokanMain()
will create an IO completion port, bind it to a thread pool (https://msdn.microsoft.com/en-us/library/windows/desktop/ms686760(v=vs.85).aspx), open the device with overlapped IO enabled, bind the device to the IO completion port, queue an async DeviceIoControl()
, and then return (the current implementation blocks) while passing a state object back to the user. When a completion packet is available it will be dequeued on the thread pool bound to the completion port and a callback provided by dokan.dll will be called. This callback will then call the appropriate BeginDispatch*()
for the packet which will remove a dispatch context object from the object pool (or allocate a new one if the pool is empty), fill in its arguments according to packet type, and then call into the user mode driver via the callback specified during Dokan setup (i.e. dokanOperations->ReadFile
).
Inside the user mode driver callback a pointer to a structure with the relevant arguments for that packet will be provided. It should be done this way because we've then gone through the trouble of creating a proper structure to marshal packet arguments so that driver writers don't have to constantly reinvent the wheel for their own async operations. They just pass around a single pointer and reference it as needed. If the user mode driver callback will be asynchronous it will return STATUS_PENDING which tells Dokan to immediately exit the completion packet callback so that the thread can begin servicing other completion packets. If it's a synchronous callback it'll return its normal error code and Dokan will then automatically call EndDispatch*()
to release the dispatch context object back to the object pool as well as return the result to the kernel.
Outstanding asynchronous operations will also need to be tracked by Dokan so that during shutdown all pending IO operations will be verified complete before tearing down the IO completion port and the thread pool. Once a user-mode asynchronous operation completes it will need to call DokanFinishAsyncEvent()
which will cause Dokan to call the appropriate EndDispatch*()
. We should also make our thread pool and IO completion port available to user mode drivers since a thread can only be bound to a single IO completion port and there's no reason to force driver writers to create another thread pool.
@JohnOberschelp If you write so many files as you have threads, then all of them will be blocked.
@Corillian This will be an amazing improvement. I totally agree. :smiley:
I agree that the current API is weird, and doesn't make much sense for some cases (my filesystem queries the network asynchronously too, so I have to workaround it by using promises and futures, and blocking the callback in DokanOperations). But note that is pretty easy to implement a filesystem in Dokan+FUSE this way.
Also, I'm not very familiar with the Windows API, and even if I were, I will have to use both Windows' and Unix's if I had to write a cross-platform filesystem. My current approach for that is using a cross platform library that isolates me from that as much as possible.
An alternative would be to document the protocol to talk to the kernel device, am I right? FUSE is considering it and there is already a protocol sketch.
That would be helpful for people who want to use Dokan with any language.
Interesting discussion you started @Corillian. Current threading model was always a bother to me so I'm glad to see suggestions. My opinion is that it currently works and only result to performance issue so even if I totally agree, I will personally do on it after few others issues / bug fixes. Or maybe you already planned to work on it on your side as you already contributed several times? Better to know it to avoid work duplicate as happened with WiX installer :smile:.
@suy For now, most current language Dokan libraries (C#, Java, ...) are wrapping the C resulting dynamic library dokan1.dll. I currently prefer this way to avoid user-mode handling work duplicate (libraries would have to through native win32 call anyway to talk to the driver). But documenting the protocol to talk to kernel device should be done for the sake of documentation indeed!
Hi @Corillian ,
Since it is june, I just wanted to know if you got any time on this ? (No need to hurry, 1.0.0 is even not released yet :smiley: )
No unfortunately the stuff I needed to get done before I could address this ended up taking a bit longer than I had anticipated. I noticed that 1.0.0 isn't released yet so I figured the delay on my end wouldn't be a huge issue ;). I'm still planning on implementing this, hopefully soon as the lack of throughput is a near constant complaint from users of my driver.
Alright I've started working on this. I won't be working on it every day so I don't know exactly how long it will take but it's currently under way =).
@Corillian :+1: don't hesitate if you need help!
@Liryna Why is ZwCreateFile()
called twice for SL_OPEN_TARGET_DIRECTORY
?
@Corillian I don't have my environment to test with me but I would say that this come from the last pull requests of @bailey27.
If I remember, there should be a first createfile to know if the file exist and a second for opening the directory.
Or do you mean you have two time the same createfile?
@Corillian ,
I changed the behavior of SL_OPEN_TARGET_DIRECTORY to fix https://github.com/dokan-dev/dokany/issues/249 (MS Word 2007 & 2010 can not save file normally in mirror.exe).
If SL_OPEN_TARGET_DIRECTORY is specified, then the caller expects to open the parent dir of the file. But it also expects to be told in create Information if the child did not exist.
Going through directory.c it seems to me that since Dokan caches the previous directory listing and provides its own pattern search maintaining FindFilesWithPattern()
as a Dokan operation is a bad idea. @Liryna any opposition to removing it?
Also is it correct to assume that the kernel synchronizes all file operations on a file handle? If that's not true then there are quite a few synchronization bugs in the user-mode driver.
Lastly shouldn't directory list caching be done in the kernel and not the user-mode driver? I don't understand why we have to endure 2 user/kernel transitions when SL_RESTART_SCAN
isn't specified just because the list gets cached in user space instead of the kernel.
Independent where the caching is done, how is the following scenario covered here:
-
FindFiles returns File1, File2
-
The mirrored File1 is removed.
-
Only the application knows this (maybe), which uses the dokan usermode library
-
UserMode and Kernel have no idea, that the file has been removed
Actually it seems that the cache is not updated.
To make the whole system stable and faster:
-
the caching should be moved to kernel, like Keith said
-
notification mechanism should be implemented, so that the cache can be updated(update attribute, name, size, removed, etc.)
-
during the file open, if usermode reports that the files does not exists also update the cache
-
even if just some attributes of the file have change, rename, size changed, this must be reported to kernel
-
consider flag FO_NO_INTERMEDIATE_BUFFERING, FO_TEMPORARY_FILE, FO_DELETE_ON_CLOSE during the cache
-
enable/disable cache
The cache can be enabled/disabled
Only changes of files which did not happen through virtual drive must be reported.
This way the file and directory information can retrieve the information from cache. If the findfiles should be cached, then every external change on the mirrored filesystem needs to be reported to the kernel. FindFilesWithPattern is in generally not a bad idea, it simplifies the implementation. The developer don’t needs to check the mask for *
Thanks
Marinko
Ok after looking through FastFat it looks like it's up to the driver to do its own synchronization - which is what I had assumed would be the case. FastFat obviously doesn't need to do any caching because it has direct access to the drive but it does maintain the state of the previous search. It looks like FindFilesWithPattern()
is worth keeping around however I do think the cache should be maintained in the kernel and not in the user-mode driver.
Here's the FastFat code for searching:
//
// Determine where to start the scan. Highest priority is given
// to the file index. Lower priority is the restart flag. If
// neither of these is specified, then the Vbo offset field in the
// Ccb is used.
//
if (IndexSpecified) {
CurrentVbo = FileIndex + sizeof( DIRENT );
} else if (RestartScan) {
CurrentVbo = 0;
Ccb->OffsetToStartSearchFrom = 0;
} else {
CurrentVbo = Ccb->OffsetToStartSearchFrom;
}
As @marinkobabic mentioned there is currently a problem with refreshing the cache. As far as I'm aware Dokan does not currently provide a driver with a way to notify the file system that the file system has changed?
I agree that the cache should probably review/rewrite and move to the kernel. Notification of file changes are made during CreateFile and SetInformation when I file is created, delete, .... I would say that this part works because for exemple we see explorer being directly aware of a new file and show it https://github.com/dokan-dev/dokany/blob/84c38f8eb48ee703fd3eaa18abed09ab1072d029/sys/create.c#L1300 https://github.com/dokan-dev/dokany/blob/b845c086c96e4b6125d0c729086ca122bb8b393c/sys/fileinfo.c#L598
Yes but say your driver is a remote FTP server - when a new file gets created on that server there's no way to notify Dokan and Windows Explorer won't be aware of the new file until the next time it happens to issue a request for a directory listing.
Yes, in such case we will have an issue :cry: but the cache would also not help ?
The cache helps but it solves a separate problem - the OS can submit multiple partial requests for the directory listing of an identical search result. The cache won't be purged until different search criteria are provided, thereby invalidating the cache, or the SL_RESTART_SCAN
flag gets set.
If a the file system changes out from underneath Dokan, such is possible when monitoring a remote FTP, the user-mode driver needs a way to tell Explorer than the file system has changed so that Explorer knows to issue a new FindFiles()
with SL_RESTART_SCAN
.
I would suggest to create a separate issue related to cache. This should not be related to your actual implementation?
Yeah I'm not moving any of the cache stuff to the kernel for this issue, I'll create a separate issue.
In setfile.c:
case FilePositionInformation:
// this case is dealed with by driver
status = STATUS_NOT_IMPLEMENTED;
break;
Which driver is that referring to?
This code should never be called because FilePositionInformation
is directly handled in the dokan driver and never forwarded to user land https://github.com/dokan-dev/dokany/blob/b845c086c96e4b6125d0c729086ca122bb8b393c/sys/fileinfo.c#L134
Ok cool, ty