VDKQueue

Don't poll every second for kevents

Open gobbledegook opened this issue 2 years ago • 11 comments

In the watcher thread there is this comment:

struct timespec timeout = { 1, 0 }; // 1 second timeout. Should be longer, but we need this thread to exit when a kqueue is dealloced, so 1 second timeout is quite a while to wait.

Polling kqueue once a second is inefficient and kind of seems to defeat the purpose of using kqueue in the first place. The correct way to do this is to basically send a signal telling the thread to terminate. I believe the method here (see the second commit in this pull request) takes care of this, by registering for, then triggering a "user" event.
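A minimal sketch of that idea, not the code from the pull request: the watcher thread blocks in kevent() with a NULL timeout, and another thread wakes it by triggering an EVFILT_USER event that was registered beforehand. The names WAKE_IDENT, watcher_main and stop_watcher_demo are made up for illustration, and the HAVE_KQUEUE guard is only there so the snippet still compiles on systems without kqueue.

```c
#include <stdint.h>

#if defined(__APPLE__) || defined(__FreeBSD__)
#define HAVE_KQUEUE 1
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <errno.h>
#include <pthread.h>
#include <unistd.h>

#define WAKE_IDENT 1 /* hypothetical: any ident not used by another EVFILT_USER event */

/* Watcher thread: waits with no timeout and returns once the user event fires. */
static void *watcher_main(void *arg)
{
    int kq = (int)(intptr_t)arg;
    struct kevent ev;

    for (;;) {
        /* NULL timeout: block until an event arrives, no periodic wake-ups. */
        int n = kevent(kq, NULL, 0, &ev, 1, NULL);
        if (n < 0) {
            if (errno == EINTR) continue;
            return NULL;
        }
        if (n == 1 && ev.filter == EVFILT_USER && ev.ident == WAKE_IDENT)
            return (void *)1; /* we were asked to stop */
        /* EVFILT_VNODE events for watched paths would be handled here. */
    }
}

/* Returns 1 if the watcher exited because of the user event, -1 on error. */
int stop_watcher_demo(void)
{
    struct kevent ev;
    pthread_t thread;
    void *woken = NULL;
    int kq = kqueue();
    if (kq < 0) return -1;

    /* 1. Register the user event; EV_CLEAR resets it once it has been delivered. */
    EV_SET(&ev, WAKE_IDENT, EVFILT_USER, EV_ADD | EV_CLEAR, 0, 0, NULL);
    if (kevent(kq, &ev, 1, NULL, 0, NULL) < 0) { close(kq); return -1; }

    if (pthread_create(&thread, NULL, watcher_main, (void *)(intptr_t)kq) != 0) {
        close(kq);
        return -1;
    }

    /* 2. Trigger it from this thread; the blocked kevent() call returns. */
    EV_SET(&ev, WAKE_IDENT, EVFILT_USER, 0, NOTE_TRIGGER, 0, NULL);
    if (kevent(kq, &ev, 1, NULL, 0, NULL) < 0) {
        pthread_detach(thread); /* cannot wake the thread, so do not join it */
        return -1;
    }

    pthread_join(thread, &woken);
    close(kq);
    return woken != NULL ? 1 : -1;
}
#else
#define HAVE_KQUEUE 0
int stop_watcher_demo(void) { return -1; } /* kqueue is not available here */
#endif
```

Because a triggered user event stays pending until it is retrieved, the wake-up is not lost even if the trigger happens before the watcher thread reaches kevent().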

Hopefully this is useful for someone out there!

(@Coeur I realize this repo isn't maintained anymore, but it's easier for me to post this here than having to clone the entire source of Transmission)

gobbledegook, Dec 25 '23 09:12

I see, good idea to have a custom event to stop observing instead of polling all the time. In the implementation, though, I'd prefer to use the existing removeAllPaths instead of introducing an extra method.

Coeur, Dec 28 '23 22:12

Hey guys. Thanks for all the activity. I'm happy to merge when I have time, but FSEvents is a much more performant, modern way to watch for file activity. I dropped kQueues from my apps many years ago when Apple finally fixed the huge FSEvents bug in OS X 10.11, El Capitan.

If it's possible for you to migrate to FSEvents, I highly encourage that over using kQueues.

bdkjones, Dec 28 '23 22:12

Hi @bdkjones. Well, about a year ago I added a lengthy comment to the Transmission fork of VDKQueue regarding newer alternatives: https://github.com/transmission/transmission/blob/main/macosx/VDKQueue/VDKQueue.h

#warning Adopt an alternative to VDKQueue (UKFSEventsWatcher, EonilFSEvents, FileWatcher, DTFolderMonitor or SFSMonitor)
// ALTERNATIVES (from archaic to modern)
//
// - FreeBSD 4.1: Kernel Queue API (kevent and kqueue)
//   (https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man2/kqueue.2.html)
//
//   Example: SKQueue (https://github.com/daniel-pedersen/SKQueue) but claimed to crash and be superseded by SFSMonitor (https://stackoverflow.com/a/62167224)
//
// - macOS 10.1–10.8: FNSubscribe and FNNotify API
//   (https://developer.apple.com/documentation/coreservices/1566843-fnsubscribebypath)
//   "the FNNotify API has been supplanted by the FSEvents API"
//   (https://github.com/phracker/MacOSX-SDKs/blob/master/MacOSX10.7.sdk/System/Library/Frameworks/AppKit.framework/Versions/C/Headers/NSWorkspace.h)
//
// - macOS 10.5+: File System Events API (FSEventStreamCreate)
//   (https://developer.apple.com/documentation/coreservices/file_system_events)
//   "File system events are intended to provide notification of changes with directory-level granularity. For most purposes, this is sufficient. In some cases, however, you may need to receive notifications with finer granularity. For example, you might need to monitor only changes made to a single file. For that purpose, the kernel queue (kqueue) notification system is more appropriate.
//   If you are monitoring a large hierarchy of content, you should use file system events instead, however, because kernel queues are somewhat more complex than kernel events, and can be more resource intensive because of the additional user-kernel communication involved."
//   (https://developer.apple.com/library/archive/documentation/Darwin/Conceptual/FSEvents_ProgGuide/KernelQueues/KernelQueues.html)
//
//   Example: UKFSEventsWatcher (https://github.com/uliwitness/UKFileWatcher)
//   Example: EonilFSEvents (https://github.com/eonil/FSEvents)
//   Example: FileWatcher (https://github.com/eonist/FileWatcher)
//
// - macOS 10.6+: Grand Central Dispatch API to monitor virtual filesystem nodes (DISPATCH_SOURCE_TYPE_VNODE)
//   (https://developer.apple.com/documentation/dispatch/dispatch_source_type_vnode)
//   "GCD uses kqueue under the hood and the same capabilities are made available."
//   (https://www.reddit.com/r/programming/comments/l6j3g/using_kqueue_in_cocoa/c2q74yy)
//
//   Example: RSTDirectoryMonitor (https://github.com/varuzhnikov/HelloWorld) but unmaintained as a standalone project (abandoned 2013)
//   Example: DirectoryMonitor (https://github.com/robovm/apple-ios-samples/blob/master/ListerforwatchOSiOSandOSX/Swift/ListerKit/DirectoryMonitor.swift) but unmaintained (abandoned 2016)
//   Example: TABFileMonitor (https://github.com/tblank555/iMonitorMyFiles/tree/master/iMonitorMyFiles/Classes) but unmaintained (abandoned 2016)
//   Example: DTFolderMonitor (https://github.com/Cocoanetics/DTFoundation/tree/develop/Core/Source)
//
// - macOS 10.7+: NSFilePresenter API
//   (https://developer.apple.com/documentation/foundation/nsfilepresenter?language=objc)
//   "They're buggy, broken, and Apple is haven't willing to fix them for last 4 years."
//   (https://stackoverflow.com/a/26878163)
//
// - macOS 10.10+: DispatchSource API (makeFileSystemObjectSource)
//   (https://developer.apple.com/documentation/dispatch/dispatchsource/2300040-makefilesystemobjectsource)
//
//   Example: SFSMonitor (https://github.com/ClassicalDude/SFSMonitor)

It's a bit hard to tell which ones are real improvements:
a. kqueue (that's what VDKQueue uses)
b. FSEvents (comment says it's better for "large hierarchy of content" but not for single files)
c. GCD (comment says it uses kqueue under the hood)
d. NSFilePresenter (comment says it's broken and unmaintained by Apple)
e. DispatchSource (newest in the family of solutions, but Swift only)

Coeur, Dec 28 '23 22:12

@Coeur Yea. VDKQueue is a fork of UKKQueue, which I used back in 2008. I didn't write most of this; I just tweaked a few parts that I needed for my apps back then.

I can tell you, definitively, that FSEvents works great for watching single files. I use it like that to watch literally hundreds of thousands of files and it's been flawless and easy to manage.

bdkjones, Dec 28 '23 22:12

(before I forget, my original patch deleted _keepWatcherThreadRunning=NO, but you actually still need that line or else the thread will never start back up again when you add new files)

gobbledegook, Dec 28 '23 22:12

Regarding the alternatives, GCD and DispatchSource are really the same thing (DispatchSource is just the Swift version of GCD dispatch sources), and yes, they use kqueue under the hood. NSFilePresenter, as I understand it, only works with other apps that adopt NSFilePresenter, so that's a no-go. So that just leaves FSEvents.

In my project I actually am using both kqueue and FSEvents. FSEvents is especially useful with the kFSEventStreamCreateFlagFileEvents flag, which will, for example, give you individual events if files get added to the directory you're watching. This definitely beats using kqueue and having to compare before-and-after snapshots of your directories.

On the other hand, FSEvents makes you declare the list of directories that you're interested in when you create the stream. You can't modify the list of directories; you'd have to destroy the current stream and create a new one with an updated array of paths, or create a new stream with the new paths that you're interested in. So conceptually it's a bit different from kqueue, where you can set up one fildes and add individual paths (whether files or directories) as you go along. @bdkjones when you're watching hundreds of thousands of files are you creating hundreds of thousands of FSEventStreamRef's? Or do you mean you're watching one directory with hundreds of thousands of files in it?
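To illustrate the kqueue side of that comparison, here is a rough sketch of adding a path to a kqueue that already exists. It is not VDKQueue's code: watch_path and watch_path_demo are hypothetical names, the set of NOTE_* flags is just one plausible choice, and the HAVE_KQUEUE guard only keeps the snippet compilable on systems without kqueue.

```c
#include <stdint.h>

#if defined(__APPLE__) || defined(__FreeBSD__)
#define HAVE_KQUEUE 1
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#ifndef O_EVTONLY
#define O_EVTONLY O_RDONLY /* O_EVTONLY is macOS-specific; fall back elsewhere */
#endif

/* Adds one more path (file or directory) to an existing kqueue.
 * Returns the descriptor that identifies the watch, or -1 on error. */
int watch_path(int kq, const char *path)
{
    struct kevent ev;
    int fd = open(path, O_EVTONLY);
    if (fd < 0) return -1;

    EV_SET(&ev, fd, EVFILT_VNODE, EV_ADD | EV_CLEAR,
           NOTE_WRITE | NOTE_DELETE | NOTE_RENAME, 0, NULL);
    if (kevent(kq, &ev, 1, NULL, 0, NULL) < 0) {
        close(fd);
        return -1;
    }
    return fd; /* closing this descriptor later removes the watch */
}

/* Creates a temporary file, starts watching it on a fresh kqueue, writes to it,
 * and returns 1 if a NOTE_WRITE event for that watch arrives within 2 seconds. */
int watch_path_demo(void)
{
    char path[] = "/tmp/kq_demo_XXXXXX";
    struct timespec timeout = { 2, 0 };
    struct kevent ev;
    int result = -1;

    int kq = kqueue();
    if (kq < 0) return -1;
    int file_fd = mkstemp(path);
    if (file_fd < 0) { close(kq); return -1; }

    /* The queue already exists; paths can be added at any later point. */
    int watch_fd = watch_path(kq, path);
    if (watch_fd >= 0 && write(file_fd, "x", 1) == 1) {
        int n = kevent(kq, NULL, 0, &ev, 1, &timeout);
        result = (n == 1 && ev.ident == (uintptr_t)watch_fd &&
                  (ev.fflags & NOTE_WRITE)) ? 1 : 0;
    }

    if (watch_fd >= 0) close(watch_fd);
    close(file_fd);
    unlink(path);
    close(kq);
    return result;
}
#else
#define HAVE_KQUEUE 0
int watch_path_demo(void) { return -1; } /* kqueue is not available here */
#endif
```

Each watched path costs one open file descriptor, which is the main practical limit of this approach when the number of paths grows.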

I'm also curious as to why you say FSEvents is more performant... I can't seem to find anything online that compares the performance of it vs kqueue. How are you measuring performance here?

gobbledegook, Dec 28 '23 23:12

I create one stream with multiple folders containing all the files I’m interested in. It’s more performant because there’s one fewer process constantly switching between kernel and user space, which is expensive. It also does not involve polling, which is much more power efficient on portables. Finally, the FSEvents daemon is optimized by Apple and a single process supplies information to every subscriber: less overhead, fewer resources used.

bdkjones, Dec 28 '23 23:12

Got it. Yes, in your case (one list of directories where you're interested in everything in them) FSEvents is clearly the superior choice. I think kqueue is still more appropriate for certain use cases, e.g., if you're interested in a number of specific paths that may change over time (and if you're interested in specific files, rather than directories, kqueue is your only option). The polling problem is basically eliminated with this pull request, though.

gobbledegook, Dec 28 '23 23:12

@gobbledegook I wrote a possibly simplified version of your idea, by sending only a single EV_ADD | EV_ONESHOT: https://github.com/transmission/transmission/pull/6452
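One way to read "a single EV_ADD | EV_ONESHOT", sketched here as an assumption rather than as a copy of the linked pull request: the user event is added and triggered in the same kevent() call, and EV_ONESHOT removes it again after it has been delivered once, so nothing has to be registered ahead of time. The function names and the ident 0 are illustrative, and the HAVE_KQUEUE guard only keeps the snippet compilable on systems without kqueue.

```c
#if defined(__APPLE__) || defined(__FreeBSD__)
#define HAVE_KQUEUE 1
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <unistd.h>

/* Adds a user event and triggers it in one call. EV_ONESHOT deletes the
 * event after its first delivery. Returns 0 on success, -1 on error. */
int post_oneshot_wake(int kq)
{
    struct kevent ev;
    EV_SET(&ev, 0, EVFILT_USER, EV_ADD | EV_ONESHOT, NOTE_TRIGGER, 0, NULL);
    return kevent(kq, &ev, 1, NULL, 0, NULL);
}

/* Posts one wake-up, then polls twice without waiting.
 * Returns 1 if the event was delivered exactly once, 0 otherwise, -1 on error. */
int oneshot_demo(void)
{
    struct kevent ev;
    struct timespec no_wait = { 0, 0 };
    int kq = kqueue();
    if (kq < 0) return -1;

    if (post_oneshot_wake(kq) < 0) {
        close(kq);
        return -1;
    }

    int first = kevent(kq, NULL, 0, &ev, 1, &no_wait);  /* should see the wake-up */
    int second = kevent(kq, NULL, 0, &ev, 1, &no_wait); /* event is already gone */
    close(kq);
    return (first == 1 && second == 0) ? 1 : 0;
}
#else
#define HAVE_KQUEUE 0
int oneshot_demo(void) { return -1; } /* kqueue is not available here */
#endif
```

In a real watcher the first kevent() call would be the blocking one in the watcher thread, and post_oneshot_wake would be called from whichever thread wants it to stop.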

Coeur, Dec 29 '23 02:12

ooh I like it!

gobbledegook, Dec 29 '23 02:12

So, I decided to take steps and maintain a fork of VDKQueue: https://github.com/coeur/VDKQueue

It includes nearly all of my pull requests and gobbledegook's suggestions, fixes the warnings and formatting, and documents alternatives such as FSEvents in the README.

Coeur, Feb 15 '24 23:02