Parallel module execution

Open daveaglick opened this issue 7 years ago • 3 comments

This idea has been bouncing around in my head for months, figured it was time to get the details down. It stems from performance benchmarking I've done that shows our biggest bottleneck is disk I/O and that we often get held up in the WriteFiles module just writing to disk. If we could also be executing other modules during that time while the processor is underutilized there's potential for huge performance gains.

Here's the idea - modules should be changed to process documents in parallel if possible. Obviously this won't be possible for some modules like GroupBy where it needs to know all the documents to form groups, but other modules like Markdown could process an input document and then send it to the next module while it concurrently handles the next input document.

I suspect this idea will result in two breaking changes:

  • We'll need to pass something other than an IReadOnlyList to the module execute method. It'll need to be some sort of structure that a module can use to grab a document, process it, and return results in parallel. Some experimentation is needed to see what'll work here.
  • Another break will be that we'll no longer have access to completed documents at each stage within a pipeline. I haven't generally used this feature and instead prefer multiple pipelines, but it'll become unavailable for everyone. If you attempt to get the documents for a pipeline before all the modules are done, the result will be empty or the call will throw. This does have the side benefit of reducing memory consumption though, since documents can be released and collected as soon as they're no longer needed.
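A rough sketch of the hand-off described above, assuming each module becomes a stage that pulls documents from a queue and pushes results to the next stage as soon as they are ready. Everything here is hypothetical: `StreamingPipeline`, `Stage`, and `Run` are illustration names rather than Wyam APIs, plain strings stand in for documents, and `BlockingCollection<T>` is just one candidate for the "structure a module can use to grab a document".

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical sketch: none of these types exist in Wyam.
public static class StreamingPipeline
{
    // Starts a stage that transforms each document as soon as it arrives and
    // hands the result to the next stage without waiting for the rest.
    public static BlockingCollection<string> Stage(
        BlockingCollection<string> input, Func<string, string> transform)
    {
        var output = new BlockingCollection<string>();
        Task.Run(() =>
        {
            try
            {
                foreach (string doc in input.GetConsumingEnumerable())
                {
                    output.Add(transform(doc));
                }
            }
            finally
            {
                // Tell the downstream stage that no more documents are coming.
                output.CompleteAdding();
            }
        });
        return output;
    }

    public static List<string> Run(IEnumerable<string> documents)
    {
        var source = new BlockingCollection<string>();
        // Stand-ins for a Markdown-like module and a slower output module.
        var rendered = Stage(source, doc => "<p>" + doc + "</p>");
        var finished = Stage(rendered, doc => doc.ToUpperInvariant());

        foreach (string doc in documents)
        {
            source.Add(doc);
        }
        source.CompleteAdding();

        // Drains the last stage; blocks until every document has passed through.
        return finished.GetConsumingEnumerable().ToList();
    }
}
```

With one consumer per stage the documents stay in order, and the second stage can work on document 1 while the first stage is still handling document 2. A module like GroupBy would have to drain its whole input before emitting anything, which is the case called out above.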

With regard to parallel pipeline execution in #189, the two are not mutually exclusive. A pipeline can execute parallel modules without being concurrent with other pipelines, or multiple concurrent pipelines can each execute multiple concurrent modules.

daveaglick, Mar 24 '17 13:03

Another thought that would improve throughput without impacting the current API is something like a dirty cache. Under the Linux and Windows kernels, reads and writes are cached in memory, and cached pages are later flushed to disk or discarded.

Wyam could have a similar strategy to flush all file writes at the end of a processing run (or when specified, or on a separate non-blocking thread). To maintain compatibility this system would also have to cache reads and understand multiple writes to the same file.

https://www.thomas-krenn.com/en/wiki/Linux_Page_Cache_Basics

Silvenga, Apr 10 '17 03:04

@Silvenga That's a really interesting idea! So the concept would be to hold off on doing anything directly to the file system until a pipeline or execution is complete or some other condition is met (maybe exceeding a certain number of operations) and then do the operations in the background on a separate thread while main execution continues.

Fortunately the consumers (modules, config files, etc.) are already working with an abstraction, so they wouldn't necessarily need to change. We'd need to keep track of all the I/O changes instead of immediately performing them, but that's certainly possible.

Playing off of this idea, I wonder if there might be a good middle-ground implementation. Instead of performing every change in a virtual file system and then flushing it at some interval, what if we just buffered I/O operations and then had a separate I/O thread dedicated to continually picking up queued I/O operations and applying them? We'd have to be careful to ensure consistency when a subsequent operation needs data from one that's queued, which would probably mean blocking in some cases. For example, if one module writes to file A and then another one tries to read file A, the read operation would need to block until the write operation was complete. This part could get tricky when globs and searching are required. For example, a module writes file A and then a glob is used to find all files in a certain folder, one of which would be file A, but it hasn't actually been written yet.
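To make the queued approach a bit more concrete, here's a minimal sketch under some loud assumptions: `QueuedFileSystem` is a made-up name, an in-memory dictionary stands in for the disk, and callers are assumed to issue writes to a given path from one thread at a time. It only covers the "read file A after writing file A" case and does nothing about the glob problem.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical sketch: not a Wyam type. A dictionary stands in for the disk.
public sealed class QueuedFileSystem : IDisposable
{
    private readonly BlockingCollection<(string Path, string Content, TaskCompletionSource<bool> Done)> _queue
        = new BlockingCollection<(string, string, TaskCompletionSource<bool>)>();
    private readonly ConcurrentDictionary<string, Task> _pending
        = new ConcurrentDictionary<string, Task>();
    private readonly ConcurrentDictionary<string, string> _disk
        = new ConcurrentDictionary<string, string>();
    private readonly Task _worker;

    public QueuedFileSystem()
    {
        // Dedicated I/O worker that applies queued writes in the order they were queued.
        _worker = Task.Factory.StartNew(() =>
        {
            foreach (var op in _queue.GetConsumingEnumerable())
            {
                _disk[op.Path] = op.Content; // the real disk write would go here
                op.Done.SetResult(true);
            }
        }, TaskCreationOptions.LongRunning);
    }

    // Returns immediately; the worker applies the write later.
    public void Write(string path, string content)
    {
        var done = new TaskCompletionSource<bool>();
        _pending[path] = done.Task; // remember the latest queued write for this path
        _queue.Add((path, content, done));
    }

    // Blocks until the most recently queued write to this path has been applied.
    public string Read(string path)
    {
        if (_pending.TryGetValue(path, out Task lastWrite))
        {
            lastWrite.Wait();
        }
        return _disk.TryGetValue(path, out string content) ? content : null;
    }

    public void Dispose()
    {
        _queue.CompleteAdding(); // let the worker drain the queue and exit
        _worker.Wait();
        _queue.Dispose();
    }
}
```

Because the worker applies operations in order, waiting on the latest write for a path also covers any earlier writes to it. A glob would need something more, such as consulting the set of pending paths or draining the queue before enumerating a folder.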

An idea like the original Linux-style dirty cache or queued I/O operations is certainly something to consider further.

daveaglick, Apr 10 '17 13:04

Playing off of your idea, and again similar to how Linux does it, Wyam could have multiple filesystem modes that are configured by either the pipeline or the user, e.g.

  • sync-all
    • Flush writes immediately, blocking execution
    • Slow, but safe
    • Could be a sane default
  • async-write sync-read
    • Flush writes on a separate thread, but wait for all outstanding IO threads to complete before a read request
    • Safe as long as only the ReadFiles module reads from the output
  • async-write async-read
    • Writes occur on separate IO threads
    • Best performance
    • Reads might happen before the corresponding writes occur
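As a sketch only, the three modes above could map onto an enum plus two rules. The names are assumptions: `IoMode` and `AsyncAll` follow the made-up settings example further down, while `SyncAll`, `AsyncWriteSyncRead`, and `IoModeRules` are invented here for illustration.

```csharp
// Hypothetical: these names mirror the three modes listed above
// and are not an existing Wyam API.
public enum IoMode
{
    SyncAll,            // sync-all
    AsyncWriteSyncRead, // async-write sync-read
    AsyncAll            // async-write async-read
}

public static class IoModeRules
{
    // True when a write must hit the disk before the writing module continues.
    public static bool WriteBlocks(IoMode mode) => mode == IoMode.SyncAll;

    // True when a read must first wait for all outstanding writes to finish.
    // SyncAll never has outstanding writes, so only the middle mode needs this.
    public static bool ReadWaitsForWrites(IoMode mode) => mode == IoMode.AsyncWriteSyncRead;
}
```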

Linux doesn't try to handle every user scenario; rather, it gives the developer the ability to make the best choice. I could envision the configuration looking something like:

Global on the pipeline level (making stuff up):

engine.Settings["IoMode"] = IoMode.AsyncAll;

Overrides per IO request:

engine.Pipelines.Add("WriteFiles",
    new ReadFiles(ctx => "*.html")
        .Sync(),
    new MinifyHtml(),
    new WriteFiles(".html")
        .Sync()
);

Or if the user wants to override it:

wyam --IoMode=AsyncAll

Silvenga, Apr 10 '17 16:04