nocache
turn off the cache for a directory
This is really a support request. Can I disable the cache only for specific directories? The benefit is as follows: a program transmits big files over a network and stores the hosts' metadata in a metadata file. It's better not to disable the cache for the metadata file.
A more generic solution would be to introduce a method to either include (“whitelist”) or exclude (“blacklist”) certain glob patterns. The problem here is that the only way you can configure nocache’s behavior is by specifying environment variables, so this approach would mean you have to: think of two good variable names, parse their contents into an array or a list and then match every open() call against each expression. Feel free to implement this if you need it; I’ll take a look at it, but I currently don’t have the time to do it myself.
The problem here is that the only way you can configure nocache’s behavior is by specifying environment variables
Why so? There are commands which accept options and a command, for example, "nice", "sudo", "env", "time", "xargs".
True; but the functionality of nocache is achieved by the wrapper shell script setting the LD_PRELOAD env variable. The initializer of nocache.so is only called from the specified executable and has no access to command line arguments. See for example how the -n option is implemented.
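To illustrate the constraint, here is a minimal sketch of how a preloaded library can pick up a setting from the environment when it is loaded. The variable name EXAMPLE_NR_FADVISE and all function names are hypothetical and chosen only for this example; this is not the actual nocache code.

```c
#include <stdlib.h>

/* Hypothetical variable name, used only for this illustration. */
#define EXAMPLE_ENV_VAR "EXAMPLE_NR_FADVISE"

/* Setting filled in at load time; 1 is an assumed default. */
int example_nr_fadvise = 1;

/* Parse a positive integer from the environment variable `name`.
 * Returns `fallback` if the variable is unset, empty, or not a
 * plain positive number. */
int example_read_env_int(const char *name, int fallback)
{
    const char *s = getenv(name);
    char *end;
    long v;

    if (s == NULL || *s == '\0')
        return fallback;
    v = strtol(s, &end, 10);
    if (*end != '\0' || v <= 0 || v > 1000000)
        return fallback;
    return (int)v;
}

/* Runs when the shared object is loaded, before main() of the wrapped
 * program. It sees the environment, but not the options that were
 * passed to the wrapper script. */
__attribute__((constructor))
static void example_init(void)
{
    example_nr_fadvise = example_read_env_int(EXAMPLE_ENV_VAR, 1);
}
```

A wrapper script would then translate its own command line options into such variables before setting LD_PRELOAD and executing the target program.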
A more generic solution would be to introduce a method to either include (“whitelist”) or exclude (“blacklist”) certain glob patterns.
A wildcard never matches the pathname separator, so how do I specify all descendants of a directory by a glob pattern?
Use fnmatch(3) without FNM_PATHNAME.
$ cat fnmatch.c
#include <fnmatch.h>
int main(int argc, char *argv[]) {
    return fnmatch("foo/*", argv[1], 0);
}
$ gcc -Wall -o fnmatch fnmatch.c
$ ./fnmatch foo/bar/baz && echo it matches
it matches
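For comparison, a small sketch of the difference the flag makes. The two helper names are made up for this example; only fnmatch() and FNM_PATHNAME come from the C library.

```c
#include <fnmatch.h>

/* Returns 1 if `path` matches `pattern` when '*' is allowed to
 * match '/' (flags = 0), otherwise 0. */
int matches_across_slashes(const char *pattern, const char *path)
{
    return fnmatch(pattern, path, 0) == 0;
}

/* Returns 1 if `path` matches `pattern` when '*' must stay within
 * one path component (FNM_PATHNAME), otherwise 0. */
int matches_within_component(const char *pattern, const char *path)
{
    return fnmatch(pattern, path, FNM_PATHNAME) == 0;
}
```

With the pattern "foo/*", the first helper accepts foo/bar/baz, while the second accepts foo/bar but rejects foo/bar/baz.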
Well, I implemented this feature request in my fork. I decided to use POSIX Extended Regular Expressions because they are more straightforward and more powerful than glob patterns. What do you think?
Because the library remembers which pages (i.e., 4K-blocks of the file) were already in file system cache when the file was opened, these will not be marked as "don't need", because other applications might need that, although they are not actively used (think: hot standby).
I don't understand this. Do you think that the OS uses the last suggestion instead of joining suggestions from all processes?
I’ve added some comments to your commit https://github.com/beroal/nocache/commit/c3956d384d04837dc33dc1756dd7e73754aae919
Do you think that the OS uses the last suggestion instead of joining suggestions from all processes?
The reality is a bit more complicated, but in principle, yes. If process A reads file X completely it’s in the FS cache; if B now maps X and does an fadvise with “don’t need” on the file descriptor, the contents are evicted from the cache; subsequent reads of A from X will require going back to the storage medium to retrieve data.
In other words: Without this mechanism, you might evict files that are in active use, thereby impacting other processes.
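The “don’t need” hint in question is issued with posix_fadvise(2). A minimal sketch of what B would do to a file it has open, assuming the whole file is targeted; the function name is made up for this example and this is not the nocache implementation:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Ask the kernel to drop the cached pages of the whole file behind
 * `fd`. Returns 0 on success, otherwise an errno value (this is how
 * posix_fadvise reports errors). */
int example_drop_cache(int fd)
{
    /* Dirty pages cannot be dropped, so flush them first (best effort). */
    (void)fdatasync(fd);

    /* offset 0 with len 0 means "from the start to the end of the file". */
    return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}
```

The hint applies to the file’s pages in the shared cache, not to B’s view of the file, which is why A is affected as well.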
Then the Linux kernel is kind of stupid.
Regarding maybe_store_pageinfo. All my additions contain cond or pattern. I group code by keywords. Other suggestions are implemented.
Documentation. The cache is disabled for a file iff (I and not E) where I iff the file name satisfies the environment variable NOCACHE_PATTERN_INCLUDE (default: true), E iff the file name satisfies the environment variable NOCACHE_PATTERN_EXCLUDE (default: false). Both variables are treated as POSIX Extended Regular Expressions.
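As a rough illustration of that rule (not the code from the fork), the decision could be sketched like this. The function name is hypothetical, and a NULL pointer stands for an unset variable so that the defaults come out as I = true and E = false:

```c
#include <regex.h>
#include <stddef.h>

/* Returns 1 if the cache should be disabled for `path`, else 0.
 * `include` and `exclude` are compiled regular expressions, or NULL
 * when the corresponding environment variable is not set. */
int example_cache_disabled(const char *path,
                           const regex_t *include,
                           const regex_t *exclude)
{
    /* I: unset include pattern counts as "matches everything". */
    int i = (include == NULL) || regexec(include, path, 0, NULL, 0) == 0;
    /* E: unset exclude pattern counts as "matches nothing". */
    int e = (exclude != NULL) && regexec(exclude, path, 0, NULL, 0) == 0;

    return i && !e;
}
```

For the use case from the first post, an include pattern such as ^/data/ together with an exclude pattern such as \.meta$ (both made up for this example) would disable the cache for /data/big.bin but keep it for /data/hosts.meta.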
I left some comments on https://github.com/beroal/nocache/commit/1e6061c9879b21f1d22607cc5d783f5abdf20a3f again.
Then the Linux kernel is kind of stupid.
Yes, and you’re welcome to improve it. The code is in mm/fadvise.c. Beware though that good and robust cache invalidation is one of the harder problems in programming.
Documentation.
Can you please add command line options to the nocache shell wrapper and add documentation to the Readme?
I’d make explicit what you expect, i.e. if(regcomp(…) != NULL)
Look at the type of regcomp.
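For reference: regcomp(3) returns an int, which is zero on success and a nonzero error code on failure, so an explicit check compares against 0 rather than NULL. A minimal sketch with a hypothetical helper name:

```c
#include <regex.h>
#include <stddef.h>

/* Compile `pattern` as a POSIX Extended Regular Expression.
 * Returns 0 on success. On failure, returns regcomp's nonzero error
 * code and writes a human-readable message into `errbuf`. */
int example_compile(regex_t *re, const char *pattern,
                    char *errbuf, size_t errbuf_size)
{
    int rc = regcomp(re, pattern, REG_EXTENDED | REG_NOSUB);

    if (rc != 0) {
        /* regerror translates the error code into text. */
        regerror(rc, re, errbuf, errbuf_size);
    }
    return rc;
}
```

On success the caller is expected to release the compiled expression with regfree() once it is no longer needed.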
Can you please add command line options to the nocache shell wrapper and add documentation to the Readme?
Sorry, I don't know the Bash programming language and I'm happy with that. ;-)