linux
linux copied to clipboard
Domain hash table
To make it simpler, a Landlock domains is currently a landlock_ruleset
struct. The use of this data structure includes fields which are useless for a domain, and a red-black tree which is not useful nor optimized for a read-only data structure.
To both improve performance and test coverage (by removing some WARN_ON_ONCE
checks), we should create a dedicated domain data type, mainly consisting of a hash table. We should be able to use hash_long()
.
We should first create a tool to benchmark domain use and measure the performance improvement related to this new data type, see #24.
Bitwise lookup tables are potentially also an option for some more advanced features: https://lore.kernel.org/all/[email protected]/
I have been wondering about the design of this in the past as well; the existing approach seems focused on keeping the worst case behaviour in check, but most Landlock rulesets that I have seen so far are actually pretty simple -- it makes me wonder whether we should not also take average case performance into account separately from the worst case performance.
In my mind a more "obvious" implementation approach for these domains would be to create one kernel-internal ruleset structure with these RB-trees, and just link it to its parent ruleset instead of merging this.
As rulesets are always created in a nested fashion, this will form a tree of rulesets and we can use reference counting for memory management.
- It becomes cheaper to enforce rulesets (merge is just setting the "parent" link)
- The access right lookup would need to traverse to the top of the linked tree of linked rulesets
- (See below for why I think this is not as bad as it sounds)
- The nodes can store access_mask_t again and we'd use fewer loops to do bit-per-bit work
- A maximum nesting depth could still be enforced to avoid performance problems.
- The implementation would be simpler. (That alone might lead to savings.)
I suspect that this might work well in the average case, because as far as I have seen, in realistic scenarios, the stacked rulesets seldomly refer to the same inodes. So if during a merge operation, these inode entries don't actually collapse across the merged rulesets, we just end up with multiple RB-trees that have the same overall number of entries as the big one that we are merging together in the existing implementation.
If we move to hash tables instead of RB-trees, linear scanning can be a valid performance optimization for small hash table sizes.
There are too many variables here to reason about it with confidence. I have a hard time proving that approach we have fixed bug 24 :)
Has this or a similar approach been discussed before for Landlock, maybe in one of the earlier patch sets?
Bitwise lookup tables are potentially also an option for some more advanced features: https://lore.kernel.org/all/[email protected]/
Indeed
I have been wondering about the design of this in the past as well; the existing approach seems focused on keeping the worst case behaviour in check, but most Landlock rulesets that I have seen so far are actually pretty simple -- it makes me wonder whether we should not also take average case performance into account separately from the worst case performance.
Sure, any idea?
In my mind a more "obvious" implementation approach for these domains would be to create one kernel-internal ruleset structure with these RB-trees, and just link it to its parent ruleset instead of merging this.
As rulesets are always created in a nested fashion, this will form a tree of rulesets and we can use reference counting for memory management.
I'm not sure to follow.
My reasoning to merge rulesets was to get a good locality for rules of the same domain. It takes more memory (by duplicating data) but I think it is not a big issue because rulesets are usually not so big and quick lookup is more valuable.
The landlock_hierarchy
tree is use to keep track of domain hierarchies.
Has this or a similar approach been discussed before for Landlock, maybe in one of the earlier patch sets?
Not much. It was just mentioned that a hash table might be a good idea.
@l0kod, what do you think about removing all layer-related logic for rules with non-hierarchical keys? That is, layers are currently required only to check access for some path (when we do pathwalk through all parent inodes, collecting access bits for each layer). For example, there is no need to keep layers for net rules, we can keep only one actions bitmask for each port.
@l0kod, what do you think about removing all layer-related logic for rules with non-hierarchical keys? That is, layers are currently required only to check access for some path (when we do pathwalk through all parent inodes, collecting access bits for each layer). For example, there is no need to keep layers for net rules, we can keep only one actions bitmask for each port.
This is indeed not required to check an network port access but we need to store access rights per layer to be identify the layer that denied an access request. This is required for audit support to include in the logs the layer level that denied a request.