Add shared pooling for `IREE_HAL_BUFFER_USAGE_CONSTANT` buffers.
This would be done via a runtime system that allows multiple instances of the same program to share constants. The complication with implicit sharing is that we only want two instances of the same program to share, and at the HAL level we don't have this information. With explicit sharing we would have some kind of share group that lets the application decide which programs are safe to share. We'll want to figure out the ergonomics such that sharing is difficult enough to use that it won't be misused.
Another complication is that we want to be able to do cache lookups without needing to touch the data, because the data may be expensive to load (100s of MB on disk) or expensive to generate (lots of compute), so some kind of hash/ID is needed that we can embed in the file. We could use deterministic keys like a SHA-256 hash of the contents, but for generated constants we would need to use something like const-eval to get the data to compute the hash on. Once the key is available we can embed it in the file as part of a `hal.allocator.try_*` op that performs the lookup and otherwise takes a failure path for generating/uploading and inserting.
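To make the lookup-then-insert flow concrete, here is a minimal sketch in C of a pool keyed by an embedded digest and scoped by a share group. Every name in it (`constant_pool_t`, `constant_key_t`, `constant_pool_try_lookup`, `constant_pool_insert`, `constant_pool_get_or_upload`, the fixed capacity, the 32-byte key size) is hypothetical and not an existing IREE API; the `void*` buffer stands in for a real device buffer handle, and no locking is shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical sketch; none of these names are existing IREE APIs.
#define CONSTANT_KEY_SIZE 32  // assumed: e.g. a SHA-256 digest embedded in the file
#define POOL_CAPACITY 16      // assumed: fixed size to keep the sketch simple

typedef struct {
  uint8_t bytes[CONSTANT_KEY_SIZE];
} constant_key_t;

typedef struct {
  bool occupied;
  uint32_t share_group;  // chosen by the application; only equal groups share
  constant_key_t key;
  void* buffer;          // stand-in for a device buffer handle
} pool_entry_t;

typedef struct {
  pool_entry_t entries[POOL_CAPACITY];
} constant_pool_t;

// Returns the pooled buffer, or NULL on a miss. Only the key is compared,
// so the constant data itself is never loaded or generated for a lookup.
static void* constant_pool_try_lookup(const constant_pool_t* pool,
                                      uint32_t share_group,
                                      const constant_key_t* key) {
  assert(pool && key);
  for (size_t i = 0; i < POOL_CAPACITY; ++i) {
    const pool_entry_t* entry = &pool->entries[i];
    if (entry->occupied && entry->share_group == share_group &&
        memcmp(entry->key.bytes, key->bytes, CONSTANT_KEY_SIZE) == 0) {
      return entry->buffer;
    }
  }
  return NULL;
}

// Records a buffer under (share_group, key). Returns false if the pool is full.
static bool constant_pool_insert(constant_pool_t* pool, uint32_t share_group,
                                 const constant_key_t* key, void* buffer) {
  assert(pool && key && buffer);
  for (size_t i = 0; i < POOL_CAPACITY; ++i) {
    pool_entry_t* entry = &pool->entries[i];
    if (!entry->occupied) {
      entry->occupied = true;
      entry->share_group = share_group;
      entry->key = *key;
      entry->buffer = buffer;
      return true;
    }
  }
  return false;
}

// Failure path: on a miss, run the expensive generate/upload step and insert.
typedef void* (*constant_upload_fn_t)(void* user_data);

static void* constant_pool_get_or_upload(constant_pool_t* pool,
                                         uint32_t share_group,
                                         const constant_key_t* key,
                                         constant_upload_fn_t upload,
                                         void* user_data) {
  void* buffer = constant_pool_try_lookup(pool, share_group, key);
  if (buffer) return buffer;
  buffer = upload(user_data);
  // If the pool is full the buffer is still returned, just not shared.
  if (buffer) (void)constant_pool_insert(pool, share_group, key, buffer);
  return buffer;
}
```

Scoping entries by share group means two programs with identical constant keys still get separate buffers unless the application has placed them in the same group.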
There's potential racy behavior if multiple instances are initializing simultaneously, however running each module initialization to completion in sequence feels like reasonable startup behavior and would fully order things.
This will mostly benefit remote devices (GPUs/sandboxes/etc) as the CPU side just maps the memory from the module and the overhead per instantiated copy of the constant data is on the order of 128 bytes.
- [ ] Define a key mechanism for the constant data.
- [ ] Add share group/allocator sharing/`iree_hal_constant_pool_t` for runtime tracking.
- [ ] Add `hal.allocator.try_` lookup op (& matching insert) and plumb through to the runtime module.
- [ ] Expose share group mechanism to the device/driver API.