rs-matter
rs-matter copied to clipboard
In-place update
This PR introduces an efficient FromTLV::update_from_tlv method with a default implementation that:
- Turns
FromTLV::init_from_tlvinto an infallible initialiser - Does a check - using the also newly introduced
FromTLV::check_tlv(which also has a default implementation) before attempting the update, so as to make sure, that the initializer can indeed be safely turned into an infallible one (i.e. it will not fail in the middle due to a malformed TLV stream)
Note that FromTLV::update_from_tlv assumes that FromTLV::check_tlv would be idempotent - i.e. calling it multiple times on the same TLVElement would yield identical results. However, that should be indeed the case anyway.
The benefit of having FromTLV::update_from_tlv is that it gives us cheap, transactional in-place updates, where either a value is in-place updated with the initializer completely, or the value is not updated at all. Basically, an "in-place" equivalent of:
fn update(&mut self, element: &TLVElement) -> Result<(), Error> {
let new = Self::from_tlv(element)?;
*self = new;
Ok(())
}
The problem with the above update method, and the reason why we need FromTLV::update_from_tlv is - as usual - stack memory; i.e. it is transactional w.r.t. updating the value, but not cheap. The problem is, the new value is first materialized on-stack, and only after that it is moved to its final location (*self) - a behavior, which rustc might or might not optimize.
While we now have FromTLV::init_from_tlv which avoids the on-stack materialization of FromTLV::from_tlv, we don't have any "update" equivalent to FromTLV::init_from_tlv, which this PR introduces.
NOTE: This PR is deliberately in Draft and will stay in such, until I can discuss with @y86-dev (the author of the pinned-init crate and a RfL contributor) that the crate::utils::init::ApplyInit extension trait introduced with this PR is sound.
Review changes with SemanticDiff.
Analyzed 8 of 8 files.
Overall, the semantic diff is 5% smaller than the GitHub diff.
| Filename | Status | |
|---|---|---|
| :heavy_check_mark: | rs-matter-macros-impl/src/tlv.rs | 8.86% smaller |
| :heavy_check_mark: | rs-matter/src/acl.rs | Analyzed |
| :heavy_check_mark: | rs-matter/src/fabric.rs | 3.6% smaller |
| :heavy_check_mark: | rs-matter/src/utils/init.rs | 4.51% smaller |
| :heavy_check_mark: | rs-matter/src/tlv/traits.rs | Analyzed |
| :heavy_check_mark: | rs-matter/src/tlv/traits/array.rs | Analyzed |
| :heavy_check_mark: | rs-matter/src/tlv/traits/octets.rs | Analyzed |
| :heavy_check_mark: | rs-matter/src/tlv/traits/vec.rs | Analyzed |
@y86-dev Sorry for pinging you here and sorry to ask - but - would you be willing to review the implementation of the crate::utils::init::ApplyInit extension trait, as I mentioned in this PR description?
You don't need to review anything else. The whole other machinery introduced with this PR lays on the assumption, that what ApplyInit::apply does is safe.
To translate a bit in English what ApplyInit::apply does: essentially it transforms any unwinding panic into an abort, because - in my opinion - in-place updates of an already initialized value are fundamentally incompatible with unwinding panics, as that would lead to the compiler attempting to drop an already dropped value.
Also - and if I understand your in-place initialization correctly - you don't have any such problem with the in-place initialization code in pinned-init, because - in the presence of unwinding panics - the worst that can happen is memory leaks, as the compiler will not drop your partially-initialized value. Which is OK.
It is another topic that you probably don't care that much about unwinding panics in pinned-init, as this code works in the kernel, where - I suppose? - rustc operates with panic_abort.
Same as we do for our main use case, which is running rs-matter on MCUs, where the only feasible strategy - due to firmware size - is panic_abort as well.
Yet, I would like to make sure, that rs-matter would behave (by promoting unwind to abort) in STD-land as well.
@y86-dev Sorry for pinging you here and sorry to ask - but - would you be willing to review the implementation of the
crate::utils::init::ApplyInitextension trait, as I mentioned in this PR description?
No worries for pinging me. I will take a look, but I am rather busy at the moment. I might be able to get to it in the next week or so.
@y86-dev Sorry for pinging you here and sorry to ask - but - would you be willing to review the implementation of the
crate::utils::init::ApplyInitextension trait, as I mentioned in this PR description?No worries for pinging me. I will take a look, but I am rather busy at the moment. I might be able to get to it in the next week or so.
Thank you. Next week is fine. This PR is not gate-keeping any other development ATM.
(Just to mention that I'll later probably switch the "abort-on-unwinding-panic" code to use this clever trick, but that should be an insignificant change w.r.t. the logic of ApplyInit::apply.)
I left some comments about the implementation, overall it looks sensible.
Thank you.
In Rust for Linux, we have the need for
InPlaceWrite<T>I haven't yet added it to the github repo, but at some point I might get to it. You can use that trait to initialize a smart pointer that containsMaybeUninit<T>.
This means I could retire my own InitMaybeUninit creation.
Note that we don't really use smart pointers at all, because the whole rs-matter codebase is no_std and no-alloc, so our (few) use cases where we have to initialize a user-passed MaybeUninit<T> are around in-place initialization of &mut MaybeUninit<T> rather than e.g. Box<MaybeUninit<T>>.
One danger with in-place initializing &mut MaybeUninit<T> is that the return type would be &mut T. In other words, even though the slot was initialized, the compiler does not remember that (as there is no SmartPointer<MaybeUninit<T>> to SmartPointer<T> transformation) and hence you need to manually drop your memory if the &mut reference is not 'static (which however it is usually, in .bss).
I took a quick look at the use case of
Apply::applyand what I find a bit odd, is how theFromTLV::update_from_tlvfunction is implemented. The whole "check if it is valid and then convert the fallible initializer to an infallible one" seems very strange. Maybe the correct return type ofinit_from_tlvshould actually beResult<impl Init<Self>>?
Regardless which signature is used, the "check if it is valid before trying to init" is unavoidable. What is also unavoidable, is that if the initialization (after the check is already done) fails mid-way (it should not if the check is idempotent but then you never know) we can only turn the error into a panic and then any unwinding panic into a hard abort.
The above we have to do regardless of the signature, so I don't think your Result<impl Init<Self>> suggestion would function any differently than the PR's impl Init<Self, E>?
I actually did start with Result<impl Init<Self>> but than had to convert to impl Init<Self, E> because somehow I had use cases where the latter was more ergonomic. I'll check this again.
sidenote: the
FromTLV::init_from_tlvfunction is probably not working as expected:fn init_from_tlv(element: TLVElement<'a>) -> impl init::Init<Self, Error> { unsafe { init::init_from_closure(move |slot| { core::ptr::write(slot, Self::from_tlv(&element)?); Ok(()) }) } }The in-place initializer will overflow the stack, if the size of
Selfis too big. That's because the call toSelf::from_tlvputs a temporary value on the stack.
Agree completely!
The thing is, this implementation is only a default one. FromTLV implementors that are potentially large, like Vec, Maybe - my home-grown in-place-initable version of Option, OctetsOwned and all proc-macro generated FromTLV implementations do override it with a memory-efficient and correct variant (I hope).
Also, this function could take
elementby reference instead of by value, then there would be no need to clone it inupdate_from_tlv.
I would turn it the other way around: update_from_tlv should take the element by value rather than by reference. The thing is, the (anonymous) lifetime 't in &'t TLVElement<'a> in our code is often shorter than 'a and then the initializers cannot outlive it, but very often in our codebase they have to, because the initialization often gets delayed until after 't is no longer valid.
In any case, the point is moot because TLVElement<'a> is just a repr(transparent) for &'a [u8] so I could just as well implement Copy for it. :-)
Thanks again for taking the time to review - let me look at the inline comments as well now...
In Rust for Linux, we have the need for
InPlaceWrite<T>I haven't yet added it to the github repo, but at some point I might get to it. You can use that trait to initialize a smart pointer that containsMaybeUninit<T>.This means I could retire my own
InitMaybeUninitcreation. Note that we don't really use smart pointers at all, because the wholers-mattercodebase isno_stdandno-alloc, so our (few) use cases where we have to initialize a user-passedMaybeUninit<T>are around in-place initialization of&mut MaybeUninit<T>rather than e.g.Box<MaybeUninit<T>>.One danger with in-place initializing
&mut MaybeUninit<T>is that the return type would be&mut T. In other words, even though the slot was initialized, the compiler does not remember that (as there is noSmartPointer<MaybeUninit<T>>toSmartPointer<T>transformation) and hence you need to manually drop your memory if the&mutreference is not'static(which however it is usually, in.bss).
You could create a smart pointer type that has its backing memory in the .bss section. Dropping that smart pointer only drops the value, but doesn't actually free the memory (you can't access it though).
I took a quick look at the use case of
Apply::applyand what I find a bit odd, is how theFromTLV::update_from_tlvfunction is implemented. The whole "check if it is valid and then convert the fallible initializer to an infallible one" seems very strange. Maybe the correct return type ofinit_from_tlvshould actually beResult<impl Init<Self>>?Regardless which signature is used, the "check if it is valid before trying to init" is unavoidable. What is also unavoidable, is that if the initialization (after the check is already done) fails mid-way (it should not if the check is idempotent but then you never know) we can only turn the error into a panic and then any unwinding panic into a hard abort.
I meant that you do the check twice, since first you check and return if it errors and then you do the check again, but panic if the result isn't Ok. It would be better if there is only one check and no result discarding.
The above we have to do regardless of the signature, so I don't think your
Result<impl Init<Self>>suggestion would function any differently than the PR'simpl Init<Self, E>?I actually did start with
Result<impl Init<Self>>but than had to convert toimpl Init<Self, E>because somehow I had use cases where the latter was more ergonomic. I'll check this again.
That depends on when you know that the initializer will fail. If you can do the check without having the resulting memory on hand, then you can use Result<impl Init<Self>>. Otherwise you need that the initializer returns an error.
sidenote: the
FromTLV::init_from_tlvfunction is probably not working as expected:fn init_from_tlv(element: TLVElement<'a>) -> impl init::Init<Self, Error> { unsafe { init::init_from_closure(move |slot| { core::ptr::write(slot, Self::from_tlv(&element)?); Ok(()) }) } }The in-place initializer will overflow the stack, if the size of
Selfis too big. That's because the call toSelf::from_tlvputs a temporary value on the stack.Agree completely! The thing is, this implementation is only a default one.
FromTLVimplementors that are potentially large, likeVec,Maybe- my home-grown in-place-initable version ofOption,OctetsOwnedand all proc-macro generatedFromTLVimplementations do override it with a memory-efficient and correct variant (I hope).Also, this function could take
elementby reference instead of by value, then there would be no need to clone it inupdate_from_tlv.
That's what I thought. It might be easier to only have one initialize/new function though (note that any T implements impl Init<T>, so you can just return the value directly).
I would turn it the other way around:
update_from_tlvshould take the element by value rather than by reference. The thing is, the (anonymous) lifetime'tin&'t TLVElement<'a>in our code is often shorter than'aand then the initializers cannot outlive it, but very often in our codebase they have to, because the initialization often gets delayed until after'tis no longer valid.
Ah yeah, then you have to take it by value.
In any case, the point is moot because
TLVElement<'a>is just arepr(transparent)for&'a [u8]so I could just as well implementCopyfor it. :-)
Oh! Yeah probably makes sense to only use it by-value then and also implement copy :)
Thanks again for taking the time to review - let me look at the inline comments as well now...
Glad I could help :)
I took a quick look at the use case of
Apply::applyand what I find a bit odd, is how theFromTLV::update_from_tlvfunction is implemented. The whole "check if it is valid and then convert the fallible initializer to an infallible one" seems very strange. Maybe the correct return type ofinit_from_tlvshould actually beResult<impl Init<Self>>?Regardless which signature is used, the "check if it is valid before trying to init" is unavoidable. What is also unavoidable, is that if the initialization (after the check is already done) fails mid-way (it should not if the check is idempotent but then you never know) we can only turn the error into a panic and then any unwinding panic into a hard abort.
I meant that you do the check twice, since first you check and return if it errors and then you do the check again, but panic if the result isn't
Ok. It would be better if there is only one check and no result discarding.
I'll address only ^^^ as it is the most important comment of all.
I don't think a single check in T::update_from_tlv is possible at all.
Suppose I remove the call to Self::check_from_tlv(element)?; and suppose that a concrete TLV stream inside TLVElement happens to be erroneous (invalid).
What would then happen when I call Self::init_from_tlv(element.clone()).into_infallible().apply(self); is that it will panic, because the TLV stream contains errors.
But we don't want that!
We want the update to fail if the TLV stream contains errors. OK then, can I just NOT convert the initializer to "infallible" and return the error?
And I think the answer is no.
The reason why the answer is no is because if the stream is erroneous - in the absence of our initial "pre-check" of the stream, the parsing WILL fail mid-way while we are initializing the slot. No problem you would say - the initializer will drop in place whatever was already initialized. Sure, but when calling update_from_tlv, we are not initializing a MaybeUninit! We are (re-)initializing an already initialized memory. That's by the way the reason why we need the unsafe drop_in_place before the (re)initialization.
So where I'm going with that is that if we get a failure during the initialization, that should absolutely be an immediate abort, or else - even if your initializer is cleaning up the half-initialized memory behind itself - the rest of the code does not know that we have called drop_in_place and it will STILL try to free the slot in case we don't abort. And that would be a double-free because of our explicit drop_in_place.