Own implementation of Bech32::u5 type, upgrade of bech32 dependency
Own implementation of u5 type and conversions.
Upgrade of bech32 dependency, from 0.9.1 to 0.11.0.
Fixes #3176 and #3195 . Obsoletes #3181 .
Changes summary:
- Introduced
lightning::util::bech32::u5type, for holding a 32-bit value, taken over frombech32 v0.9.1(This type is no longer available in bech32` crate) - It has simple type conversions (from/to
u8), Bech32 character encoding/decoding (from/tochar), ... - ... unpacking from packed bytes and packing to bytes (iterative), traits for conversions.
- External crate
bech32is still used directly in a few places; it has been upgraded to0.11.0.
More details in #3195 .
Potential further improvements:
- Encapsulate checksum functionality into
util/bech32.rs, by calling into external bech32 implementation - Add HRP decoding/encoding functionality to
bech32/mod.rs, by calling into external crate, or better, by own implementation (simple string parsing / concatenation) - Move tagged parsing logic to
bech32/mod.rs - Add test vectors from lightning invoice spec (BOLT11)
- Isolate bech32 functionality into a standalone crate (e.g.
bech32-ligthning). - Remove dependency to external crate completely
TODO:
- [x] fix
fuzztarget - [x] fmt
- [ ] squash
Codecov Report
Attention: Patch coverage is 94.52888% with 18 lines in your changes missing coverage. Please review.
Project coverage is 91.36%. Comparing base (
78c0eaa) to head (c8279fe). Report is 206 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #3201 +/- ##
==========================================
+ Coverage 89.80% 91.36% +1.56%
==========================================
Files 121 123 +2
Lines 100045 113913 +13868
Branches 100045 113913 +13868
==========================================
+ Hits 89845 104078 +14233
+ Misses 7533 7289 -244
+ Partials 2667 2546 -121
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Also, do you think this could possibly be split up into two commits? First the introduction of the new type, and then switching to it? Or would it cause too much havoc with check_commits and needing to add and then remove allow_unused?
Also, do you think this could possibly be split up into two commits? First the introduction of the new type, and then switching to it? Or would it cause too much havoc with check_commits and needing to add and then remove allow_unused?
It is possible to split up into two commits, it helps review, but check_commits will probably complain, and it should be squashed before merging. Or did you have in mind 2 separate PRs? Also, what would be the benefit? Since the addition is in a new file, it is quite easy to separate during review.
It is possible to split up into two commits, ...
Actually, it's not really possible to separate the external upgrade and the new own implementation:
- if external crate is upgraded, code has to be adopted throughout, which does require the new own implementation
- if the old version of the external crate is kept, the new implementation has to be adapted to the new version, which does not make much sense.
I'm a little confused about the motivation for this PR. Upstream you complained that Fe32 did not have Ord or Default, but if I remove those impls from your u5 nothing fails to compile in lightning except your test for Default being zero. It appears that you use these impls in lightning-invoice within ser.rs in a couple places:
One is to implement extend_from_stretch generically over T: Default. But actually you only call this function with u8 and u5, and when you call it with u5 the semantics are slightly different (in that you have this Optional return but you always call .expect on it). A short-diff fix for this is
diff --git a/lightning-invoice/src/ser.rs b/lightning-invoice/src/ser.rs
index 7317b5cc3..4e9bce317 100644
--- a/lightning-invoice/src/ser.rs
+++ b/lightning-invoice/src/ser.rs
@@ -177,6 +177,22 @@ impl Display for SiPrefix {
}
}
+fn encode_int_be_base32_fixed(int: u64, size: usize) -> Vec<u5> {
+ let base = 32u64;
+
+ let mut out_vec = Vec::<u5>::new();
+
+ let mut rem_int = int;
+ for _ in 0..size {
+ out_vec.push(u5::try_from_u8((rem_int % base) as u8).expect("always <32"));
+ rem_int /= base;
+ }
+ assert_eq!(rem_int, 0, "input {} did not fit into {} u5 words (remainder {})", int, size, rem_int);
+
+ out_vec.reverse();
+ out_vec
+}
+
fn encode_int_be_base32(int: u64) -> Vec<u5> {
let base = 32u64;
@@ -250,10 +266,7 @@ impl ToBase32 for RawDataPart {
impl ToBase32 for PositiveTimestamp {
fn write_base32<W: WriteBase32>(&self, writer: &mut W) -> Result<(), <W as WriteBase32>::Err> {
// FIXME: use writer for int encoding
- writer.write(
- &try_stretch(encode_int_be_base32(self.as_unix_timestamp()), 7)
- .expect("Can't be longer due than 7 u5s due to timestamp bounds")
- )
+ writer.write(&encode_int_be_base32_fixed(self.as_unix_timestamp() , 7))
}
}
@@ -415,10 +428,12 @@ impl ToBase32 for TaggedField {
assert!(len < 1024, "Every tagged field data can be at most 1023 bytes long.");
writer.write_u5(u5::try_from_u8(tag).expect("invalid tag, not in 0..32"))?;
- writer.write(&try_stretch(
- encode_int_be_base32(len as u64),
- 2
- ).expect("Can't be longer than 2, see assert above."))?;
+ writer.write(&encode_int_be_base32_fixed(len as u64, 2))?;
+
+ writer.write(&[
+ u5::try_from_u8((len / 32) as u8).unwrap(),
+ u5::try_from_u8((len % 32) as u8).unwrap(),
+ ])?;
payload.write_base32(writer)
}
But of course a proper fix would stop using Vec unnecessarily and get rid of these unwraps. For example you could use the upstream ByteIterExt trait then call uint64.to_be_bytes().iter().copied().bytes_to_fes() and then copy the output of that into a fixed-size array. (Or just collect if you want to keep the Vec and address it later, which might be easier to review.)
Then for Ord, it seems this is needed to derive Ord for RawTaggedField. But this is a two-variant enum, and while annoying, it is very easy to manually impl Ord for this:
diff --git a/lightning-invoice/src/lib.rs b/lightning-invoice/src/lib.rs
index 063ae8f40..e6fc5b1bc 100644
--- a/lightning-invoice/src/lib.rs
+++ b/lightning-invoice/src/lib.rs
@@ -419,7 +419,7 @@ impl From<Currency> for Network {
/// Tagged field which may have an unknown tag
///
/// This is not exported to bindings users as we don't currently support TaggedField
-#[derive(Clone, Debug, Hash, Eq, PartialEq, Ord, PartialOrd)]
+#[derive(Clone, Debug, Hash, Eq, PartialEq)]
pub enum RawTaggedField {
/// Parsed tagged field with known tag
KnownSemantics(TaggedField),
@@ -427,6 +427,23 @@ pub enum RawTaggedField {
UnknownSemantics(Vec<u5>),
}
+impl PartialOrd for RawTaggedField {
+ fn partial_cmp(&self, other: &Self) -> Option<core::cmp::Ordering> {
+ Some(self.cmp(other))
+ }
+}
+
+impl Ord for RawTaggedField {
+ fn cmp(&self, other: &Self) -> core::cmp::Ordering {
+ match (self, other) {
+ (RawTaggedField::KnownSemantics(..), RawTaggedField::UnknownSemantics(..)) => core::cmp::Ordering::Less,
+ (RawTaggedField::UnknownSemantics(..), RawTaggedField::KnownSemantics(..)) => core::cmp::Ordering::Greater,
+ (RawTaggedField::KnownSemantics(ref a), RawTaggedField::KnownSemantics(ref b)) => a.cmp(b),
+ (RawTaggedField::UnknownSemantics(ref a), RawTaggedField::UnknownSemantics(ref b)) => a.iter().map(|a| a.as_u8()).cmp(b.iter().map(|b| b.as_u8())),
+ }
+ }
+}
+
/// Tagged field with known tag
///
/// For descriptions of the enum values please refer to the enclosed type's docs.
And both of these changes could be done in commits independent of and prior to any bech32 changes, meaning that their review can be isolated from the rest of your changes.
Also, even if you did want to create your own u5 type, why reimplement things like conversion of u5strings to bytestrings when those exist upstream and can be easily implemented in terms of Fe32? Similar for the char conversions.
Thanks for looking into this in detail!
You are right that the Default requirement could worked around, but I think the Default trait makes sense for a numerical value, and it doesn't degrade code quality.
For Ord: correct, it is needed for the tagged values in lightning invoices, for the unparsed parts (UnknownSemantics). Ord and PartialOrd could be implmented as you suggest, but even in your implementation it converts to u8 and does the comparison. I think it's just simpler if u5 natively support it. (My -- now discarded -- solution was to wrap Fe32 in a struct with Ord implemented, and use that inside UnknownSemantics.)
If my goal was to keep using Fe32 these suggestions would make it possible.
Also, even if you did want to create your own u5 type, why reimplement things like conversion of u5strings to bytestrings when those exist upstream and can be easily implemented in terms of Fe32? Similar for the char conversions.
You are right, and I had a version like that. However, after weighing the conversions and adaptations needed vs. the code/logic to be duplicated, I arrived at the conclusion that it's easier to essentially duplicate some logic in the way suitable for the LDK project. Lightning invoice parsing is a basic operation for LDK, and I think it makes sense to have a well separated, minimal dependency module within LDK for lightning parsing, I'd like to move into that direction. In general I try to achieve high code reuse, but in this case maybe it's more practical to go towards more self-sufficient solution with no external dependency.
I have written up my concerns in the issue https://github.com/lightningdevkit/rust-lightning/issues/3195 .
But I'm listening to your opinion as well. I may revisit the version without code duplication and check the shortcoming again ( #3181 )
For Ord: correct, it is needed for the tagged values in lightning invoices, for the unparsed parts (
UnknownSemantics). Ord and PartialOrd could be implmented as you suggest, but even in your implementation it converts tou8and does the comparison. I think it's just simpler ifu5natively support it.
My suggestion does not convert to u8 anywhere.
And no, it is not "simpler" to implement a nonsensical ordering on a field element when there is a standard notion of "ordered field" which does not apply to the field in question.
this case maybe it's more practical to go towards more self-sufficient solution with no external dependency.
You intend to write your own bech32 checksum encoding and verification code? I will switch to the other issue to avoid splitting the conversation but this is a dramatic failure of the bech32 API if it is unable to do basic checksumming.
Put back to daft for now, depends on #3234, and it is still open whether to go the route of #3181 .
Pivoted back to reusing rust-bech32 as much as possible, now in #3244 (following #3234 ).