celestia-node icon indicating copy to clipboard operation
celestia-node copied to clipboard

refactor!(share/shwap): rework range namespace data

Open vgonkivs opened this issue 5 months ago • 2 comments

This PR refactors the range namespace data implementation, optimizing how proofs are handled for incomplete rows in namespace data retrieval. The changes focus on reducing redundant proof data.

closes https://github.com/celestiaorg/celestia-node/issues/4339

vgonkivs avatar Jun 13 '25 10:06 vgonkivs

Codecov Report

Attention: Patch coverage is 28.09224% with 343 lines in your changes missing coverage. Please review.

Project coverage is 35.68%. Comparing base (2469e7a) to head (4e5b378). Report is 522 commits behind head on main.

Files with missing lines Patch % Lines
share/shwap/range_namespace_data.go 51.83% 70 Missing and 22 partials :warning:
nodebuilder/share/get_range_result.go 0.00% 83 Missing :warning:
share/shwap/pb/shwap.pb.go 0.00% 58 Missing :warning:
share/eds/rsmt2d.go 0.00% 18 Missing :warning:
store/file/ods.go 0.00% 16 Missing :warning:
share/eds/validation.go 0.00% 13 Missing :warning:
share/eds/testing.go 20.00% 11 Missing and 1 partial :warning:
nodebuilder/share/share.go 0.00% 10 Missing :warning:
...re/shwap/p2p/bitswap/range_namespace_data_block.go 50.00% 4 Missing and 6 partials :warning:
share/shwap/getters/cascade.go 0.00% 9 Missing :warning:
... and 9 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4325      +/-   ##
==========================================
- Coverage   44.83%   35.68%   -9.15%     
==========================================
  Files         265      322      +57     
  Lines       14620    24631   +10011     
==========================================
+ Hits         6555     8790    +2235     
- Misses       7313    14885    +7572     
- Partials      752      956     +204     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov-commenter avatar Jun 13 '25 11:06 codecov-commenter

Thanks @renaynay for the deep review.

Addressing some of your comments.

side note but the name of the container now doesn't make sense as we've omitted ns -- since we are breaking anyway, should we shorten it? RangeData

The range must be under a single namespace anyway. We can't serve multi namespaces range.

If we are doing coords-based range only, then lets change method names to omit Namespace as it’s confusing for user. We can do it, yes. But we will have to add notes everywhere that requested range must be under the same namespace. Also, the API-level naming is GetRange(omits Namespace). if you think that the users will interact with range data on the p2p level(getters, accessors) and it will confuse them, then sure I'll rename it.

Should we add opts to range request at API level so we can add WithProofOnly later without breaking?

Do you mean on Getters/Accessors level? We will anyway break p2p level. All requests require specific -ID struct (SampleID, RangeID, NamespaceDataID) that is sent over the bitswap/shrex, so by adding proofsOnly functionality we will anyway break p2p(the structs have constant sizes and they are part of the sanity checks). Please note that we are planing to add proofsOnly for all shwap requests.

What is an “incomplete row proof” ? get_range_result.go

incomplete row proof is a proof for that range that does not span across the whole Row. Incomplete means we don't have enough shares (require subtree roots) to compute the RowRoot. For the complete Rows: len(shares) == odsSize, we don't need this proofs as we can rebuild it on the client side using shares.

And does it apply to EVERY start index / end index in the row?

Not sure if I understood your question, but it applies to the first and the last row of the range only. Since all range are under the same namespace, all intermediate row will be complete and we can use shares to compute the RowRoots.

What if I range request one row (completely)?


Both First/Last will be empty.

Where is VerifyNamespace used? Can we simplify VerifyShares then and reduce methods in this file? VerifyInclusion and VerifyNamespace can call verify inclusion with additional namespace completeness check in the future if we need it. Very confusing, all the diff methods with diff naming - hard to follow quickly what’s doing what

I have added it because I want to rework NamespaceDataContainer to become an alias of RangeNamespaceData and it will be necessary to verify the completeness of the namespace in the row. Can remove it if confusing.

What Is the point of isSingleRow ? just use isMultiRow or just use one of them lol

This part was done by Ganesh and it looks elegantly, so I decided to keep it. But you are right, will simplify a bit :slightly

vgonkivs avatar Jun 19 '25 10:06 vgonkivs