web3.storage
Inaccurate claimed DAG sizes
DAG sizes for some upload types are reported incorrectly (smaller than the actual size). The delta between size_claimed and size_actual can get quite large.
This is a sister issue to https://github.com/nftstorage/nft.storage/issues/1427, but it is higher priority here, given that we are tracking upload sizes for account limits.
@mbommerez going to add this to the shortlist! It somewhat blocks the account limit restrictions.
Investigation from @flea89 in NFT #1427:
I haven't gone too deep yet, but I'll start sharing my initial investigation results (in web3.storage):
Data sample: 6484 CIDs.
- Most of the "problematic" CIDs are cbor ones (5392 out of 6484).
- For pb ones, 50 out of the 50 problematic CIDs I've checked are directories.
Looking at the code, possible roots of the problem are:
- For codec pb we rely on metadata to calculate size, which could be deliberately changed.
- In carStat we're calculating size for codec PB and raw (with one block), and not CBOR.
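The two suspected causes above can be illustrated with a small sketch. This is a hypothetical model, not the actual carStat code: the function names `claimed_size` and `actual_size`, the `pb_declared_size` parameter, and the representation of blocks as plain byte strings are all assumptions made for illustration. Only the multicodec codes are real values.

```python
from typing import List, Optional

RAW = 0x55       # multicodec code for raw
DAG_PB = 0x70    # multicodec code for dag-pb
DAG_CBOR = 0x71  # multicodec code for dag-cbor


def claimed_size(root_codec: int, blocks: List[bytes],
                 pb_declared_size: Optional[int] = None) -> Optional[int]:
    """Hypothetical size check that only covers dag-pb and single-block raw."""
    if root_codec == DAG_PB:
        # Trusts a number carried in the node's metadata; nothing verifies it.
        return pb_declared_size
    if root_codec == RAW and len(blocks) == 1:
        return len(blocks[0])
    # dag-cbor (and anything else) falls through with no size at all.
    return None


def actual_size(blocks: List[bytes]) -> int:
    """Sum of the byte lengths of all blocks in the upload."""
    return sum(len(b) for b in blocks)


# A dag-cbor upload gets no claimed size in this model.
cbor_blocks = [b"\x00" * 120, b"\x00" * 4000]
print(claimed_size(DAG_CBOR, cbor_blocks), actual_size(cbor_blocks))

# A dag-pb upload reports whatever the metadata says, even if it is wrong.
pb_blocks = [b"\x00" * 60, b"\x00" * 900]
print(claimed_size(DAG_PB, pb_blocks, pb_declared_size=10), actual_size(pb_blocks))
```

In this model, a size derived from the bytes actually received would close both gaps, at the cost of having to walk every block.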
CBOR dags
Given the size calculation doesn't happen in .storage AFAICT, I wonder if size_claimed is populated wrongly for those CIDs in cargo? But I haven't had time to look there yet.
From a quick look, I suspect size_claimed stores the size of the first block rather than the whole DAG.
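To make the "first block vs. whole DAG" suspicion concrete, here is a minimal sketch of the difference. It is not how cargo computes anything: the DAG is modelled as two plain dictionaries (`blocks` for bytes, `links` for child CIDs), and the CID strings and the `dag_size` helper are made up for illustration.

```python
from typing import Dict, List


def dag_size(root: str, blocks: Dict[str, bytes],
             links: Dict[str, List[str]]) -> int:
    """Sum the byte lengths of every unique block reachable from root."""
    seen = set()
    stack = [root]
    total = 0
    while stack:
        cid = stack.pop()
        if cid in seen:
            continue  # count each block only once
        seen.add(cid)
        total += len(blocks[cid])
        stack.extend(links.get(cid, []))
    return total


# Hypothetical three-block DAG: a small root pointing at two larger children.
blocks = {"root": b"\x00" * 40, "a": b"\x00" * 300, "b": b"\x00" * 200}
links = {"root": ["a", "b"]}

first_block_only = len(blocks["root"])       # what a first-block-only size would store
whole_dag = dag_size("root", blocks, links)  # what the full traversal yields
print(first_block_only, whole_dag)
```

If size_claimed really holds the first number, the gap grows with the DAG, which would be consistent with the large deltas seen on CBOR uploads.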
PB directories
I just quickly checked a couple of CIDs, and in this case we're actually reporting a bigger size in .storage, e.g.:
CID: bafybeiduwb4o2fsl2lbmuyigzhjdpluahrexjpd7edlilyl5wmz332vnyq
public.content.size = 715
cargo.dag.size_actual = 690
> ipfs dag stat /ipfs/bafybeiduwb4o2fsl2lbmuyigzhjdpluahrexjpd7edlilyl5wmz332vnyq
> Size: 690, NumBlocks: 7
> ipfs files stat /ipfs/bafybeiduwb4o2fsl2lbmuyigzhjdpluahrexjpd7edlilyl5wmz332vnyq
> bafybeiduwb4o2fsl2lbmuyigzhjdpluahrexjpd7edlilyl5wmz332vnyq
> Size: 0
> CumulativeSize: 715
> ChildBlocks: 1
> Type: directory
I haven't yet checked why the two report different sizes (is it UnixFS headers or a bug?), but I'm sure you know @alanshaw.
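One possible explanation, not verified against this CID: as far as I understand, a dag-pb node's cumulative size is its own encoded length plus the sizes recorded on its links, so a block linked more than once is counted more than once, whereas a traversal that visits each block once counts it once. The sketch below shows only that mechanism. The node names and sizes are invented and are not meant to reproduce the 715 vs 690 figures.

```python
from typing import Dict, List


def cumulative_size(cid: str, block_len: Dict[str, int],
                    links: Dict[str, List[str]]) -> int:
    """Own block length plus the cumulative size of every link, duplicates included."""
    return block_len[cid] + sum(
        cumulative_size(child, block_len, links) for child in links.get(cid, [])
    )


def deduplicated_size(root: str, block_len: Dict[str, int],
                      links: Dict[str, List[str]]) -> int:
    """Total length of the unique blocks reachable from root."""
    seen = set()
    stack = [root]
    total = 0
    while stack:
        cid = stack.pop()
        if cid in seen:
            continue
        seen.add(cid)
        total += block_len[cid]
        stack.extend(links.get(cid, []))
    return total


# Hypothetical directory that links the same file block under two names.
block_len = {"dir": 60, "file": 100}
links = {"dir": ["file", "file"]}

print(cumulative_size("dir", block_len, links))    # duplicates counted twice
print(deduplicated_size("dir", block_len, links))  # each block counted once
```

If this is what is happening, neither number is a bug as such; they simply answer different questions, and the one used for account limits needs to be chosen deliberately.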
@alanshaw can you run a query in prod where you use the dag size from public.content, to make sure this is really a problem for .storage?
Looks like the PRs are merged and deployed! Closing this issue now. Welcome to reopen if we need to!