bookkeeper icon indicating copy to clipboard operation
bookkeeper copied to clipboard

LedgerEntry#getLength does not do what the documentation says

Open ivankelly opened this issue 7 years ago • 12 comments

You would expect getLength on a LedgerEntry to return the length of the data in the entry. The documentation says it does this, and the Mock implementation do so too. But this is not what it does. It in fact returns the length of the ledger up to and including this entry.

At the very least this is a documentation bug, but this name is super confusing even if the documentation is updated.

ivankelly avatar May 17 '18 14:05 ivankelly

good catch

eolivelli avatar May 17 '18 14:05 eolivelli

for BC consideration, we can fix the documentation first.

We can consider deprecating getLength later, introducing getEntrySize to return the size of the entry and getOffset to return the offset of the entry in a ledger. entrySize + offset == getLength here.

Offset would be aligning with the proposal in #1376

sijie avatar May 17 '18 15:05 sijie

there shouldn't be a BC consideration for the new API

ivankelly avatar May 17 '18 16:05 ivankelly

@ivankelly okay you are talking about the new API. for new API it is fine. I was talking about the old API. http://bookkeeper.apache.org/docs/latest/api/javadoc/org/apache/bookkeeper/client/LedgerEntry.html#getLength--

sijie avatar May 17 '18 16:05 sijie

Ya, old API can stay as it is. Maybe deprecate it. I seem to remember it being very broken at some point in any case, and noone complained.

ivankelly avatar May 17 '18 16:05 ivankelly

+1 for fixing only the new API and deprecate old API method

eolivelli avatar May 17 '18 16:05 eolivelli

I seem to remember it being very broken at some point in any case, and noone complained.

why? it is not broken, we used that at twitter to know how many bytes a reader is lagging behind a writer, by comparing the lengths of two entries.

sijie avatar May 17 '18 17:05 sijie

let's submit a BP and fix the new API and deprecate the method in the old API.

sijie avatar May 17 '18 17:05 sijie

@sijie This one (https://issues.apache.org/jira/browse/BOOKKEEPER-673). In my head it was more recent than almost 5 years ago. Getting old I guess.

ivankelly avatar May 17 '18 21:05 ivankelly

@ivankelly I see. but that's more about consistency between metadata and the information in entries. I don't know a direct usage of length in metadata, but the information in entries is more useful.

sijie avatar May 17 '18 23:05 sijie

Pulsar tiered storage usage uses the length in the metadata.

ivankelly avatar May 18 '18 07:05 ivankelly

I've also encountered this problem in a BookKeeper test. It seems to still exist in the latest version of BookKeeper. Currently, the LedgerEntry#getLength API is used in several BookKeeper shell commands to print the entry size, as well as in a Pulsar indicator called pulsar_ml_cursor_readLedgerSize. Should I first discuss this on the mailing list before attempting to fix it? @eolivelli @sijie

Shawyeok avatar Sep 01 '23 10:09 Shawyeok