Why `IndexTupleData` size is 2
Hi, I noticed when you compute the b-tree bloat, the IndexTupleData size is 2. But I couldn't understand why the size of t_tid + t_info is 2 after I read the source code the 35th line in src/include/access/itup.h?
your sql:
CASE WHEN max(coalesce(s.stanullfrac,0)) = 0
THEN 2 -- IndexTupleData size
ELSE 2 + (( 32 + 8 - 1 ) / 8) -- IndexTupleData size + IndexAttributeBitMapData size ( max num filed per index + 8 - 1 /8)
END AS index_tuple_hdr_bm,
the source code:
typedef struct IndexTupleData
{
ItemPointerData t_tid; /* reference TID to heap tuple */
/* ---------------
* t_info is laid out in the following fashion:
*
* 15th (high) bit: has nulls
* 14th bit: has var-width attributes
* 13th bit: AM-defined meaning
* 12-0 bit: size of tuple
* ---------------
*/
unsigned short t_info; /* various info about tuple*/
} IndexTupleData; /* MORE DATA FOLLOWS AT END OF STRUCT */
Hi @shangzixie,
Right, it seems t_tid is missing here.
According to the source code, it adds 3 bytes to the tuple header:
// From src/include/storage/itemptr.h
/*
* ItemPointer:
*
* [...] The struct is designed to be six bytes long
* (it contains three int16 fields) but a few compilers will pad it to
* eight bytes unless coerced. [...]
*/
typedef struct ItemPointerData
{
BlockIdData ip_blkid;
OffsetNumber ip_posid;
}
// From src/include/storage/off.h
typedef uint16 OffsetNumber;
// From src/include/storage/block.h
typedef struct BlockIdData
{
uint16 bi_hi;
uint16 bi_lo;
} BlockIdData;
If this is correct, the exact formula should be:
CASE WHEN max(coalesce(s.stanullfrac,0)) = 0
THEN 5 -- IndexTupleData size
ELSE 5 + (( 32 + 8 - 1 ) / 8) -- IndexTupleData size + IndexAttributeBitMapData size ( max num filed per index + 8 - 1 /8)
END AS index_tuple_hdr_bm,
Because of the alignment padding, I suspect this would "fix" the estimation for index with nullable values or on 32 bits archs.
Did you find this just while studying the query, or do you have a test-case showcasing a bad estimation so we can compare with this fix?
Good catch anyway, thanks!
yes @ioguix , I just studied the query recently.
I don't know if you could add 6 bytes rather than 5 bytes. Maybe the t_tid is 6 bytes?
/*
* ItemPointer:
*
* This is a pointer to an item within a disk page of a known file
* (for example, a cross-link from an index to its parent table).
* ip_blkid tells us which block, ip_posid tells us which entry in
* the linp (ItemIdData) array we want.
*
* Note: because there is an item pointer in each tuple header and index
* tuple header on disk, it's very important not to waste space with
* structure padding bytes. The struct is designed to be six bytes long
* (it contains three int16 fields) but a few compilers will pad it to
* eight bytes unless coerced. We apply appropriate persuasion where
* possible. If your compiler can't be made to play along, you'll waste
* lots of space.
*/
typedef struct ItemPointerData
{
BlockIdData ip_blkid; // this is 32 bits
OffsetNumber ip_posid; // this is 16 bits
}
I don't know if you could add 6 bytes rather than 5 bytes. Maybe the t_tid is 6 bytes?
Pfff, obviously, my bad, I've been heedless... Worth thing is I actually read this comment about the struct being 6 bytes :sweat:
lol, Thanks for reply ! !
I keep this issue open, I'll close it once fixed.
Thank you!