bustub icon indicating copy to clipboard operation
bustub copied to clipboard

bug: misues of INTERNAL_PAGE_SIZE in BPlusTreeInternalPage

Open Roscky opened this issue 1 year ago • 5 comments

The definition of INTERNAL_PAGE_SIZE in BPlusTreeInternalPage is correct, but passing INTERNAL_PAGE_SIZE directly in b_plus_tree.h is incorrect because ValueType in BPlusTree is not the same as the ValueType corresponding to BPlusTreeInternalPage. Therefore, the macro definition of INTERNAL_PAGE_SIZE here leads to incorrect calculation results, which can cause wasted space in internal nodes.

The macro definition of INTERNAL_PAGE_SIZE in b_plus_internal_tree.h.

#define INTERNAL_PAGE_SIZE ((BUSTUB_PAGE_SIZE - INTERNAL_PAGE_HEADER_SIZE) / (sizeof(MappingType)))

INTERNAL_PAGE_SIZE is used in b_plus_tree.h

explicit BPlusTree(std::string name, page_id_t header_page_id, BufferPoolManager *buffer_pool_manager,
                     const KeyComparator &comparator, int leaf_max_size = LEAF_PAGE_SIZE,
                     int internal_max_size = INTERNAL_PAGE_SIZE);

Since the ValueType of MappingType in b_plus_tree.h is RID, which is incorrect for InternalPage, the INTERNAL_PAGE_SIZE will be miscalculated.

Roscky avatar Apr 16 '23 11:04 Roscky

I think it doesn't cause the situation.

The macro INTERNAL_PAGE_SIZE is defined in b_plus_internal_tree.h.

// valuetype for internalNode should be page id_t
template class BPlusTreeInternalPage<GenericKey<4>, page_id_t, GenericComparator<4>>;
template class BPlusTreeInternalPage<GenericKey<8>, page_id_t, GenericComparator<8>>;

But the template on b_plust_tree.h use the page_id_t type and ValueType is setted as page_id_t. So INTERNAL_PAGE_SIZE can be calculated correctly.

And internal_max_size_ and leaf_max_size_ will be setted based on page type during btree NewPage in b_plus_tree.h. btw, macros are indeed easily misunderstood.

infdahai avatar Apr 17 '23 14:04 infdahai

[bustub-private/src/storage/index/b_plus_tree.cpp:22:BPlusTree] INFO  - Value_size:8
PageId_size:4
INTERNAL_PAGE_SIZE:254

[bustub-private/src/storage/page/b_plus_tree_leaf_page.cpp:38:Init] INFO  - Value_size:8
Mapping_size:16
LEAF_PAGE_SIZE:254

[bustub-private/src/storage/page/b_plus_tree_internal_page.cpp:37:Init] INFO  - Value_size:4
Mapping_size:12
INTERNAL_PAGE_SIZE:339

I add print statements and find that INTERNAL_PAGE_SIZE is right in b_plus_tree_internal_page.cpp but wrong in b_plus_tree.h . so the error occurs that causes INTERNAL_PAGE_SIZE is same to LEAF_PAGE_SIZE.

we can try to fix this in #517 and #547

infdahai avatar Apr 18 '23 08:04 infdahai

Actually 254 should be the correct max size, because we need to align MappingType. The computation is wrong, but accidentally gets the correct result.

skyzh avatar Apr 18 '23 12:04 skyzh

When the KeyType takes 8 bytes, the sizeof(MappingType) == 16 is correct. However, when the KeyType takes 1/2/4 bytes, the MappingType of internal node should align 4 bytes since the sizeof(KeyType) <= 4 and the sizeof(ValueType) == 4, so the sizeof(MappingType) == 8 and the internal_max_size should be more than 254.

Roscky avatar Apr 19 '23 03:04 Roscky

Yes, that’s something we should fix.

skyzh avatar Apr 19 '23 04:04 skyzh