velox icon indicating copy to clipboard operation
velox copied to clipboard

Fix the problem where buildSelectedType() makes the input type invalid

Open nmahadevuni opened this issue 11 months ago • 16 comments

buildSelectedType() takes a constant typeWithId type tree and applies the selector on it to return a selected new type tree. However, it modifies the input typeWithId unexpectedly because it duplicates the internal type nodes, but not the leaves(primitive types). The leaf nodes were directly moved to the result instead of copied, thus making in the original input typeWithId incomplete.

std::shared_ptr<const TypeWithId> typeutils::buildSelectedType(
const std::shared_ptr<const TypeWithId>& typeWithId,
const std::function<bool(size_t)>& selector)  {
  return visit(typeWithId, selector);
}
std::shared_ptr<const TypeWithId> visit(
    const std::shared_ptr<const TypeWithId>& typeWithId,
    const std::function<bool(size_t)>& selector) {
  if (typeWithId->type()->isPrimitiveType()) {
    return typeWithId;   // directly returned
  }
  if (typeWithId->type()->isRow()) {
    std::vector<std::string> names;
    std::vector<std::shared_ptr<const TypeWithId>> typesWithId;  // constructing a new children vector
    std::vector<std::shared_ptr<const Type>> types;
    auto& row = typeWithId->type()->asRow();
    for (auto i = 0; i < typeWithId->size(); ++i) {
      auto& child = typeWithId->childAt(i);
      if (selector(child->id())) {
        names.push_back(row.nameOf(i));
        std::shared_ptr<const TypeWithId> twid;
        twid = visit(child, selector);   // twid was pointing to the original typeWithId leaf
        typesWithId.push_back(twid);   // twid pushed to the new vector typesWithId
        types.push_back(twid->type());
      }
    }
    VELOX_USER_CHECK(
        !types.empty(), "selected nothing from row: " + row.toString());
    return std::make_shared<TypeWithId>(.   // Creating a new type node
        ROW(std::move(names), std::move(types)),
        std::move(typesWithId),      // But with the children vector that contains the original leaves. And the original ones are now nullptr.
        typeWithId->id(),
        typeWithId->maxId(),
        typeWithId->column());

This PR fixes the problem by making a copy of the leaf nodes as well. After this change, the original input typeWithId would still be valid.

nmahadevuni avatar Mar 06 '24 06:03 nmahadevuni

Deploy Preview for meta-velox canceled.

Name Link
Latest commit f8d7e55ea6fed86b77e0eb75f416981a64313d56
Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/660e910999761d0008e789ee

netlify[bot] avatar Mar 06 '24 06:03 netlify[bot]

@nmahadevuni Where exactly is the leaf node changed? Just sharing the leaf nodes won't make the input type invalid. We should copy only when we mutate the leaf node, not when we project them.

Yuhta avatar Mar 11 '24 15:03 Yuhta

Where exactly is the leaf node changed? Just sharing the leaf nodes won't make the input type invalid. We should copy only when we mutate the leaf node, not when we project them.

The leaf node was moved away from the original type tree to the selected type tree. Please see my explanation in the PR message.

yingsu00 avatar Mar 12 '24 15:03 yingsu00

@yingsu00 Which line is it?

Yuhta avatar Mar 12 '24 16:03 Yuhta

@Y

@nmahadevuni Where exactly is the leaf node changed? Just sharing the leaf nodes won't make the input type invalid. We should copy only when we mutate the leaf node, not when we project them.

@Yuhta Please see the comments in the below code.

return std::make_shared<TypeWithId>(.   // Creating a new type node
        ROW(std::move(names), std::move(types)),
        std::move(typesWithId),      // But with the children vector that contains the original leaves. And the original ones are now nullptr.
        typeWithId->id(),
        typeWithId->maxId(),
        typeWithId->column());

The bottom line is that after buildSelectedType() is called, the original input typeWithId should still be valid. But since std::move was called, the leaf nodes in the original input typeWithId tree became nullptr.

yingsu00 avatar Mar 12 '24 16:03 yingsu00

@yingsu00 These moves are called on local variables only, not reference to the original type tree

Yuhta avatar Mar 12 '24 16:03 Yuhta

@yingsu00 These moves are called on local variables only, not reference to the original type tree

The input is modified when we assign the parent_ in TypeWithId's constructor, the primitiveType's parent is changed to the newly created parent in the visit() method. And so the input schema is also getting modified since the primitiveType is not copied.

const_cast<const TypeWithId*&>(child->parent_) = this;

nmahadevuni avatar Mar 13 '24 09:03 nmahadevuni

@nmahadevuni I see, this is a very badly designed class. Can you

  1. Change the constructor signature to take std::vector<std::unique_ptr<const TypeWithId>> children so that we don't accidentally reparenting existing nodes
  2. Comment on the constructor that it will reparent the children?

Yuhta avatar Mar 13 '24 14:03 Yuhta

@nmahadevuni I see, this is a very badly designed class. Can you

  1. Change the constructor signature to take std::vector<std::unique_ptr<const TypeWithId>> children so that we don't accidentally reparenting existing nodes
  2. Comment on the constructor that it will reparent the children?

@Yuhta This refactor need lot of files to be changed. Can we do it in a different PR?

nmahadevuni avatar Mar 16 '24 13:03 nmahadevuni

@yingsu00 @Yuhta I added a test which SEGVs without the fix. Please review again.

nmahadevuni avatar Mar 18 '24 17:03 nmahadevuni

@nmahadevuni Do you have estimation how large the change need to be? Most of the places we are just passing the TypeWithId around, only the calls to the constructor need to be changed, which I think probably are not a lot.

Yuhta avatar Mar 19 '24 15:03 Yuhta

  1. std::vector<std::unique_ptr<const TypeWithId>> children

@Yuhta Almost all Parquet/DWRF readers/writers and column readers/writers will need to be changed, since they all expect shared_ptr and if we change the TypeWithId to take unique_ptr, and buildSelectedTypes to return unique_ptr. Even if we contain the changes to just TypeWithId and TypeUtils.cpp by somehow converting the unique_ptr to shared_ptr, then too, we need to change all column readers/writers of both formats, since they access the child types.

nmahadevuni avatar Mar 25 '24 11:03 nmahadevuni

@yingsu00 @Yuhta Addressed the comments. Please review.

nmahadevuni avatar Mar 25 '24 13:03 nmahadevuni

@nmahadevuni Would you rebase your tests on https://github.com/facebookincubator/velox/pull/9244?

Yuhta avatar Mar 25 '24 20:03 Yuhta

@nmahadevuni Would you rebase your tests on #9244?

@Yuhta Updated the branch with the tests. Thanks.

nmahadevuni avatar Mar 28 '24 07:03 nmahadevuni

@nmahadevuni Can you rebase to main with the tests?

Yuhta avatar Apr 02 '24 00:04 Yuhta

@nmahadevuni Can you rebase to main with the tests?

@Yuhta I rebased with the tests. Thanks.

nmahadevuni avatar Apr 04 '24 12:04 nmahadevuni

@Yuhta has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot avatar Apr 04 '24 17:04 facebook-github-bot

@Yuhta merged this pull request in facebookincubator/velox@d4316c5f9ce5099e92f4d7af3ed607bce33a726b.

facebook-github-bot avatar Apr 05 '24 22:04 facebook-github-bot

Conbench analyzed the 1 benchmark run on commit d4316c5f.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

conbench-facebook[bot] avatar Apr 05 '24 22:04 conbench-facebook[bot]