[Feature] Unified catalog support Kudu

Open predator4ann opened this issue 1 year ago • 6 comments

  • [x] https://github.com/StarRocks/starrocks/pull/45948
  • [x] https://github.com/StarRocks/starrocks/pull/45590

Feature request

Is your feature request related to a problem? Please describe. Generally speaking, Kudu's metadata is also stored in HMS (Hive Metastore), like that of Hive, Iceberg, and Hudi. The unified catalog could therefore also support reading Kudu data, which would reduce the cost of use for users.

Describe the solution you'd like https://github.com/StarRocks/starrocks/pull/45590

Describe alternatives you've considered Nope.

Additional context Nope.

predator4ann avatar May 14 '24 08:05 predator4ann

Hi, may I ask what the scenario is? The unified connector is more of an intermediate stage when migrating from Hive to Iceberg/Hudi: because Iceberg and Hudi are common upgrade paths from Hive, users don't need to care which catalog they are using. Alternatively, some companies run both Iceberg and Hudi (although this is uncommon), and they can use a unified connector to give users a consistent experience. For Kudu, though, I think the scenario is different.

wangsimo0 avatar May 15 '24 07:05 wangsimo0

Of course. In our scenario, the metadata of Hive, Iceberg, and Kudu tables is all stored in HMS. When users query through Trino, they don't need to worry about what type of table they are accessing; they can access everything uniformly through the hive catalog. The engine automatically routes to the correct catalog, which is the same purpose the unified catalog serves. I therefore believe that as long as the metadata is stored in the same place, the tables can be accessed through a single catalog, which greatly simplifies usage.
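
For illustration, the routing idea described above can be sketched roughly as follows. This is only a minimal sketch, not the StarRocks or Trino implementation: `HmsTable` and `detect_table_type` are hypothetical names, and the marker values (the `table_type` parameter for Iceberg, the `storage_handler` parameter for Kudu, the input format class name for Hudi) are assumptions that should be checked against what your HMS actually stores.

```python
from dataclasses import dataclass, field
from typing import Dict

# Assumed storage handler class names for Kudu tables registered in HMS.
# Verify these against your own metastore before relying on them.
KUDU_STORAGE_HANDLERS = {
    "org.apache.hadoop.hive.kudu.KuduStorageHandler",
    "com.cloudera.kudu.hive.KuduStorageHandler",
}


@dataclass
class HmsTable:
    """Hypothetical, simplified view of a table entry fetched from HMS."""
    name: str
    parameters: Dict[str, str] = field(default_factory=dict)
    input_format: str = ""


def detect_table_type(table: HmsTable) -> str:
    """Pick the connector that should handle a table, based on HMS metadata."""
    params = table.parameters
    # Assumption: Iceberg tables carry table_type=ICEBERG (case-insensitive).
    if params.get("table_type", "").upper() == "ICEBERG":
        return "iceberg"
    # Assumption: Kudu tables carry a Kudu storage handler class.
    if params.get("storage_handler") in KUDU_STORAGE_HANDLERS:
        return "kudu"
    # Assumption: Hudi tables use an input format whose class name contains "hudi".
    if "hudi" in table.input_format.lower():
        return "hudi"
    # Anything else is treated as a plain Hive table.
    return "hive"
```

A unified catalog would call something like `detect_table_type` once per table lookup and then delegate to the matching connector, so users only ever name one catalog.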

predator4ann avatar May 15 '24 14:05 predator4ann

Do you mean the table redirection feature in Trino? Does that also work between the Hive connector and the Kudu connector when both use HMS?

wangsimo0 avatar May 16 '24 07:05 wangsimo0

Yes. The community supports redirection from Hive to Iceberg, and we have extended it ourselves to redirect from Hive to Kudu.

predator4ann avatar May 17 '24 02:05 predator4ann

@mergify backport branch-3.3

miomiocat avatar May 20 '24 12:05 miomiocat

@mergify backport branch-3.3

miomiocat avatar May 20 '24 12:05 miomiocat