
[HUDI-3345][RFC-36] Hudi metastore server

Open minihippo opened this issue 3 years ago • 15 comments

What is the purpose of the pull request

A new RFC for the Hudi metastore server.

Committer checklist

  • [ ] Has a corresponding JIRA in PR title & commit

  • [ ] Commit message is descriptive of the change

  • [ ] CI is green

  • [ ] Necessary doc changes done or have another open PR

  • [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

minihippo avatar Jan 29 '22 14:01 minihippo

CI report:

  • 3208c9fe7de1c45e12a07debdeaa30239aff23aa Azure: FAILURE

Bot commands: @hudi-bot supports the following commands:

  • @hudi-bot run azure: re-run the last Azure build

hudi-bot avatar Jan 29 '22 16:01 hudi-bot

@minihippo Picking this back up again. What are the next steps in our plan here?

vinothchandar avatar Mar 10 '22 19:03 vinothchandar

> @minihippo Picking this back up again. What are the next steps in our plan here?

@vinothchandar Thanks for the review,

  1. More details for the RFC
  2. I will submit a PR with the initial hudi-metastore module, supporting the basic functions, next week

minihippo avatar Mar 12 '22 02:03 minihippo

@minihippo Sounds good! We can revisit once you have the basic PR out

vinothchandar avatar Mar 30 '22 23:03 vinothchandar

@minihippo This is great work 👍. I think it can also solve a problem I recently hit, HUDI-3634, since commit instants are kept consistent in the Hudi metastore server.

But I'm curious: how does the Spark side get metadata for a Hudi table (stored in the Hudi metastore server) and a Hive table (stored in the HMS) in one query (e.g. a Hudi table joined with a Hive table)? Will we handle this in the HudiCatalog, fetching Hudi table metadata from the Hudi metastore server and Hive table metadata from the HMS, or will we provide a unified view in the Hudi metastore server and have it forward requests to the HMS when the table is a Hive table?

boneanxs avatar Mar 31 '22 11:03 boneanxs

Very valuable idea!

Further, maybe we can do more interesting things on top of this very valuable Hudi metastore server. It could help realize a Hudi Lake Manager that decouples Hudi ingestion from Hudi table services, including cleaner, archival, clustering, compaction, and any table service added in the future.

This lake manager could unify and automatically run services such as cleaner/clustering/compaction/archival (multi-writer and async) based on the metastore server.

Users would only need to care about their own ingestion pipeline and could leave all the table services to the manager, which automatically discovers and manages Hudi tables, thereby greatly reducing the operation and maintenance burden and the cost of onboarding.

Maybe we could expand this RFC, or raise a new RFC and take this metastore server as an input?

CC @yihua and @nsivabalan

zhangyue19921010 avatar Apr 18 '22 06:04 zhangyue19921010

> @minihippo This is great work 👍. I think it can also solve a problem I recently hit, HUDI-3634, since commit instants are kept consistent in the Hudi metastore server.
>
> But I'm curious: how does the Spark side get metadata for a Hudi table (stored in the Hudi metastore server) and a Hive table (stored in the HMS) in one query (e.g. a Hudi table joined with a Hive table)? Will we handle this in the HudiCatalog, fetching Hudi table metadata from the Hudi metastore server and Hive table metadata from the HMS, or will we provide a unified view in the Hudi metastore server and have it forward requests to the HMS when the table is a Hive table?

@boneanxs In ByteDance's in-house implementation, we do something closer to the second way. There is a proxy over the Hudi metastore server and the Hive metastore server, and the proxy routes each request to the corresponding server according to the table type.
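That routing proxy could be sketched roughly as below. All class and method names here are hypothetical illustrations, not Hudi or Hive APIs, and a simple in-memory map stands in for the real table-type lookup:

```java
// Hypothetical proxy that fronts both metastores and routes each metadata
// request by table type, as described above. Names are illustrative only.
import java.util.Map;

class MetastoreProxy {
    enum TableType { HUDI, HIVE }

    // Illustrative registry mapping "db.table" to its table type; in practice
    // this would be looked up from the catalog, not held in memory.
    private final Map<String, TableType> tableTypes;

    MetastoreProxy(Map<String, TableType> tableTypes) {
        this.tableTypes = tableTypes;
    }

    // Decide which backend serves metadata for this table: Hudi tables go to
    // the Hudi metastore server, everything else falls back to the HMS.
    String route(String dbTable) {
        return tableTypes.getOrDefault(dbTable, TableType.HIVE) == TableType.HUDI
                ? "hudi-metastore"
                : "hive-metastore";
    }
}
```

With this shape, query engines talk to a single endpoint and never need to know which store holds a given table.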

minihippo avatar Apr 25 '22 17:04 minihippo

> Very valuable idea!
>
> Further, maybe we can do more interesting things on top of this very valuable Hudi metastore server. It could help realize a Hudi Lake Manager that decouples Hudi ingestion from Hudi table services, including cleaner, archival, clustering, compaction, and any table service added in the future.
>
> This lake manager could unify and automatically run services such as cleaner/clustering/compaction/archival (multi-writer and async) based on the metastore server.
>
> Users would only need to care about their own ingestion pipeline and could leave all the table services to the manager, which automatically discovers and manages Hudi tables, thereby greatly reducing the operation and maintenance burden and the cost of onboarding.
>
> Maybe we could expand this RFC, or raise a new RFC and take this metastore server as an input?
>
> CC @yihua and @nsivabalan

@zhangyue19921010 Here it is: https://github.com/apache/hudi/pull/4309

minihippo avatar Apr 25 '22 17:04 minihippo

Yep, I read the RFC in https://github.com/apache/hudi/pull/4309. What I'm thinking is: could we expand its scope? Maybe make it a more common infrastructure, covering not only clustering/compaction but also clean, archive, and any other service in the future :)

zhangyue19921010 avatar Apr 26 '22 06:04 zhangyue19921010

@zhangyue19921010 Yes, it's on the list. Hi @yuzhaojing, could you cover this part in the RFC?

minihippo avatar Apr 26 '22 12:04 minihippo

On this RFC, I think the main thing is to decide the scope of the first phase. IMO, it can be limited to just Hudi tables for now, and depending on whether hudi.metastore.uris is configured or not, queries will either use this metaserver or not.
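A minimal sketch of that configured-or-not switch, assuming only that the key is named hudi.metastore.uris as mentioned above (the class itself is illustrative, not a Hudi API):

```java
// Sketch of the toggle: if hudi.metastore.uris is set to a non-empty value,
// queries go through the metaserver; otherwise they use the existing
// file-listing path. Only the key name comes from the discussion above.
import java.util.Properties;

class MetaserverToggle {
    static boolean useMetaserver(Properties conf) {
        String uris = conf.getProperty("hudi.metastore.uris");
        return uris != null && !uris.trim().isEmpty();
    }
}
```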

Does the RFC address high availability/sharding of metadata? Have you thought about these? If the metastore will also deal with locks, then the servers will become stateful. Maybe we can phase those as well? @minihippo, thoughts?

vinothchandar avatar Apr 26 '22 23:04 vinothchandar

@vinothchandar Sorry for the late reply. When designing the storage schema of the metadata store, tbl_id was included in every storage table so that metadata can be sharded by tbl_id, with all metadata of a table living in one shard. There are no problems with joins across shards.
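The tbl_id-based sharding could look roughly like this; the hash scheme and shard count are assumptions for illustration, not part of the design:

```java
// Illustration of tbl_id-based sharding: because every storage table carries
// tbl_id, all metadata rows of one table hash to the same shard, so queries
// over a single table never need cross-shard joins.
class MetadataSharding {
    static int shardFor(long tblId, int numShards) {
        // Math.floorMod keeps the shard index non-negative for any tbl_id.
        return Math.floorMod(Long.hashCode(tblId), numShards);
    }
}
```

Every request for a table's timeline, partitions, or snapshots resolves its shard from tbl_id alone, which keeps the servers horizontally scalable for reads.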

minihippo avatar Jun 07 '22 14:06 minihippo

Short-term plan (target 1.0)

Phase 1

Implement the basic functions

  1. Database and table store
  2. All actions (e.g. commit, compaction) and operations (e.g. upsert, compact, cluster)
  3. Timeline and instant meta store
  4. Partition and snapshot store
  5. Spark/Flink read/write based on the metastore
  6. Persistence of table/partition-level parameters (e.g. table config)

Phase 2

Extensions

  1. Schema store and support for schema evolution
  2. Concurrency support (will submit a new rfc)
  3. Hudi catalog
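To make the Phase 1 surface concrete, here is a purely hypothetical in-memory sketch of the database/table store plus the timeline/instant store; none of these names are real Hudi APIs, and a map stands in for the real backing storage:

```java
// Hypothetical in-memory stand-in for part of the Phase 1 scope: creating
// tables and recording/listing timeline instants per table.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class InMemoryMetastore {
    // Keyed by "db.table"; the value is that table's ordered list of instants.
    private final Map<String, List<String>> timelines = new HashMap<>();

    void createTable(String db, String table) {
        timelines.putIfAbsent(db + "." + table, new ArrayList<>());
    }

    // Record a completed action (e.g. a commit) on the table's timeline.
    void addInstant(String db, String table, String instantTime) {
        timelines.computeIfAbsent(db + "." + table, k -> new ArrayList<>())
                 .add(instantTime);
    }

    List<String> listInstants(String db, String table) {
        return timelines.getOrDefault(db + "." + table, List.of());
    }
}
```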

minihippo avatar Jun 07 '22 14:06 minihippo