Experiment implementation for catalog builder

Open liurenjie1024 opened this issue 8 months ago • 2 comments

What changes are included in this PR?

Demo implementation for catalog builder.

Are these changes tested?

UT.

liurenjie1024 avatar Apr 21 '25 10:04 liurenjie1024

I also want to remove the builder for catalog configs.

liurenjie1024 avatar Apr 21 '25 10:04 liurenjie1024

cc @Xuanwo I've figured out a way to keep both traits without introducing an enum, see https://github.com/apache/iceberg-rust/blob/f4a2efea385230cd330026705316e16331a1c612/crates/catalog/loader/src/lib.rs#L10 WDYT?

liurenjie1024 avatar Apr 30 '25 07:04 liurenjie1024

cc @Fokko @sdd @Xuanwo Could you take a look at this PR to see if this is the right direction to go?

liurenjie1024 avatar May 22 '25 06:05 liurenjie1024

Are we sure that we need the CatalogBuilder trait? What would be its purpose?

Hi @c-thiel, sorry for the confusion. For more background, please refer to https://github.com/apache/iceberg-rust/issues/1228. In short, we are trying to develop a catalog loader that can be used by applications such as iceberg-playground or a data-driven integration test framework.

My initial idea would be to just have different type-safe builders for different catalogs. The current CatalogBuilder trait contains fields that are not required for some catalogs. For example, in-memory or dynamo don't even need a URI.

I've refined the trait definition as proposed by @Xuanwo in #1372.

On the other hand, we have the REST catalog, which needs significantly more configuration, and I don't think we want to make those key-value pairs. I would be in favor of keeping the interfaces as type-safe as possible.

I think a purely type-safe approach may not be practical, since we also need to pass the file IO's configuration through the catalog, and the actual file IO used is determined at runtime.

As a compromise, we could define a config struct for each catalog, like the following:

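use std::collections::HashMap;

use serde::Deserialize;
use url::Url; // deserializing Url needs the url crate's "serde" feature

#[derive(Debug, Deserialize)]
struct RestCatalogConfig {
    // serde's field attribute is `rename`, not `name`
    #[serde(rename = "name")]
    name: String,
    #[serde(rename = "uri")]
    uri: Url,
    #[serde(rename = "credential")]
    credential: String,
}

impl TryFrom<HashMap<String, String>> for RestCatalogConfig {
    .....
}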
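
For illustration only, here is a minimal std-only sketch of what such a conversion could look like. It is not the code from this PR: the `uri` field is a plain `String` instead of `Url` to avoid external crates, and the `MissingKey` error type, the `take` helper, and the `extra` field (unrecognized keys kept so they could be forwarded to the file IO chosen at runtime) are all hypothetical names introduced for the example.

```rust
use std::collections::HashMap;
use std::convert::TryFrom;

/// Hypothetical typed config; field names follow the struct above.
#[derive(Debug, PartialEq)]
struct RestCatalogConfig {
    name: String,
    /// Assumption: kept as a String here; the proposal uses `Url`.
    uri: String,
    credential: String,
    /// Assumption: keys the catalog does not recognize are preserved,
    /// e.g. so they can be passed on to file IO later.
    extra: HashMap<String, String>,
}

/// Hypothetical error: names the required key that was absent.
#[derive(Debug, PartialEq)]
struct MissingKey(&'static str);

/// Removes `key` from `props`, failing if it is not present.
fn take(props: &mut HashMap<String, String>, key: &'static str) -> Result<String, MissingKey> {
    props.remove(key).ok_or(MissingKey(key))
}

impl TryFrom<HashMap<String, String>> for RestCatalogConfig {
    type Error = MissingKey;

    fn try_from(mut props: HashMap<String, String>) -> Result<Self, Self::Error> {
        // Pull out the required keys first; whatever is left stays untyped.
        let name = take(&mut props, "name")?;
        let uri = take(&mut props, "uri")?;
        let credential = take(&mut props, "credential")?;
        Ok(Self { name, uri, credential, extra: props })
    }
}
```

With this shape, a caller would build the `HashMap<String, String>` from whatever source it has and call `RestCatalogConfig::try_from(props)`; the required fields are checked once at the boundary, while the remaining key-value pairs stay available for components that are only known at runtime.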

WDYT?

liurenjie1024 avatar May 26 '25 09:05 liurenjie1024

Makes sense to me

sdd avatar May 27 '25 09:05 sdd

Hi all, wondering what's remaining on the catalog builder? Trying to see if I can resume this work and use the catalog loader API in the integration tests framework.

lliangyu-lin avatar Jul 30 '25 23:07 lliangyu-lin

Hi @lliangyu-lin, here is the epic issue for the catalog loader: https://github.com/apache/iceberg-rust/issues/1253

Since we have already defined the interface, you could pick up the remaining items and implement them for each catalog.

liurenjie1024 avatar Jul 31 '25 12:07 liurenjie1024