[Bug]: Failed to create partitioned iceberg table with rest catalog
### What happened
The check "Partition field IDs must be greater than or equal to 1000" seems unnecessary. Is it a restriction required by Nessie? I don't see this restriction in the Iceberg spec.
```
2025-04-21 10:09:34,095 WARN [org.pro.cat.ser.res.IcebergErrorMapper] (executor-thread-5) Unhandled exception returned as HTTP/500: jakarta.ws.rs.WebApplicationException: HTTP 400 Bad Request: jakarta.ws.rs.WebApplicationException: HTTP 400 Bad Request
    at io.quarkus.resteasy.reactive.jackson.runtime.serialisers.FullyFeaturedServerJacksonMessageBodyReader.readFrom(FullyFeaturedServerJacksonMessageBodyReader.java:88)
    at io.quarkus.resteasy.reactive.jackson.runtime.serialisers.FullyFeaturedServerJacksonMessageBodyReader.readFrom(FullyFeaturedServerJacksonMessageBodyReader.java:105)
    at io.quarkus.resteasy.reactive.jackson.runtime.serialisers.FullyFeaturedServerJacksonMessageBodyReader_ClientProxy.readFrom(Unknown Source)
    at org.jboss.resteasy.reactive.server.handlers.RequestDeserializeHandler.readFrom(RequestDeserializeHandler.java:126)
    at org.jboss.resteasy.reactive.server.handlers.RequestDeserializeHandler.handle(RequestDeserializeHandler.java:84)
    at io.quarkus.resteasy.reactive.server.runtime.QuarkusResteasyReactiveRequestContext.invokeHandler(QuarkusResteasyReactiveRequestContext.java:135)
    at org.jboss.resteasy.reactive.common.core.AbstractResteasyReactiveContext.run(AbstractResteasyReactiveContext.java:147)
    at io.quarkus.vertx.core.runtime.VertxCoreRecorder$15.runWith(VertxCoreRecorder.java:638)
    at org.jboss.threads.EnhancedQueueExecutor$Task.doRunWith(EnhancedQueueExecutor.java:2675)
    at org.jboss.threads.EnhancedQueueExecutor$Task.run(EnhancedQueueExecutor.java:2654)
    at org.jboss.threads.EnhancedQueueExecutor.runThreadBody(EnhancedQueueExecutor.java:1627)
    at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1594)
    at org.jboss.threads.DelegatingRunnable.run(DelegatingRunnable.java:11)
    at org.jboss.threads.ThreadLocalResettingRunnable.run(ThreadLocalResettingRunnable.java:11)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: com.fasterxml.jackson.databind.exc.ValueInstantiationException: Cannot construct instance of `org.projectnessie.catalog.formats.iceberg.meta.ImmutableIcebergPartitionField`, problem: Partition field IDs must be greater than or equal to 1000
    at [Source: REDACTED (`StreamReadFeature.INCLUDE_SOURCE_IN_LOCATION` disabled); line: 1, column: 401] (through reference chain: org.projectnessie.catalog.formats.iceberg.rest.ImmutableIcebergCreateTableRequest$Json["partition-spec"]->org.projectnessie.catalog.formats.iceberg.meta.ImmutableIcebergPartitionSpec$Json["fields"]->java.util.ArrayList[0])
    at com.fasterxml.jackson.databind.exc.ValueInstantiationException.from(ValueInstantiationException.java:47)
    at com.fasterxml.jackson.databind.DeserializationContext.instantiationException(DeserializationContext.java:2015)
    at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.wrapAsJsonMappingException(StdValueInstantiator.java:622)
    at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.rewrapCtorProblem(StdValueInstantiator.java:645)
    at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator._createUsingDelegate(StdValueInstantiator.java:682)
    at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createUsingDelegate(StdValueInstantiator.java:317)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1489)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:348)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:185)
    at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer._deserializeFromArray(CollectionDeserializer.java:361)
    at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:246)
    at com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:30)
    at com.fasterxml.jackson.databind.deser.impl.MethodProperty.deserializeAndSet(MethodProperty.java:129)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:310)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:215)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:187)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1490)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:348)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:185)
    at com.fasterxml.jackson.databind.deser.impl.MethodProperty.deserializeAndSet(MethodProperty.java:129)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.vanillaDeserialize(BeanDeserializer.java:310)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:215)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:187)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1490)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:348)
    at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:185)
    at com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:342)
    at com.fasterxml.jackson.databind.ObjectReader._bindAndClose(ObjectReader.java:2125)
    at com.fasterxml.jackson.databind.ObjectReader.readValue(ObjectReader.java:1501)
    at io.quarkus.resteasy.reactive.jackson.runtime.serialisers.FullyFeaturedServerJacksonMessageBodyReader.doReadFrom(FullyFeaturedServerJacksonMessageBodyReader.java:116)
    at io.quarkus.resteasy.reactive.jackson.runtime.serialisers.FullyFeaturedServerJacksonMessageBodyReader.readFrom(FullyFeaturedServerJacksonMessageBodyReader.java:66)
    ... 15 more
Caused by: java.lang.IllegalStateException: Partition field IDs must be greater than or equal to 1000
    at com.google.common.base.Preconditions.checkState(Preconditions.java:574)
    at org.projectnessie.catalog.formats.iceberg.meta.IcebergPartitionField.check(IcebergPartitionField.java:83)
    at org.projectnessie.catalog.formats.iceberg.meta.ImmutableIcebergPartitionField.validate(ImmutableIcebergPartitionField.java:317)
    at org.projectnessie.catalog.formats.iceberg.meta.ImmutableIcebergPartitionField$Builder.build(ImmutableIcebergPartitionField.java:469)
    at org.projectnessie.catalog.formats.iceberg.meta.ImmutableIcebergPartitionField.fromJson(ImmutableIcebergPartitionField.java:270)
    at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
    at java.base/java.lang.reflect.Method.invoke(Method.java:580)
    at com.fasterxml.jackson.databind.introspect.AnnotatedMethod.call1(AnnotatedMethod.java:110)
    at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator._createUsingDelegate(StdValueInstantiator.java:666)
    ... 41 more
```
### How to reproduce it
- use `ghcr.io/projectnessie/nessie:0.103.3-java`
- use RisingWave to create an Iceberg sink with `partition_by`:
```sql
create table t83 (city varchar, lat double, long double);
insert into t83 values ('t83', 82, 82);
create sink sink_t83 from t83 WITH (
    connector = 'iceberg',
    type = 'append-only',
    force_append_only = 'true',
    database.name = 'def',
    table.name = 't83',
    catalog.name = 'demo',
    catalog.type = 'rest',
    catalog.uri = 'http://127.0.0.1:19120/iceberg/',
    ....
    commit_checkpoint_interval = 1,
    create_table_if_not_exists = 'true',
    partition_by = 'city'
);
```
### Nessie server type (docker/uber-jar/built from source) and version
nessie:0.103.3-java

### Client type (Ex: UI/Spark/pynessie ...) and version
RisingWave

### Additional information
No response
I'm facing the same issue; is there any workaround for it?
@snazy : WDYT?
Same issue for me, using Nessie 0.105.7:

```
Caused by: java.lang.IllegalStateException: Partition field IDs must be greater than or equal to 1000
    at com.google.common.base.Preconditions.checkState(Preconditions.java:574)
```
I tried different expressions for the partition column, e.g.:

```sql
to_char(kafka_timestamp::timestamp with time zone, 'yyyyMMdd')::INTEGER as arrival_day FROM rw_sqlmesh_example.geoenriched_model;
```
```sql
CREATE SINK enriched_iceberg FROM rw_sqlmesh_example.enriched_model
WITH (
    connector = 'iceberg',
    type = 'append-only',
    database.name = 'raw',
    table.name = 'enriched',
    catalog.type = 'rest',
    catalog.uri = 'http://nessie:19120/iceberg',
    catalog.name = 'nessie',
    s3.access.key = 'admin',
    s3.secret.key = 'password',
    s3.path.style.access = 'true',
    s3.endpoint = 'http://nginx:9000',
    s3.region = 'us-west-2',
    partition_by = 'arrival_day',
    create_table_if_not_exists = TRUE
);
```
RisingWave assigns partition `field_id`s sequentially from an enumeration index, i.e. starting at 0: https://github.com/risingwavelabs/risingwave/blob/5ef31eea47b04d49f56ad29c7443aaaad0cbda74/src/connector/src/sink/iceberg/mod.rs#L667
However, the Iceberg reference implementation itself hardcodes 1000 as the starting partition field ID: https://github.com/apache/iceberg/blob/90bfc3dff5f40f9ba886832ba9ccaa7b42298e9a/api/src/main/java/org/apache/iceberg/PartitionSpec.java#L53
Edit: This is actually in the specification:

> The field-id property was added for each partition field in v2. In v1, the reference implementation assigned field ids sequentially in each spec starting at 1,000. See Partition Evolution for more details.
The change from v1 to v2 is that the field-ids are persisted in the spec instead of being reassigned. So the fix belongs in the RisingWave Iceberg sink connector, which should assign partition field IDs starting at 1000.
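To illustrate, here is a minimal sketch (not RisingWave's or Iceberg's actual code; the class and method names are made up for illustration) of the assignment scheme the reference implementation uses: partition field IDs handed out sequentially starting at 1000, matching `PartitionSpec.PARTITION_DATA_ID_START` in apache/iceberg.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: assigns partition field IDs the way the Iceberg
// reference implementation does (sequentially from 1000) instead of
// enumerating from 0, which trips Nessie's >= 1000 validation.
public class PartitionFieldIds {

    // Matches PartitionSpec.PARTITION_DATA_ID_START in apache/iceberg.
    static final int PARTITION_DATA_ID_START = 1000;

    // Returns field name -> field-id, assigning 1000, 1001, ... in order.
    static Map<String, Integer> assignIds(String... fieldNames) {
        Map<String, Integer> ids = new LinkedHashMap<>();
        int next = PARTITION_DATA_ID_START;
        for (String name : fieldNames) {
            ids.put(name, next++);
        }
        return ids;
    }

    public static void main(String[] args) {
        // e.g. a spec partitioned by 'city' then 'arrival_day'
        System.out.println(assignIds("city", "arrival_day"));
    }
}
```

A connector emitting a v2 `partition-spec` with IDs produced this way would pass the `field-id >= 1000` precondition that currently fails during deserialization on the Nessie side.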