Unable to send huge metadata (20MB) from Iceberg to Nessie.

Open ajantha-bhat opened this issue 2 years ago • 4 comments

Environment:

  • Nessie 0.43.0 (run with ./gradlew quarkusDev)
  • Iceberg 0.14.1 plus custom code that sends huge metadata if the table name contains the keyword 'big' (https://github.com/ajantha-bhat/iceberg/commit/c30b5412f8e29a7da88c4d2562c5ae7e2b7dd68f)
  • Spark 3.3

Query:

create table nessie.db1.bigt1(id int) using iceberg; --fails

Callstack:

22/10/12 12:46:45 ERROR SparkSQLDriver: Failed in [create table nessie.db1.bigt1(id int) using iceberg]
org.projectnessie.error.NessieBadRequestException: Bad Request (HTTP/400): HTTP 413 Request Entity Too Large (through reference chain: org.projectnessie.model.ImmutableOperations$Json["operations"]->java.util.ArrayList[0]->org.projectnessie.model.ImmutablePut$Json["content"]->org.projectnessie.model.ImmutableIcebergTable$Json["metadata"]->org.projectnessie.model.ImmutableGenericMetadata$Json["metadata"])
	at org.projectnessie.error.ErrorCode.lambda$asException$1(ErrorCode.java:61)
	at java.util.Optional.map(Optional.java:215)
	at org.projectnessie.error.ErrorCode.asException(ErrorCode.java:61)
	at org.projectnessie.client.rest.ResponseCheckFilter.checkResponse(ResponseCheckFilter.java:56)
	at org.projectnessie.client.rest.NessieHttpResponseFilter.filter(NessieHttpResponseFilter.java:34)
	at org.projectnessie.client.http.HttpRequest.lambda$executeRequest$3(HttpRequest.java:157)
	at java.util.ArrayList.forEach(ArrayList.java:1259)
	at org.projectnessie.client.http.HttpRequest.executeRequest(HttpRequest.java:157)
	at org.projectnessie.client.http.HttpRequest.post(HttpRequest.java:196)
	at org.projectnessie.client.http.HttpTreeClient.commitMultipleOperations(HttpTreeClient.java:191)

detailed_callstack.txt

Note: I found these HTTP limits in the Quarkus documentation: https://quarkus.io/guides/http-reference#http-limits-configuration

I tried adding the following to application.properties and verified from the UI that these properties are loaded.

quarkus.http.limits.max-body-size=50240K
quarkus.http.limits.max-header-size=50240K
quarkus.http.limits.max-chunk-size=50240K
quarkus.http.limits.max-form-attribute-size=50240K
quarkus.http.limits.max-initial-line-length=50240000

But the failure is the same even with these settings. cc: @dimas-b, @snazy
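
For scale, here is a minimal sketch that converts the configured body-size limit to bytes and compares it with the 20 MB payload. The `parse_size` helper is hypothetical and assumes that the `K` and `M` suffixes are binary multiples (1024-based). If they were decimal instead, the conclusion would be the same: the configured body limit is more than twice the payload size, so it is unlikely to be the limit that triggers the 413.

```python
import re

# Assumption: suffixes are binary multiples (K = 1024 bytes, M = 1024 * 1024 bytes).
_UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(value: str) -> int:
    """Convert a size string such as '50240K' or '10M' to a number of bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", value.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized size: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


# Size of the metadata payload from the issue title (20 MB).
PAYLOAD_BYTES = 20 * 1024 * 1024

# Value used for quarkus.http.limits.max-body-size in the configuration above.
body_limit = parse_size("50240K")

print(f"payload: {PAYLOAD_BYTES} bytes, body limit: {body_limit} bytes")
print("payload fits within body limit:", PAYLOAD_BYTES <= body_limit)
```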

ajantha-bhat avatar Oct 12 '22 07:10 ajantha-bhat

Reproduced with IT now. https://github.com/ajantha-bhat/nessie/commit/76023df5a2cb74da98fbe266025b4ec1eabc56ba

ajantha-bhat avatar Oct 12 '22 08:10 ajantha-bhat

@ajantha-bhat : Could the "entity too large" exception be coming from the JSON parser?

dimas-b avatar Oct 12 '22 14:10 dimas-b

@ajantha-bhat : Could the "entity too large" exception be coming from the JSON parser?

Because the error code is HTTP 413, I assumed it came from the HTTP layer rather than from the JSON parser.

ajantha-bhat avatar Oct 12 '22 14:10 ajantha-bhat

Side note: Nessie should probably surface the 413 error in this case, but it looks like the client is getting a 400 instead.

dimas-b avatar Oct 12 '22 14:10 dimas-b

Surprisingly, if I disable the default gzip compression, the test case passes.

However, I didn't find any size-related configuration for HTTP compression. There is one for RESTEasy (quarkus.resteasy.gzip.max-input=10M).

ajantha-bhat avatar Oct 19 '22 04:10 ajantha-bhat
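
One possible reading of the gzip observation, not confirmed anywhere in this thread, is that a limit like the 10M value quoted above applies to the decompressed request body. The sketch below illustrates only that general mechanism. It is not RESTEasy or Nessie code, and `read_gzip_with_limit`, `InputTooLarge`, and `make_payload` are hypothetical names. A highly repetitive 20 MB body compresses to a small fraction of its size, so it would pass a check on the bytes sent over the wire, yet it exceeds a 10 MB cap once it is inflated.

```python
import gzip
import io

# Assumed cap on decompressed input; mirrors the 10M value quoted in the thread.
MAX_INPUT_BYTES = 10 * 1024 * 1024


class InputTooLarge(Exception):
    """Raised when the decompressed body grows past the configured limit."""


def read_gzip_with_limit(compressed: bytes, limit: int = MAX_INPUT_BYTES) -> bytes:
    """Decompress a gzip body chunk by chunk, aborting once `limit` is exceeded."""
    out = bytearray()
    with gzip.GzipFile(fileobj=io.BytesIO(compressed)) as stream:
        while True:
            chunk = stream.read(64 * 1024)
            if not chunk:
                break
            out.extend(chunk)
            if len(out) > limit:
                raise InputTooLarge(f"decompressed body exceeds {limit} bytes")
    return bytes(out)


def make_payload(size: int) -> bytes:
    """Build a repetitive stand-in for the large metadata body."""
    return b"x" * size


if __name__ == "__main__":
    body = make_payload(20 * 1024 * 1024)
    wire = gzip.compress(body)
    print("uncompressed bytes:", len(body))
    print("compressed bytes on the wire:", len(wire))
    try:
        read_gzip_with_limit(wire)
    except InputTooLarge as exc:
        print("rejected:", exc)
```

If this is what happens on the server, it would be consistent with the request succeeding when compression is disabled, since the uncompressed path would only be subject to the HTTP body-size limits.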