Encoding of Namespace having space to +

Open mthirani2021 opened this issue 3 months ago • 6 comments

Apache Iceberg version

1.10.0 (latest release)

Query engine

Spark

Please describe the bug 🐞

Problem: A namespace (NS) whose name contains a space is created in the Polaris Catalog via the Iceberg REST utilities. Loading the namespace metadata from Spark then fails with "Schema Not Found" / "Namespace does not exist". The issue occurs with any query engine.

How to Reproduce it from Spark:

  1. Start the Polaris Catalog and the Spark environment: https://github.com/apache/polaris/tree/main/getting-started/minio
  2. Run the following SQL statements in the Spark environment:
spark-sql ()> create namespace `a b`;
Time taken: 0.451 seconds
spark-sql ()> show namespaces;
`a b`
spark-sql ()> use `a b`;
[SCHEMA_NOT_FOUND] The schema `a b` cannot be found. Verify the spelling and correctness of the schema and catalog.
If you did not qualify the name with a catalog, verify the current_schema() output, or qualify the name with the correct catalog.

In Polaris Log: 2025-10-06 17:25:48,574 INFO [io.qua.htt.access-log] [bd768622-4283-40df-8ab2-434f358112b5_0000000000000000009,POLARIS] [,,,] (executor-thread-1) 127.0.0.1 - root [06/Oct/2025:17:25:48 -0400] "GET /api/catalog/v1/polaris/namespaces/a+b HTTP/1.1" 404 98

The space is encoded to + at the line linked below in the Iceberg library, and this looks like a pre-existing bug to me.

iceberg/core/src/main/java/org/apache/iceberg/rest/RESTUtil.java at main · apache/iceberg
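
For illustration only, here is a minimal Python sketch of the two encodings in question. It is not the Iceberg code (which is Java); it assumes the client applies form-style encoding to the namespace, which is what the `a+b` in the Polaris request path above suggests.

```python
from urllib.parse import quote, quote_plus

namespace = "a b"

# Form-style encoding (application/x-www-form-urlencoded) maps a space to "+".
form_encoded = quote_plus(namespace)

# Path-style percent-encoding (RFC 3986) maps a space to "%20".
# safe="" makes sure no reserved character is passed through unescaped.
path_encoded = quote(namespace, safe="")

print(form_encoded)  # a+b
print(path_encoded)  # a%20b
```

Both strings are valid in a URL, but they only mean the same thing if the receiving side decodes with the matching rule.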

FYI: A namespace with a space in its name can still be created in the Polaris Catalog, even though it complains about it. I tried calling the REST API directly and it worked:

Image

Willingness to contribute

  • [ ] I can contribute a fix for this bug independently
  • [x] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
  • [x] I cannot contribute a fix for this bug at this time

mthirani2021, Oct 06 '25 18:10

@dimas-b @omarsmak

mthirani2021, Oct 06 '25 18:10

@nastra: Thoughts?

ajantha-bhat, Nov 28 '25 05:11

It's not entirely clear whether this issue is caused by Iceberg itself, so it would be good to have a test that reproduces it in Iceberg, in CatalogTests/TestRESTUtil and also in a Spark test.

nastra, Nov 28 '25 11:11

I took a quick look and the issue isn't at the REST layer. When a namespace with a space is URL encoded, you'll get a+b and when it's URL decoded, you'll get the original a b namespace. However, not every engine actually supports spaces in namespace names. For example, Hive and the SparkSessionCatalog don't support spaces as can be seen in this validation:

[INVALID_SCHEMA_OR_RELATION_NAME] `a b` is not a valid name for tables/schemas. Valid names only contain alphabet characters, numbers and _. SQLSTATE: 42602
org.apache.spark.sql.AnalysisException: [INVALID_SCHEMA_OR_RELATION_NAME] `a b` is not a valid name for tables/schemas. Valid names only contain alphabet characters, numbers and _. SQLSTATE: 42602
	at org.apache.spark.sql.errors.QueryCompilationErrors$.invalidNameForTableOrDatabaseError(QueryCompilationErrors.scala:1143)
	at org.apache.spark.sql.catalyst.catalog.SessionCatalog.validateName(SessionCatalog.scala:155)
	at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createDatabase(SessionCatalog.scala:293)
	at org.apache.spark.sql.execution.datasources.v2.V2SessionCatalog.createNamespace(V2SessionCatalog.scala:424)
	at org.apache.iceberg.spark.SparkSessionCatalog.createNamespace(SparkSessionCatalog.java:123)
	at org.apache.spark.sql.execution.datasources.v2.CreateNamespaceExec.run(CreateNamespaceExec.scala:49)

Failed to create namespace a b in Hive Metastore
java.lang.RuntimeException: Failed to create namespace a b in Hive Metastore
	at org.apache.iceberg.hive.HiveCatalog.createNamespace(HiveCatalog.java:507)
	at org.apache.iceberg.spark.SparkCatalog.createNamespace(SparkCatalog.java:499)
	at org.apache.spark.sql.execution.datasources.v2.CreateNamespaceExec.run(CreateNamespaceExec.scala:49)

Also this works fine for me using the Iceberg quickstart: https://iceberg.apache.org/spark-quickstart/#docker-compose

spark-sql ()> use demo;
Time taken: 0.032 seconds
spark-sql ()> show schemas;
Time taken: 0.082 seconds
spark-sql ()> create schema test;
Time taken: 0.17 seconds
spark-sql ()> create table test.tbl (id int);
Time taken: 0.658 seconds
spark-sql ()> create schema `a b`;
Time taken: 0.031 seconds
spark-sql ()> show schemas;
test
`a b`
Time taken: 0.163 seconds, Fetched 2 row(s)
spark-sql ()> use `a b`;
Time taken: 0.032 seconds
spark-sql (`a b`)>
                 > create table tbl2 (id int);
Time taken: 0.061 seconds
spark-sql (`a b`)> select * from `a b`.tbl2;

nastra, Nov 28 '25 11:11

@nastra I don't think we are saying the issue is at the REST layer but rather we think it is at the Iceberg REST client layer in https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/rest/RESTUtil.java#L147
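
The two positions in this thread can be sketched side by side. The Python snippet below is a hedged illustration, not code from Iceberg or Polaris: `client_encode`, `form_decode`, and `path_decode` are hypothetical names, and whether a given server applies path-style decoding to the namespace segment is an assumption that the thread does not confirm.

```python
from urllib.parse import quote_plus, unquote, unquote_plus


def client_encode(namespace: str) -> str:
    """Hypothetical client side: form-style encoding, space becomes '+'."""
    return quote_plus(namespace)


def form_decode(segment: str) -> str:
    """Hypothetical server that decodes with the same form-style rule."""
    return unquote_plus(segment)


def path_decode(segment: str) -> str:
    """Hypothetical server that decodes the segment as a plain URL path,
    where '+' is an ordinary character and only %XX escapes are decoded."""
    return unquote(segment)


segment = client_encode("a b")

print(repr(form_decode(segment)))  # 'a b'  (round trip succeeds)
print(repr(path_decode(segment)))  # 'a+b'  (namespace lookup would miss)
```

If both sides use the form-style rule, the round trip is lossless, which matches the quickstart result reported earlier in the thread. If the server treats the segment as a plain path, it looks up a namespace literally named `a+b`, which would be consistent with the 404 in the Polaris log.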

omarsmak, Dec 09 '25 08:12

@omarsmak I was not able to reproduce this issue with the REST server from Iceberg's quickstart example and I don't see an error in the encoding/decoding of the REST client (see also my previous comment). Is this issue reproducible with Polaris?

nastra, Dec 09 '25 08:12