Query fails when using JDBC driver
Disclaimer: I am not sure if it is a JDBC or a Spatial extension problem. It looks to me that it is more a JDBC problem, but if I'm wrong, let me know and I'll post the issue on duckdb Spatial instead.
Environment
DuckDB version: 1.2.2.0
Java version: openjdk 21.0.7
OS: Linux amd64 (Fedora 41)
What happens
I've got a SQL query involving a spatial operator that works under CLI environment, but fails when executed with JDBC driver:
SELECT ST_GeomFromText('POLYGON ((-180 -90, -180 90, 180 90, 180 -90, -180 -90))') AS geom;
I've checked this query works in CLI :
❯ duckdb
v1.2.2 7c039464e4
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
D INSTALL SPATIAL;
D LOAD SPATIAL;
D SELECT ST_GeomFromText('POLYGON ((-180 -90, -180 90, 180 90, 180 -90, -180 -90))') AS geom;
┌──────────────────────────────────────────────────────────┐
│ geom │
│ geometry │
├──────────────────────────────────────────────────────────┤
│ POLYGON ((-180 -90, -180 90, 180 90, 180 -90, -180 -90)) │
└──────────────────────────────────────────────────────────┘
How to reproduce
The same query demonstrated above fails when using the JDBC driver:
import java.sql.DriverManager;
import java.sql.SQLException;
import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertTrue;
public class GeomFromTextTest {
@Test
public void ExecuteQuery() throws SQLException {
try (var conn = DriverManager.getConnection("duckdb:")) {
var sql = conn.createStatement();
sql.execute("INSTALL SPATIAL;");
sql.execute("LOAD SPATIAL;");
var result = sql.executeQuery("SELECT ST_GeomFromText('POLYGON ((-180 -90, -180 90, 180 90, 180 -90, -180 -90))') AS geom");
assertTrue(result.next());
}
}
}
The error is :
java.sql.SQLException: Invalid Input Error: Expected character: '')'' at position '20' near: 'POLYGON ((-180 -90, -'|<---
at org.duckdb.DuckDBNative.duckdb_jdbc_execute(Native Method)
at org.duckdb.DuckDBPreparedStatement.execute(DuckDBPreparedStatement.java:148)
at org.duckdb.DuckDBPreparedStatement.execute(DuckDBPreparedStatement.java:127)
at org.duckdb.DuckDBPreparedStatement.executeQuery(DuckDBPreparedStatement.java:180)
at org.duckdb.DuckDBPreparedStatement.executeQuery(DuckDBPreparedStatement.java:208)
NOTE: I've got a test project that runs this query in either a simple or a prepared statement, with the same outcome.
The test project is located here: Github Geomatys: DuckDB java issues.
You can run and check the associated test class: ./gradlew test --tests=GeomFromTextTest
Hi, thanks for the report! Just your test runs fine for me for some reason:
$ git clone https://github.com/Geomatys/duckdb-java-issues.git
$ cd duckdb-java-issues/
$ export JAVA_HOME=/usr/lib/jvm/java-21
$ ./gradlew test --tests=GeomFromTextTest
BUILD SUCCESSFUL in 5s
3 actionable tasks: 3 executed
Hi, thanks for the report! Just your test runs fine for me for some reason:
Hi, thanks for the quick feedback.
Indeed, I just made the following test:
- Test on a fresh clone on my work computer on Fedora (to ensure I've not any cache or any unpushed changes) -> Tests fail as described in the issue
- Test on a fresh clone on an Ubuntu VM -> Tests pass.
It looks like there's something related to the OS, or at least to the execution environment. I'll search a bit more and will post an update here later.
I've found the problem : forcing the locale to english fix the test on my workstation (which is french by default):
❯ LANG=en_US.UTF-8 ./gradlew test --rerun --tests=GeomFromTextTest
BUILD SUCCESSFUL in 1s
3 actionable tasks: 1 executed, 2 up-to-date
So, there's a problem of locale dependant decoding. It is likely a Spatial extension problem then, that might try to decode numbers using system locale, whereas for this format, the number decoding should be independant from the locale.
What I do not understand is why I do not encounter problem when testing directly through the CLI.
Does DuckDB handles locales ? Should the JDBC driver force english locale, or is it a spatial extension problem (that should not rely on locale based decoding in this case) ?
EDIT: If I understand correctly, duckdb-spatial uses C++ standard strtod function to decode numbers for Geometry WKT (see here), and this function is dependant on system locale, as its documentation states (emphasis mine):
decimal digits optionally containing decimal-point character (as determined by the current C locale)
@alexismanin
Thanks for this analysis! I can confirm these findings.
What I do not understand is why I do not encounter problem when testing directly through the CLI.
glibc will initialize the process locale to "C" by default, "because ISO C says that all programs start by default in the standard ‘C’ locale" (reference).
Does DuckDB handles locales?
No, it does not. It also does not use locale-dependent functions like std::strtod in core codebase.
Should the JDBC driver force english locale
It cannot. JDBC driver is just another loadable library in JVM process. JVM calls setlocale on startup to set the glibc locale from OS environment. After that point it is unsafe to call setlocale to change the configured locale because these call is not tread-safe, it changes the global state and will affect all other libraries loaded into the same JVM process. So we cannot touch it from the JDBC driver.
is it a spatial extension problem (that should not rely on locale based decoding in this case) ?
Yes, I think so.
@szarnyasg I wonder if you can take a look at this issue? Whether it makes sense to move it to duckdb/duckdb-spatial.
The fix is more or less clear (replace std::strtod usage with something locale independent), but parsing decimals is a non-trivial task and it is not obvious what to use instead.
Yes! Like mentioned above this is related to strtod. We could try to use duckdbs own code for decimal parsing, although that would introduce a pretty ugly dependency between spatials geometry library and duckdbs vendored libs.
Ill investigate further what the best course of action would be.
This will be fixed in DuckDB v1.5 as ST_GeomFromText will be part of core and use DuckDBs own locale-independent decimal parsing functions.