duckdb
duckdb copied to clipboard
[Java] Regression in conversion of Long to Timestamp value via appender
What happens?
In 0.3.3, DuckDBAppender.append(long)
could be used to append a millisecond value to be converted to a timestamp in the resulting table.
As of 0.4.0, appending the value fails with an exception:
Not implemented Error: Unimplemented type for cast (INT64 -> TIMESTAMP)
java.sql.SQLException: Not implemented Error: Unimplemented type for cast (INT64 -> TIMESTAMP)
at org.duckdb.DuckDBNative.duckdb_jdbc_appender_append_long(Native Method)
at org.duckdb.DuckDBAppender.append(DuckDBAppender.java:47)
To Reproduce
In 0.4.0 the test fails with the above error, but in 0.3.3 it passes successfully.
TimeZone currentTimeZone = TimeZone.getDefault();
try {
TimeZone.setDefault(TimeZone.getTimeZone("UTC"));
DuckDBConnection connection = getConnection();
executeOnConnection(connection, "CREATE TABLE ts_test(ts_value timestamp)");
Instant instant = Instant.now().truncatedTo(ChronoUnit.MILLIS);
try (DuckDBAppender appender = connection.createAppender("main", "ts_test")) {
appender.beginRow();
appender.append(instant.toEpochMilli() * 1000);
appender.endRow();
}
Statement resultStatement = connection.createStatement();
ResultSet resultSet = resultStatement.executeQuery( "SELECT ts_value from ts_test");
resultSet.next();
Timestamp appendedTimestamp = resultSet.getTimestamp(1);
assertEquals(instant.toEpochMilli(), appendedTimestamp.toInstant().toEpochMilli());
resultSet.close();
resultStatement.close();
connection.close();
} finally {
// return timezone back to default.
TimeZone.setDefault(currentTimeZone);
}
Environment (please complete the following information):
- OS: Mac OS X (Apple Silicon)
- DuckDB Version: 0.4.0
- DuckDB Client: JDBC
Identity Disclosure:
- Full Name: Jonathan Swenson
- Affiliation: Omni
Before Submitting
- [x] Have you tried this on the latest
master
branch? - [x] Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?
Thanks for the report! This was actually done on purpose (see #3405), as the implicit casting of internal data types made it very easy to accidentally insert bogus data using the Appender. As illustrated by your very comment:
In 0.3.3, DuckDBAppender.append(long) could be used to append a millisecond value to be converted to a timestamp in the resulting table.
Since it actually requires a microsecond value :)
In C++ we now require the explicit construction of a timestamp_t
type, using e.g. the Timestamp::FromEpochMicroSeconds
method. We should export these date/time/timestamp methods to Java as well for use within the Appender.
Oops yes -- microsecond value.
In C++ we now require the explicit construction of a timestamp_t type, using e.g. the Timestamp::FromEpochMicroSeconds method. We should export these date/time/timestamp methods to Java as well for use within the Appender.
I've been working around some of the limitations of the java appender (appending timestamps from longs, big decimals from strings). Would augmenting the appender to support these types directly -- append(BigDecimal)
, append(Timestamp)
be preferable? Otherwise, it seems like a trycast implementation from int64 to timetamp_t using Timestamp::FromEpochMicroSeconds
was sufficient to "fix" this.
Augmenting to support these types would definitely be preferable.
Interestingly #3841 is kind of the opposite issue of this one. We used to always require appending the exact internal types for the Appender
- which is the behavior that you are observing there with the decimal types.
#3405 changes this to do proper casting, which makes the behavior of the appender a lot more clear, except for decimal types (which were left out for now because there were some problems in other parts of the code that relied on this where fixing was non-trivial).
I can't say it surprises me that decimal and timestamp types are ... weird.
One thing I did like about the long -> timestamp method of appending a value is that I didn't have to create additional objects here in order to load a stream of data in. In order to differentiate and having an append(Timestamp)
method one will have to create / allocate a short lived object (either a java.sql.Timestamp
or DuckDBTimestamp
) if you have the long values already.
Would it make sense to use DuckDBTimestamp
for this purpose? or is that class more of an implementation detail?
Given that the system does support these casts in some shape (strings seem to work for most cases) it would probably make sense that you could do either in the long run.
method one will have to create / allocate a short lived object (either a java.sql.Timestamp or DuckDBTimestamp) if you have the long values already.
Would the JVM not optimize this out in a loop? At least in C++ there is zero cost to wrapping a scalar in a struct.
I think having a separate class would be the cleanest from a code perspective, as it would be very clear what the code is doing. We could also add e.g. an appendTimestampMicros
method (or similar) that would be more explicit in what it is appending and do the actual conversion in C++. We could also add an appendInternal
method that would append using the internal type instead.
I would prefer not to actually add a numeric -> timestamp
cast to the system, since in my opinion a query like this should fail: SELECT 1::TIMESTAMP
Would the JVM not optimize this out in a loop?
Good point. Hopefully 🙂
I like the appendTimestampMicros
and appendTimestamp
that both call a private appendTimestampInternal
that could create and append the appropriate timestamp_t value.
think having a separate class would be the cleanest from a code perspective
You'd rather a distinct, purpose specific class for each of these? DuckDBAppenderTimestamp / Date / Time or something to that tune? The time classes are a bit of a nightmare, but I could definitely see the argument for using Instant or java.sql.Timestamp (especially because this IS kind of a jdbc driver).
I would prefer not to actually add a numeric -> timestamp cast to the system
Initially I was looking into doing the cast based implementation (with an implementation of TryCast::Operation(int64_t input, timestamp_t &result, bool strict)
Just so happens that the SELECT 1::TIMESTAMP
or SELECT CAST(CAST(1 AS LONG) AS TIMESTAMP)
still fails due to the difference between the int64_t and bigint / integer.
But I see your point.
I like the appendTimestampMicros and appendTimestamp that both call a private appendTimestampInternal that could create and append the appropriate timestamp_t value.
That sounds good to me.
You'd rather a distinct, purpose specific class for each of these? DuckDBAppenderTimestamp / Date / Time or something to that tune? The time classes are a bit of a nightmare, but I could definitely see the argument for using Instant or java.sql.Timestamp (especially because this IS kind of a jdbc driver).
It would not need to be restricted only to the Appender, but indeed these classes might not be used for anything else in the JDBC. Accepting Java classes seems like a good idea regardless.
Indeed we should just support appending a java.sql.Timestamp
Any workaround for inserting timestamps through the appender?
0.6.1
is returning the following error when trying to insert the time as microseconds double
Not implemented Error: Unimplemented type for cast (INT64 -> INT64)
In 0.8, I am not seeing an appendTimestamp option?
Any plans to support this?
It would help if there was a way to query the column types from an appender: https://github.com/duckdb/duckdb/discussions/8069
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 30 days.
This issue was closed because it has been stale for 30 days with no activity.
I'm running the 0.9.2 version and I get
Not implemented Error: Unimplemented type for cast (INT64 -> INT64) at org.duckdb.DuckDBNative.duckdb_jdbc_appender_append_long(Native Method)
This is the Java value I'm trying to append:
duckDBAppender.append(new DuckDBTimestamp(System.currentTimeMillis() * 1000).getMicrosEpoch());
A fix that doesn't expose error prone raw long
appenders, but achieves the same goal via DuckDBTimestamp
-> https://github.com/duckdb/duckdb/pull/10736
@jonathanswenson
Indeed we should just support appending a
java.sql.Timestamp
@hannes I agree that shouldn't be support for appending a long
directly on a TIMESTAMP
column, but java.sql.Timestamp
is a problematic type as it normalizes the internal data to the system's timezone implicitly. I opened a PR (#10736) adding DuckDBTimestamp
support and improving the java.time.LocalDateTime
support.
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 30 days.