clickhouse-java icon indicating copy to clipboard operation
clickhouse-java copied to clipboard

Error on Initial Data Insertion into Table with JSON Column

Open BentsiLeviav opened this issue 2 years ago • 3 comments

Describe the bug

When attempting to insert data for the first time into an empty table that includes a JSON column, the process triggers an exception as follows:

Exception in thread "main" java.lang.NumberFormatException: For input string: "{}" at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:67) at java.base/java.lang.Integer.parseInt(Integer.java:668) at com.clickhouse.data.value.UnsignedByte.valueOf(UnsignedByte.java:78) at com.clickhouse.data.value.UnsignedByte.valueOf(UnsignedByte.java:61) at com.clickhouse.data.value.ClickHouseByteValue$UnsignedByteValue.update(ClickHouseByteValue.java:93) at com.clickhouse.data.value.ClickHouseByteValue$UnsignedByteValue.update(ClickHouseByteValue.java:16) at com.clickhouse.data.ClickHouseValue.update(ClickHouseValue.java:1236) at com.clickhouse.data.value.ClickHouseByteValue.update(ClickHouseByteValue.java:408) at com.clickhouse.data.value.ClickHouseByteValue$UnsignedByteValue.update(ClickHouseByteValue.java:16) at com.clickhouse.data.format.ClickHouseRowBinaryProcessor$TupleSerializer.serialize(ClickHouseRowBinaryProcessor.java:243) at com.clickhouse.data.ClickHouseDataProcessor.write(ClickHouseDataProcessor.java:593) at com.clickhouse.jdbc.internal.InputBasedPreparedStatement.addBatch(InputBasedPreparedStatement.java:345) at com.clickhouse.jdbc.internal.InputBasedPreparedStatement.executeAny(InputBasedPreparedStatement.java:113) at com.clickhouse.jdbc.internal.InputBasedPreparedStatement.executeLargeUpdate(InputBasedPreparedStatement.java:197) at com.clickhouse.jdbc.internal.AbstractPreparedStatement.executeUpdate(AbstractPreparedStatement.java:135) at org.example.Main.main(Main.java:37)

Expected behavior

Expected functionality entails the successful insertion of JSON into an empty table.

Cause

This issue arises from the underlying behavior of ClickHouse's JSON representation, which is based on Tuples. If no JSON data has been previously inserted, the default type inferred is a Tuple of Int8. Consequently, the JDBC driver encounters difficulty in serializing the JSON string, leading to the reported error.

Current Process

The driver currently deduces the schema by executing the query: SELECT column FROM table WHERE 0. However, the ClickHouse team has suggested an alternative schema inference mechanism utilizing the query: desc table tableName format TSVWithNamesAndTypes. This revised approach returns JSON as a data type, rather than Tuple<Int8>.

Steps to reproduce

  1. Create the following table
CREATE TABLE public.tst
(
    `id` String,
    `o` Object('json')
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}')
PRIMARY KEY id
ORDER BY id
SETTINGS index_granularity = 8192
  1. Run the code example provided below

Code example


public static void main(String[] args) throws SQLException {
        String url = "jdbc:ch:https://******.clickhouse.cloud:8443/public"; // use http protocol and port 8123 by default
        Properties properties = new Properties();
        properties.setProperty("username", "default");
        properties.setProperty("password", "password");
        ClickHouseDataSource dataSource = new ClickHouseDataSource(url, properties);

        try (Connection con = DriverManager
                .getConnection(url, properties)) {

            try (PreparedStatement stmt = con.prepareStatement(
                "INSERT INTO tst (id, o) VALUES (?, ?)")) {
                stmt.setString(1, "test");
                stmt.setString(2, "{}");
                stmt.executeUpdate();


            }
        }
    }

Error log

Exception in thread "main" java.lang.NumberFormatException: For input string: "{}" at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:67) at java.base/java.lang.Integer.parseInt(Integer.java:668) at com.clickhouse.data.value.UnsignedByte.valueOf(UnsignedByte.java:78) at com.clickhouse.data.value.UnsignedByte.valueOf(UnsignedByte.java:61) at com.clickhouse.data.value.ClickHouseByteValue$UnsignedByteValue.update(ClickHouseByteValue.java:93) at com.clickhouse.data.value.ClickHouseByteValue$UnsignedByteValue.update(ClickHouseByteValue.java:16) at com.clickhouse.data.ClickHouseValue.update(ClickHouseValue.java:1236) at com.clickhouse.data.value.ClickHouseByteValue.update(ClickHouseByteValue.java:408) at com.clickhouse.data.value.ClickHouseByteValue$UnsignedByteValue.update(ClickHouseByteValue.java:16) at com.clickhouse.data.format.ClickHouseRowBinaryProcessor$TupleSerializer.serialize(ClickHouseRowBinaryProcessor.java:243) at com.clickhouse.data.ClickHouseDataProcessor.write(ClickHouseDataProcessor.java:593) at com.clickhouse.jdbc.internal.InputBasedPreparedStatement.addBatch(InputBasedPreparedStatement.java:345) at com.clickhouse.jdbc.internal.InputBasedPreparedStatement.executeAny(InputBasedPreparedStatement.java:113) at com.clickhouse.jdbc.internal.InputBasedPreparedStatement.executeLargeUpdate(InputBasedPreparedStatement.java:197) at com.clickhouse.jdbc.internal.AbstractPreparedStatement.executeUpdate(AbstractPreparedStatement.java:135) at org.example.Main.main(Main.java:37)

Configuration

Environment

  • Client version: 0.4.6
  • Language version: Java
  • OS: macOS/Linux

ClickHouse server

  • ClickHouse Server version: 23.8.1.41458
  • ClickHouse Server non-default settings, if any: allow_experimental_object_type=1
  • CREATE TABLE statements for tables involved:
CREATE TABLE public.tst
(
    `id` String,
    `o` Object('json')
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}')
PRIMARY KEY id
ORDER BY id
SETTINGS index_granularity = 8192

BentsiLeviav avatar Aug 24 '23 08:08 BentsiLeviav

Hi @BentsiLeviav, sorry for the inconvenience. It's a known limitation as reported in #1023. The workaround at this point is to use input function for insertion.

If majority of the users prefer to use String for insertion, I think we can enhance the driver to overcome this by adding a new option.

zhicwu avatar Aug 24 '23 09:08 zhicwu

Hi @BentsiLeviav, sorry for the inconvenience. It's a known limitation as reported in #1023. The workaround at this point is to use input function for insertion.

If majority of the users prefer to use String for insertion, I think we can enhance the driver to overcome this by adding a new option.

have any update about this issue ?

qusijun avatar Jun 25 '24 07:06 qusijun

Hi @qusijun

Unfortunately, this feature (object support) is considered experimental therefore we don't consider it as "production-ready". Anyway, could you try the solution provided by @zhicwu ?

BentsiLeviav avatar Jun 27 '24 10:06 BentsiLeviav

will be fixed with new JSON type support.

chernser avatar Jan 09 '25 21:01 chernser