pgjdbc icon indicating copy to clipboard operation
pgjdbc copied to clipboard

EOFException on PreparedStatement#toString with unset bytea parameter since 42.7.4

Open MrEasy opened this issue 1 year ago • 16 comments

Description After update from 42.7.3 to 42.7.4 we experienced a regression, which showed-up the error pasted below. Looking into it, this seems to be a side-effect of fixing https://github.com/pgjdbc/pgjdbc/issues/3169 Scenario:

  • Table with column of type bytea.
  • Creating a PreparedStamtent, e.g. PreparedStatement ps = connection.prepareStatement("INSERT INTO tTable VALUES (?,?)");
  • Invoking a ps.toString(); on the PreparedStatement object without having set the bind variables then shows error below. Seems the parameters are tried to get accessed despite not being set (code location)

Driver Version 42.7.4

Java Version 21.0.4 (probably not related to the issue)

PostgreSQL Version 13.6 (probably not related to the issue)

Logs

org.postgresql.util.PSQLException: Unable to convert bytea parameter at position 0 to literal
	at org.postgresql.core.v3.SimpleParameterList.toString(SimpleParameterList.java:262) ~[?:?]
	at org.postgresql.core.NativeQuery.toString(NativeQuery.java:58) ~[?:?]
	at org.postgresql.core.v3.SimpleQuery.toString(SimpleQuery.java:58) ~[?:?]
	at org.postgresql.jdbc.PgPreparedStatement.toString(PgPreparedStatement.java:1081) ~[?:?]
	at java.lang.String.valueOf(String.java:4465) ~[?:?]
	at some.code.doing.PreparedStatementToString.toString(PreparedStatementToString.java:111) ~[?:?]
	...
Caused by: java.io.EOFException: Premature end of input stream, expected 1,057 bytes, but only read 0.
	at org.postgresql.util.PGbytea.toPGLiteral(PGbytea.java:199) ~[?:?]
	at org.postgresql.core.v3.SimpleParameterList.toString(SimpleParameterList.java:253) ~[?:?]
	... 102 more

MrEasy avatar Aug 26 '24 08:08 MrEasy

Seems odd to call toString without setting the parameters, but it appears to be a bug Thanks for the report

davecramer avatar Aug 26 '24 10:08 davecramer

Agree, it is a bit odd - background: #toString called by ConcurrentSkipListSet, into which the PreparedStatements are inserted for a prepared-statement-cache. Thanks for looking into it.

MrEasy avatar Aug 26 '24 11:08 MrEasy

What is the purpose of calling toString there then?

vlsi avatar Aug 26 '24 11:08 vlsi

What is the purpose of calling toString there then?

The ConcurrentSkipListMap does this for comparing the entries (PLStatement below is a wrapper around PreparedStatement)- not knowing of the characteristics of the object being put into it of course.

org.postgresql.util.PSQLException: Unable to convert bytea parameter at position 0 to literal
	at org.postgresql.core.v3.SimpleParameterList.toString(SimpleParameterList.java:262) ~[?:?]
	at org.postgresql.core.NativeQuery.toString(NativeQuery.java:58) ~[?:?]
	at org.postgresql.core.v3.SimpleQuery.toString(SimpleQuery.java:58) ~[?:?]
	at org.postgresql.jdbc.PgPreparedStatement.toString(PgPreparedStatement.java:1081) ~[?:?]
	at java.lang.String.valueOf(String.java:4465) ~[?:?]
	at com.xxx.locator.datasources.PLStatement.toString(PLStatement.java:756) ~[?:?]
	at com.xxx.locator.datasources.PLStatement.compareTo(PLStatement.java:536) ~[?:?]
	at com.xxx.locator.datasources.PLStatement.compareTo(PLStatement.java:26) ~[?:?]
	at java.util.concurrent.ConcurrentSkipListMap.cpr(ConcurrentSkipListMap.java:393) ~[?:?]
	at java.util.concurrent.ConcurrentSkipListMap.doPut(ConcurrentSkipListMap.java:646) ~[?:?]
	at java.util.concurrent.ConcurrentSkipListMap.putIfAbsent(ConcurrentSkipListMap.java:1787) ~[?:?]
	at java.util.concurrent.ConcurrentSkipListSet.add(ConcurrentSkipListSet.java:243) ~[?:?]

MrEasy avatar Aug 26 '24 12:08 MrEasy

Do you control the cache? Have you tried going without the cache?

vlsi avatar Aug 26 '24 12:08 vlsi

By the way, it looks like the statement has InputStream for a bytea placeholder, so .toString() consumes the stream, and subsequent .toString() calls fail as they can no longer consume the stream.

Previously, toString() returned ? for bytea, so it was not an issue. It looks like the solution here would be to detect .toString() and store all the corresponding InputStreams buffer so we can replay them.


In any case, it looks like adding statements to ConcurrentSkipListSet does not sound good, and I believe it should be removed.

PLStatement below is a wrapper around PreparedStatement)- not knowing of the characteristics of the object being put into it of course

If you do not care about the contents of PreparedStatement (.toString() does not compare the statements perfectly anyway), you could rather use System.identityHashCode(preparedStatement) for the statement comparison. It would yield a stable comparison, and it would not cause excessive resource consumption within .toString().

However, I still think disabling the statement cache would be the right thing as it would reduce CPU consumption, it would reduce memory consumption, and it would probably improve the performance at the same time.

vlsi avatar Aug 26 '24 12:08 vlsi

Thanks for the input rg. the cache - yes will look into it. But it's just an example, basically any developer might decide to call #toString on an object - for whatever reason, could be some debugging code, printing it to log - so should not fail of course.

MrEasy avatar Aug 27 '24 14:08 MrEasy

storing the input stream seems like an awful lot of overhead just to allow calls to toString

davecramer avatar Aug 28 '24 18:08 davecramer

The following code

PreparedStatement pstmt = con.prepareStatement("INSERT INTO bytetable (id, b) VALUES (?,?)");

prints

INSERT INTO bytetable (id, b) VALUES (?,?)

I'm guessing something else needs to be done to evoke this exception ?

davecramer avatar Aug 28 '24 18:08 davecramer

By the way, it looks like the statement has InputStream for a bytea placeholder, so .toString() consumes the stream, and subsequent .toString() calls fail as they can no longer consume the stream.

Previously, toString() returned ? for bytea, so it was not an issue. It looks like the solution here would be to detect .toString() and store all the corresponding InputStreams buffer so we can replay them.

Do we really want to output the input stream in toString()? Seems to me that outputting ? would be fine.

Dave

davecramer avatar Aug 28 '24 19:08 davecramer

@davecramer you are right, a simple prepare + toString does not reproduce it. It actually has to get invoced twice and also probably the values assigned. The following added test case reproduces the issue:

diff --git a/pgjdbc/src/test/java/org/postgresql/test/jdbc2/PreparedStatementTest.java b/pgjdbc/src/test/java/org/postgresql/test/jdbc2/PreparedStatementTest.java
--- a/pgjdbc/src/test/java/org/postgresql/test/jdbc2/PreparedStatementTest.java	(revision e33be5c0481c22f4242a5d7ef2d2c09c8a17179f)
+++ b/pgjdbc/src/test/java/org/postgresql/test/jdbc2/PreparedStatementTest.java	(date 1724921450096)
@@ -7,6 +7,7 @@
 
 import static org.junit.Assert.assertEquals;
 import static org.junit.Assert.assertFalse;
+import static org.junit.Assert.assertNotNull;
 import static org.junit.Assert.assertNull;
 import static org.junit.Assert.assertThrows;
 import static org.junit.Assert.assertTrue;
@@ -236,6 +237,28 @@
     pstmt.executeUpdate();
     pstmt.close();
   }
+
+  @Test
+  public void testToStringOnPreparedStatement() throws SQLException {
+    PreparedStatement pstmt =
+        con.prepareStatement("INSERT INTO streamtable VALUES (?,?)");
+
+    byte[] buf = new byte[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
+    ByteArrayInputStream byteStream = new ByteArrayInputStream(buf);
+
+    pstmt.setBinaryStream(1, byteStream, buf.length);
+    pstmt.setString(2, "test");
+
+    String pstmtString = pstmt.toString();
+    assertNotNull(pstmtString);
+
+    // 2nd invoke of #tostring reproduces #3365
+    // throws exception: org.postgresql.util.PSQLException: Unable to convert bytea parameter at position 0 to literal
+    pstmtString = pstmt.toString();
+    assertNotNull(pstmtString);
+
+    pstmt.close();
+  }
 
   @Test
   public void testTrailingSpaces() throws SQLException {

MrEasy avatar Aug 29 '24 08:08 MrEasy

So back to my question.

Do we really want to output the information in the InputStream when toString() is called. This is potentially a very large byte array.

The previous behaviour was to output ? for the data.

davecramer avatar Aug 29 '24 11:08 davecramer

I think it should be fine to skip the input stream in statement.toString

vlsi avatar Aug 29 '24 11:08 vlsi

I think it should be fine to skip the input stream in statement.toString

Perfectly fine from my user perspective. Care for a PR for the test-case above?

MrEasy avatar Aug 29 '24 12:08 MrEasy

I started something https://github.com/pgjdbc/pgjdbc/pull/3368 but it appears to be more complicated. Feel free to have a go

davecramer avatar Aug 29 '24 12:08 davecramer

Only the test case - I got no fix :)

MrEasy avatar Aug 29 '24 13:08 MrEasy

@davecramer Please have a look at this alternative to the solution in #3368

kimjand avatar Nov 18 '24 19:11 kimjand

Bit sad, that the change did not make it in yesterday's release, as this is preventing us from updating the driver. From what I see a fix is there, but needs a review?

MrEasy avatar Jan 15 '25 10:01 MrEasy

BTW here is an example where MSJDBC has implemented tostring for (better) JPA Slow-execution logging (which also uses PS#toString(), at least in hibernate case)

https://github.com/microsoft/mssql-jdbc/issues/2086

So it’s a reasonable expectation that tostring is used, should not corrupt the Parameters and allows at least a rough idea what Statement is (keep ? Placeholders is Fine) in question (it doesnt make a stable Cache key function Mandates, though, but thats not what the bug is about).

ecki avatar Jan 22 '25 17:01 ecki

Before I look at the link above can you tell me the license ? Also PR's are welcome

davecramer avatar Jan 22 '25 18:01 davecramer

MSJDBC is MIT license, but the code is not that interesting, it’s just the fact that other drivers also print the Statement, for good reasons like Hibernate/JPA logging or just debugging.

ecki avatar Jan 22 '25 19:01 ecki

PR is here: https://github.com/pgjdbc/pgjdbc/pull/3369 as @kimjand pointed out in comment above

MrEasy avatar Jan 22 '25 21:01 MrEasy

PR needs work, can it be rebased to start with ?

davecramer avatar Jan 22 '25 21:01 davecramer

PR needs work, can it be rebased to start with ?

@davecramer Updated https://github.com/pgjdbc/pgjdbc/pull/3369 @kimjand adapted your changes to main My testcase (part of PR) passes with them

MrEasy avatar Jan 27 '25 08:01 MrEasy

Hello there.

I think I encountered this very problem in slightly different context.

For me a batch insert is failing when one of the statements in the batch raises an exception (in my case primary key violation). When this happens it is the PostgreSQL JDBC driver that internally invokes PreparedStatement.toString() in BatchResultHandler.java#L156.

I was able to bypass the issue by setting logServerErrorDetail=false in JDBC URL but it is not ideal.

I am sharing this to illustrate the issue described here might be more severe than it appears to be. It is not strictly limited to end user call of PreparedStatement.toString(). The PostgreSQL JDBC driver itself makes such a call in BatchResultHandler and perhaps even someplace else.

miroslav-vacek avatar Feb 26 '25 14:02 miroslav-vacek

I'd like to mention that this is also affecting PuppetDB, and I also opened a bug report in Debian to track this issue: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1098830

jcharaoui avatar Feb 26 '25 14:02 jcharaoui

Interesting, thanks for the report.

davecramer avatar Feb 26 '25 14:02 davecramer

We have also this issue with the use of Tomcat JDBCStore.

Image

PenghaiZhang avatar Mar 20 '25 06:03 PenghaiZhang

While I definitely don't know pgjdbc well enough to have a good sense whether this is a reasonable fix for now, and even if it were, I'd guess it's too naive, this patch (given @davecramer's question) does appear to fix at least the PuppetDB issue:

diff --git a/pgjdbc/src/main/java/org/postgresql/core/v3/SimpleParameterList.java b/pgjdbc/src/main/java/org/postgresql/core/v3/SimpleParameterList.java
index 30528ee7..b318c665 100644
--- a/pgjdbc/src/main/java/org/postgresql/core/v3/SimpleParameterList.java
+++ b/pgjdbc/src/main/java/org/postgresql/core/v3/SimpleParameterList.java
@@ -250,21 +250,7 @@ class SimpleParameterList implements V3ParameterList {
     String textValue;
     String type;
     if (paramTypes[index] == Oid.BYTEA) {
-      try {
-        return PGbytea.toPGLiteral(paramValue);
-      } catch (Throwable e) {
-        Throwable cause = e;
-        if (!(cause instanceof IOException)) {
-          // This is for compatibilty with the similar handling in QueryExecutorImpl
-          cause = new IOException("Error writing bytes to stream", e);
-        }
-        throw sneakyThrow(
-            new PSQLException(
-                GT.tr("Unable to convert bytea parameter at position {0} to literal",
-                    index),
-                PSQLState.INVALID_PARAMETER_VALUE,
-                cause));
-      }
+        return "?";
     }
     if ((flags[index] & BINARY) == BINARY) {
       // handle some of the numeric types

Even if the change is inadequate upstream, it'd be useful to know if it might be acceptable for say Debian in the shorter term because the regression has complicated getting the newer PuppetDB into the next Debian stable release before the freeze (happening now).

Thanks for any help

rbrw avatar Apr 01 '25 17:04 rbrw

Would anyone be unhappy with @rbrw 's patch applied? If this doesn't get fixed upstream, I might put the patch into the Debian package to get the PuppetDB issue resolved.

df7cb avatar Apr 10 '25 09:04 df7cb