jackson-databind icon indicating copy to clipboard operation
jackson-databind copied to clipboard

Intermittent failure to serialize payloads with `boolean` field (jackson 2.15.2)

Open fc76gsk opened this issue 1 year ago • 24 comments

Search before asking

  • [X] I searched in the issues and found nothing similar.

Describe the bug

getting - failed to serialize payload error for boolean field intermittently with java liberty servers, few times a week and then all the requests are getting same error until the server is restarted. we are using java 1.8, ibm websphere application server liberty 21.0.0.9 2.15.2 version of jackson databind

Version Information

2.15.2

Reproduction

<-- Any of the following

  1. Brief code sample/snippet: include here in preformatted/code section
  2. Longer example stored somewhere else (diff repo, snippet), add a link
  3. Textual explanation: include here -->
// Your code here
public class BaseRequest{
private boolean isStatus = false;
//getters, setters, few other fields
}

The payload is not having this field being sent, but intermittently we are getting failed to serialize payload error for this and then the particular server on which it started this error will start for all the other api calls and the server becomes non-responsive, after restart of the server the server becomes normal and no error for most of the time, and then happens once a week or two weeks

Expected behavior

no serialization error

Additional context

No response

fc76gsk avatar Sep 24 '24 20:09 fc76gsk

We prefer to get cases where the problem is minimised to just involving Jackson. Noone is going to voluntarily try to reproduce your scenario which is far from fully described - which if I repeat it here, involves purchasing licenses for Websphere, setting up an instance, coding and deploying a POC and sending it messages for a week or 2 and then debugging.

Have you even looked at the possibility that some of the messages are malformed and how your code reacts to those malformed messages?

pjfanning avatar Sep 24 '24 20:09 pjfanning

The same requests work on other servers, same server after the server restart.

fc76gsk avatar Sep 24 '24 21:09 fc76gsk

It could be anything: maybe your JVM server got corrupted, maybe there's memory problem, firewall problem, could be anything. 🤷‍♂️

You should get your team's/company's senior devs to investigate this.

yihtserns avatar Sep 24 '24 21:09 yihtserns

And one additional note: 2.15.2 is not the latest version, so you'd really want to try out with either latest release (2.17.2) or pre-release (2.18.0-rc1). There may be later fixes and no one wants to waste time on chasing already fixed bugs.

cowtowncoder avatar Sep 25 '24 15:09 cowtowncoder

we had ibm check the issue and they have confirmed it is not related to jvm server or memory problem, firewall problem. we have had multiple senior developers check on this, we all think it is related to serialization, the behavior changed suddenly and the surrounding code has been the same since 2021, the same code works all the time, just during some instances this fails and then impacts all the requests on that particular server and spreads to other parent apps and other servers.

fc76gsk avatar Sep 27 '24 13:09 fc76gsk

What we need is a reproduction. Jackson-only reproduction to be more specific. Without seeing how it is used, we cannot tell what is causing the problem

JooHyukKim avatar Sep 27 '24 13:09 JooHyukKim

This is just my opinion. This issue is super hard to reproduce and there are lots of moving parts. If this was my problem, I would try replacing the code that I suspect. There are lots of alternative libraries for working with JSON. In a few weeks, you will know if replacing the code has fixed the issue. If it hasn't you need to then focus on something else to find the cause.

pjfanning avatar Sep 27 '24 13:09 pjfanning

Can you at least share the stack trace of the error?

yihtserns avatar Sep 27 '24 14:09 yihtserns

@fc76gsk Are you seriously not going to include any useful information here?!?! What the #%(%^ is "failed to serialize payload error"?

Stack trace is absolutely needed, even if no reproduction available.

Similarly "the behavior changed suddenly and the surrounding code has been the same since 2021" -- suddenly when? Upgrading Jackson version? If so, from which (and to which) version? Did you try newer versions? 2.15 hasn't been latest for a while.

I would not immediately recommend "use something other than Jackson" like @pjfanning often suggests. But without more information there's very little we can do.

cowtowncoder avatar Sep 27 '24 14:09 cowtowncoder

We are not able to reproduce this issue. Here is the stack trace: message=Service validation error. com.JavaException: txid=xxxxxxxxx: Failed to serialize payload of com.JavaClass at com.ibm.ws.webcontainer.servlet.ServletWrapper.service(ServletWrapper.java:1258) [com.ibm.ws.webcontainer_1.1.56.jar:?] at com.ibm.ws.webcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:746) [com.ibm.ws.webcontainer_1.1.56.jar:?] at com.ibm.ws.webcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:443) [com.ibm.ws.webcontainer_1.1.56.jar:?] at com.ibm.ws.webcontainer.filter.WebAppFilterManager.invokeFilters(WebAppFilterManager.java:1227) [com.ibm.ws.webcontainer_1.1.56.jar:?] at com.ibm.ws.webcontainer.filter.WebAppFilterManager.invokeFilters(WebAppFilterManager.java:1011) [com.ibm.ws.webcontainer_1.1.56.jar:?] at com.ibm.ws.webcontainer.servlet.CacheServletWrapper.handleRequest(CacheServletWrapper.java:75) [com.ibm.ws.webcontainer_1.1.56.jar:?] at com.ibm.ws.webcontainer.WebContainer.handleRequest(WebContainer.java:938) [com.ibm.ws.webcontainer_1.1.56.jar:?] at com.ibm.ws.webcontainer.osgi.DynamicVirtualHost$2.run(DynamicVirtualHost.java:279) [com.ibm.ws.webcontainer_1.1.56.jar:?] at com.ibm.ws.http.dispatcher.internal.channel.HttpDispatcherLink$TaskWrapper.run(HttpDispatcherLink.java:1159) [com.ibm.ws.transport.http_1.0.56.jar:?] at com.ibm.ws.http.dispatcher.internal.channel.HttpDispatcherLink.wrapHandlerAndExecute(HttpDispatcherLink.java:428) [com.ibm.ws.transport.http_1.0.56.jar:?] at com.ibm.ws.http.dispatcher.internal.channel.HttpDispatcherLink.ready(HttpDispatcherLink.java:387) [com.ibm.ws.transport.http_1.0.56.jar:?] at com.ibm.ws.http.channel.internal.inbound.HttpInboundLink.handleDiscrimination(HttpInboundLink.java:566) [com.ibm.ws.transport.http_1.0.56.jar:?] at com.ibm.ws.http.channel.internal.inbound.HttpInboundLink.handleNewRequest(HttpInboundLink.java:500) [com.ibm.ws.transport.http_1.0.56.jar:?] at com.ibm.ws.http.channel.internal.inbound.HttpInboundLink.processRequest(HttpInboundLink.java:360) [com.ibm.ws.transport.http_1.0.56.jar:?] at com.ibm.ws.http.channel.internal.inbound.HttpICLReadCallback.complete(HttpICLReadCallback.java:70) [com.ibm.ws.transport.http_1.0.56.jar:?] at com.ibm.ws.channel.ssl.internal.SSLReadServiceContext$SSLReadCompletedCallback.complete(SSLReadServiceContext.java:1824) [com.ibm.ws.channel.ssl_1.0.56.jar:?] at com.ibm.ws.tcpchannel.internal.WorkQueueManager.requestComplete(WorkQueueManager.java:504) [com.ibm.ws.channelfw_1.0.56.jar:?] at com.ibm.ws.tcpchannel.internal.WorkQueueManager.attemptIO(WorkQueueManager.java:574) [com.ibm.ws.channelfw_1.0.56.jar:?] at com.ibm.ws.tcpchannel.internal.WorkQueueManager.workerRun(WorkQueueManager.java:958) [com.ibm.ws.channelfw_1.0.56.jar:?] at com.ibm.ws.tcpchannel.internal.WorkQueueManager$Worker.run(WorkQueueManager.java:1047) [com.ibm.ws.channelfw_1.0.56.jar:?] at com.ibm.ws.threading.internal.ExecutorServiceImpl$RunnableWrapper.run(ExecutorServiceImpl.java:238) [com.ibm.ws.threading_1.1.56.jar:?] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1160) [?:1.8.0] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:1.8.0] at java.lang.Thread.run(Thread.java:818) [?:2.9 (09-11-2019)] Caused by: com.fasterxml.jackson.databind.JsonMappingException: (was java.lang.NullPointerException) (through reference chain: com.JavaClass["isStatus"]) at com.fasterxml.jackson.databind.JsonMappingException.wrapWithPath(JsonMappingException.java:402) ~[com.fasterxml.jackson.core-jackson-databind-2.15.2.jar:2.15.2] at com.fasterxml.jackson.databind.JsonMappingException.wrapWithPath(JsonMappingException.java:361) ~[com.fasterxml.jackson.core-jackson-databind-2.15.2.jar:2.15.2] at com.fasterxml.jackson.databind.ser.std.StdSerializer.wrapAndThrow(StdSerializer.java:323) ~[com.fasterxml.jackson.core-jackson-databind-2.15.2.jar:2.15.2] at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:780) ~[com.fasterxml.jackson.core-jackson-databind-2.15.2.jar:2.15.2] at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:178) ~[com.fasterxml.jackson.core-jackson-databind-2.15.2.jar:2.15.2] at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider._serialize(DefaultSerializerProvider.java:479) ~[com.fasterxml.jackson.core-jackson-databind-2.15.2.jar:2.15.2] at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:318) ~[com.fasterxml.jackson.core-jackson-databind-2.15.2.jar:2.15.2] at com.fasterxml.jackson.databind.ObjectMapper._writeValueAndClose(ObjectMapper.java:4719) ~[com.fasterxml.jackson.core-jackson-databind-2.15.2.jar:2.15.2] at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsString(ObjectMapper.java:3964) ~[com.fasterxml.jackson.core-jackson-databind-2.15.2.jar:2.15.2] ... 35 more Caused by: java.lang.NullPointerException at sun.reflect.GeneratedMethodAccessor961.invoke(Unknown Source) ~[?:?] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55) ~[?:1.8.0] at java.lang.reflect.Method.invoke(Method.java:508) ~[?:1.8.0] at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:688) ~[com.fasterxml.jackson.core-jackson-databind-2.15.2.jar:2.15.2] at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:772) ~[com.fasterxml.jackson.core-jackson-databind-2.15.2.jar:2.15.2] at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:178) ~[com.fasterxml.jackson.core-jackson-databind-2.15.2.jar:2.15.2] at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider._serialize(DefaultSerializerProvider.java:479) ~[com.fasterxml.jackson.core-jackson-databind-2.15.2.jar:2.15.2] at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:318) ~[com.fasterxml.jackson.core-jackson-databind-2.15.2.jar:2.15.2] at com.fasterxml.jackson.databind.ObjectMapper._writeValueAndClose(ObjectMapper.java:4719) ~[com.fasterxml.jackson.core-jackson-databind-2.15.2.jar:2.15.2] at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsString(ObjectMapper.java:3964) ~[com.fasterxml.jackson.core-jackson-databind-2.15.2.jar:2.15.2] ... 35 more

fc76gsk avatar Oct 02 '24 17:10 fc76gsk

@fc76gsk can you share the source for JavaClass?

(I think I can already guess where this is going...)

yihtserns avatar Oct 02 '24 18:10 yihtserns

This is the class

@ApiModel("JavaClass")
public class JavaClass{
@ApiModelProperty(value="status of the message", hidden=true)
private boolean isStatus = false;
//getters, setters, few other fields
}

fc76gsk avatar Oct 02 '24 21:10 fc76gsk

Why are you sharing (incomplete!!!) source code for BaseRequest, when the stack trace is about JavaClass?

yihtserns avatar Oct 02 '24 21:10 yihtserns

this is the same class, updated the name, pls check

fc76gsk avatar Oct 02 '24 21:10 fc76gsk

Please show the getter and setter. And are you sure the type is boolean not Boolean?

yihtserns avatar Oct 02 '24 21:10 yihtserns

@apimodel("JavaClass")
public class JavaClass {
@ApiModelProperty(value="status of the message", hidden=true)
private boolean isStatus = false;
private boolean useEmail = false;

 public boolean isStatus() {
    return isStatus;
 }

public void setIsStatus(boolean isStatus) {
    this.isStatus = isStatus;
}

public boolean isUseEmail() {
    return useEmail;
}

public void setUseEmail(boolean useEmail) {
    this.useEmail = useEmail;
}
}

fc76gsk avatar Oct 02 '24 21:10 fc76gsk

So it wasn't due to unwrapping a null wrapper to primitive, which would've resulted in different exception thrown:

Caused by: java.lang.NullPointerException at <..snip>.MyBean.isStatus(MyBean.java:31) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566)

...while yours is:

Caused by: java.lang.NullPointerException at sun.reflect.GeneratedMethodAccessor961.invoke(Unknown Source) ~[?:?] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55) ~[?:1.8.0] at java.lang.reflect.Method.invoke(Method.java:508) ~[?:1.8.0]

What exactly is that sun.reflect.GeneratedMethodAccessor961??? How does a decompiled JavaClass.class looks like? Is it exactly the same as JavaClass.java? ~~This can't be the case of runtime bytecode transformation doing weird stuff, can it?~~

UPDATE ~~Seems like GeneratedMethodAccessor* has something to do with dynamically generated classes from deserialization or a dynamic proxy?~~

yihtserns avatar Oct 02 '24 22:10 yihtserns

I'm suspecting that the thing being serialised is a proxy/bytecode-manipulated and sometimes it goes bad (eg ThreadLocal problem, or other causes?).

What I recommend is to add some code to call .isStatus() e.g. printing the result to System.out or some Logger just before passing it to Jackson for serialization - I suspect that call will (when the time is right) throw NPE, proving it wasn't a Jackson problem.

That's all I can offer with my limited knowledge. Maybe other people have better thoughts.

yihtserns avatar Oct 02 '24 23:10 yihtserns

At this point we should probably stop @fc76gsk from throwing out questions, without giving answers to the questions asked 🤔

JooHyukKim avatar Oct 03 '24 01:10 JooHyukKim

While vague, it sounds more like something in JDK is having rare issues with Reflection-based Field accessors. If so, it'd unfortunately also be outside of anything Jackson has control over. And I would probably suggest seeing if the patch version of JDK is the latest (it looked like it's Java 8 which is fine)

cowtowncoder avatar Oct 03 '24 02:10 cowtowncoder

just to make it clear again to @JooHyukKim we are unable to reproduce this, this happens for some time and when we restart the server it stops, all the same requests work, what other questions do you want me to answer here, these are provided in the initial issue already

fc76gsk avatar Oct 03 '24 16:10 fc76gsk

At this point I am unfortunately not sure what we could do. Symptoms do not point to anything Jackson could do, change, nor is there a reproduction

As I said, I would look into JDK version, ensure it's the latest. Similarly I would consider newer Jackson versions -- at very least 2.15.4 patch if keeping to 2.15 is required.

Will leave open for a bit in case anyone has any ideas, suggestions.

cowtowncoder avatar Oct 03 '24 23:10 cowtowncoder

Discoveries

So I got curious about GeneratedMethodAccessorXXX, and this is what I learned:

  1. GeneratedMethodAccessorXXX classes are dynamically created by NativeMethodAccessorImpl when .invoke(...) is called on the same Method instance many times: https://github.com/openjdk/jdk8u/blob/e32d62e2a39510f643b32d615e76f0ff9be3d9f3/jdk/src/share/classes/sun/reflect/NativeMethodAccessorImpl.java#L49-L60
    1. They will be created on the 16th Method.invoke(...) call - here's a sample of how they look like:
    public class GeneratedMethodAccessor2 extends MethodAccessorImpl {
      public GeneratedMethodAccessor2() {
      }
    
      public Object invoke(Object var1, Object[] var2) throws InvocationTargetException {
          Boolean var10000 = new Boolean;
          if (var1 == null) {
              throw new NullPointerException();
          } else {
              TestBean var10002;
              try {
                  var10002 = (TestBean)var1;
                  if (var2 != null && var2.length != 0) {
                      throw new IllegalArgumentException();
                  }
              } catch (NullPointerException | ClassCastException var4) {
                  throw new IllegalArgumentException(var4.toString());
              }
    
              boolean var5;
              try {
                  var5 = var10002.isStatus();
              } catch (Throwable var3) {
                  throw new InvocationTargetException(var3);
              }
    
              var10000.<init>(var5);
              return var10000;
          }
      }
    }
    
    1. They will then be used on the 17th Method.invoke(...) call by DelegatingMethodAccessorImpl.
  2. If an exception is thrown by the called accessor method (e.g. NPE thrown trying to convert a null Boolean wrapper to primitive), the accessor method itself will be included in the stack trace:
Caused by: java.lang.NullPointerException
    at <..snip>TestBean.isStatus(NPEGetterTest.java:33)
    at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:688)
    at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:772)
  1. One way to make GeneratedMethodAccessorXXX throw NPE is by Jackson passing null as first arg (i.e. bean) to Method.invoke(...) here: https://github.com/FasterXML/jackson-databind/blob/74a57b2f6cd3b6e0fe2a61ed16734f15ba2abc95/src/main/java/com/fasterxml/jackson/databind/ser/BeanPropertyWriter.java#L688 ...which is not technically possible because:
    1. That arg came from ObjectMapper.writeValueAsString, i.e. what the caller passed to Jackson to serialize.
    2. When that arg is passed from ObjectMapper.writeValueAsString all the way to BeanPropertyWriter.serializeAsField, none of the methods the arg was passed to ever re-assign it.
    3. If somehow it DID happen (it didn't), it would've SUCCESSFULLY serialize as null or an NPE would've been thrown much, much earlier.

Suspicious

Here are the things I found suspicious:

  1. The reporter-provided stack trace says:
    Caused by: com.fasterxml.jackson.databind.JsonMappingException: (was java.lang.NullPointerException) (through reference chain: com.JavaClass["isStatus"])
    ...but when using the provided class to reproduce the issue, I see this:
    com.fasterxml.jackson.databind.JsonMappingException: (was java.lang.NullPointerException) (through reference chain: <..snip>TestBean["status"])
    ...so how did they get isStatus as property name?
  2. I checked the latest OpenJDK's DelegatingMethodAccessorImpl (and the previous versions) and saw that it only contains 49 line-of-code, but we see this in the reporter-provided stack trace:
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55) ~[?:1.8.0]

Conclusion (current)

Ignoring the suspicious isStatus property name, my current hypothesis is:

  1. The reporter's company is using a specially patched/hotfixed JDK/JRE...
  2. ...which has generated buggy GeneratedMethodAccessorXXX class.

yihtserns avatar Oct 12 '24 11:10 yihtserns

Very interesting analysis @yihtserns !

cowtowncoder avatar Oct 13 '24 04:10 cowtowncoder

Nothing actionable here unfortunately, will close.

cowtowncoder avatar Dec 07 '24 18:12 cowtowncoder