jackson-dataformats-binary icon indicating copy to clipboard operation
jackson-dataformats-binary copied to clipboard

streamReadConstaints support in Ion dataformat

Open pjfanning opened this issue 1 year ago • 6 comments

  • I was looking for cases where TextBuffers are created and to changeover to the ReadConstrainedTextBuffer (as part of https://github.com/FasterXML/jackson-dataformats-binary/pull/357)
  • Ion code base only uses a TextBuffer in deprecated code but the javadocs provide no hint of what new code to use and if the deprecated code is planned for removal In IonFactory, there is
    @Deprecated
    protected String _readAll(Reader r, IOContext ctxt) throws IOException
    {
        // Let's use Jackson's efficient aggregators... better than JDK defaults
        TextBuffer tb = ctxt.constructTextBuffer();
  • so do we need to update this code, or ignore it?
  • I wrote a test nonetheless. I know nothing about this Ion code but I copied an existing test to see if I could make it fail by setting a low maxStringSize. It does not fail.
    @Test
    public void testSimpleLowStringLimit() throws IOException {
        Bean original = new Bean("parent_field", new ChildBean("child_field"));
        IonFactory ionFactory = new IonFactoryBuilder(false)
                .streamReadConstraints(StreamReadConstraints.builder().maxStringLength(1).build())
                .build();
        IonObjectMapper mapper = new IonObjectMapper(ionFactory);
        mapper.registerModule(new IonAnnotationModule());
        String serialized = mapper.writeValueAsString(original);
        //the next call should fail because the streamReadConstraints have such a small maxStringLength
        Bean deserialized = mapper.readValue(serialized, Bean.class);

        assertEquals(original.field, deserialized.field);
        assertEquals(original.child.someField, deserialized.child.someField);
    }
  • IonParser has its own code for getText() and theory that can be changed to validate the string lengths.
  • I know nothing about Ion but it looks an IonReader provided by an AWS lib does the real parsing - so we may only be check the string len after that call (with the risk the string is huge already)
  • maybe there is a way to get IonReader to check (which would be better because it could catch the issue earlier)

pjfanning avatar Feb 16 '23 22:02 pjfanning

Right, Ion codec used under the hood is from Amazon's library, and that would need to co-operate here.

Ion maintainers may be able to help here: until then, maybe we should add test under .../failing and leave this issue open? Altough I technically wrote version 0.5-alpha of this format backend, as a proof-of-concept (... and it was eventually contributed to Jackson project couple of years after I had left Amazon :-D ), I am not familiar enough with it at this point to be able to make changes. And possibly unable to if support at library level was needed (I wasn't involved with actual Ion codec development).

cowtowncoder avatar Feb 17 '23 03:02 cowtowncoder

@cowtowncoder is there someone in Amazon Ion team that could be approached?

It is probably safe to say that none of the StreamReadConstraints stuff works for this format right now (including the num size limits) - or IonReader could have something of its own.

pjfanning avatar Feb 17 '23 13:02 pjfanning

Right, for now Ion doesn't get constraints, beyond what Amzn codec might offer out of the box.

Yes, there are some maintainers

  • release-notes/VERSION-2.x mentions @tgregg
  • README.md lists @mcliedtke

I should probably figure if one of them is considered active (or perhaps someone else, I recall seeing other accounts too) and then update documentation.

cowtowncoder avatar Feb 17 '23 22:02 cowtowncoder

I've opened the above-linked feature request within the ion-java repository. This functionality isn't currently provided, but the use case is pretty easy to understand, so we (the Ion maintainers) should consider it.

tgregg avatar Feb 17 '23 22:02 tgregg

That's good, and I think there could be two ways to configure limits: native Ion (when providing low-level components to Jackson factory), or, if limit-options match, using StreamReadConstraints to configure Ion side.

cowtowncoder avatar Feb 17 '23 22:02 cowtowncoder

In jackson 2.16 (when released), there will also be a limit on the size of the object names. 50,000 chars or bytes depending on the input context.

pjfanning avatar Aug 29 '23 08:08 pjfanning