
Better length handling for JSON fields required.

rPraml opened this issue

I need to give a bit of background on why I think this extension to Ebean makes sense.

Expected behavior

  • What ebean writes, it should also be able to read again

Actual behavior

  • With the update to Jackson 2.15.0, the following code fails:

Steps to reproduce

    String s = new String(new char[5_000_001]).replace('\0', 'x');

    EBasicJsonMap bean = new EBasicJsonMap();
    bean.setName("b1");
    bean.setContent(Map.of("string", s));
    DB.save(bean); // Expected: Throw an error that data is too long here.

    DB.find(EBasicJsonMap.class, bean.getId()); // Actual: throws LoadError, because String is too long

See the discussions https://github.com/FasterXML/jackson-core/issues/863 and https://github.com/FasterXML/jackson-core/issues/1014. Long story short: Jackson 2.15.0 refuses to deserialize strings longer than 5M characters (raised to 20M in 2.15.1).
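For reference, Jackson 2.15+ exposes this limit through `StreamReadConstraints`. A minimal workaround sketch (not the fix proposed in this issue; the chosen limit here is arbitrary):

```java
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.StreamReadConstraints;
import com.fasterxml.jackson.databind.ObjectMapper;

public class RaisedLimitMapper {
    public static void main(String[] args) {
        // Raise Jackson's max string length (default 5M in 2.15.0, 20M from 2.15.1)
        JsonFactory factory = JsonFactory.builder()
            .streamReadConstraints(StreamReadConstraints.builder()
                .maxStringLength(50_000_000) // arbitrary higher limit, for illustration
                .build())
            .build();
        // A mapper built on this factory can read strings that the
        // default constraints would reject with a StreamConstraintsException.
        ObjectMapper mapper = new ObjectMapper(factory);
    }
}
```

Raising the limit only papers over the symptom, though; it does nothing to stop oversized JSON from being written in the first place, which is what the rest of this issue is about.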

This change in Jackson crashes our application. All in all, writing such long strings as JSON is (a) a bad idea and (b) in our case caused by a programming error that produced an oversized JSON string.

Until now we have used @DbJson without specifying a length attribute, which effectively defaults to 2G on most platforms. We've learned our lesson that this can cause further trouble when some code produces JSON that is too big, so we now tend to specify explicit lengths on these properties.

Unfortunately, this causes further issues:

  • while @DbJson(length = 5000) works fine on H2, it fails on SqlServer when creating the column as nvarchar(5000) with the error The size (5000) given to the parameter 'content' exceeds the maximum allowed (4000)
  • similar limits exist on other DBMSs (MariaDB supports up to 64K bytes per row; depending on the charset etc. each character may consume more bytes)
  • unfortunately, we often have very large limits (16K or even 1M for complex configuration documents), so varchar is not always the best option and we need to fall back to "clob"
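The platform-aware type selection described above could be sketched roughly like this (the class, method, and threshold values here are hypothetical illustrations, not Ebean code):

```java
// Hypothetical sketch: pick a JSON column definition from the declared length,
// working around per-platform varchar limits. Not actual Ebean API.
public class JsonColumnType {

    static final int SQLSERVER_NVARCHAR_MAX = 4000; // nvarchar(n) allows at most n = 4000
    static final int CLOB_THRESHOLD = 8 * 1024;     // assumed cutoff beyond which CLOB is preferred

    static String columnDefinition(String platform, int length) {
        if ("sqlserver".equals(platform)) {
            // nvarchar(5000) is invalid on SqlServer; fall back to nvarchar(max)
            return length > SQLSERVER_NVARCHAR_MAX
                ? "nvarchar(max)"
                : "nvarchar(" + length + ")";
        }
        // other platforms: switch to clob above the threshold
        return length > CLOB_THRESHOLD ? "clob" : "varchar(" + length + ")";
    }

    public static void main(String[] args) {
        System.out.println(columnDefinition("h2", 5000));         // varchar(5000)
        System.out.println(columnDefinition("sqlserver", 5000));  // nvarchar(max)
        System.out.println(columnDefinition("h2", 16384));        // clob
    }
}
```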

What I want to change is mainly in PR #3102, but I think I have to break this up into several parts:

  • When a certain length is exceeded (e.g. 8K) always use CLOB: https://github.com/ebean-orm/ebean/pull/3102/files#diff-a0869c5d693887c73e19540c58ffa503eb8fa6584bbad05c1ef0e9d5c363626cR163
  • Handle "specials" for SqlServer nvarchar (use 'max' if value > 4000)
  • Support an "active" length check by ebean before data is written to the DB (to avoid writing too-long strings to clob fields)
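The "active" length check from the last bullet could look roughly like this: compare the serialized JSON against the declared column length and fail on save, rather than failing later on load. This is a hypothetical helper under assumed names, not code from the PR:

```java
// Hypothetical sketch of an active length check before persisting JSON.
// The declared length would come from @DbJson(length = ...); here it is a parameter.
public class JsonLengthCheck {

    static class DataTooLongException extends RuntimeException {
        DataTooLongException(String msg) { super(msg); }
    }

    /** Returns the JSON unchanged, or throws if it exceeds the declared length. */
    static String checkLength(String json, String property, int declaredLength) {
        if (json.length() > declaredLength) {
            throw new DataTooLongException(
                "JSON for '" + property + "' is " + json.length()
                + " chars, exceeding the declared length " + declaredLength);
        }
        return json;
    }

    public static void main(String[] args) {
        // Small payload passes the check unchanged
        System.out.println(checkLength("{\"string\":\"xxx\"}", "content", 5000));

        // Oversized payload is rejected at save time, not at load time
        String big = "{\"string\":\"" + "x".repeat(5_000_001) + "\"}";
        try {
            checkLength(big, "content", 5000);
        } catch (DataTooLongException e) {
            System.out.println("rejected on save");
        }
    }
}
```

Failing on save would give the behavior listed under "Expected behavior" above: what Ebean writes, it can also read back.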

@rbygrave I know the root cause is mainly Jackson-related, but it shows that length checking for JSON is quite limited in ebean (at least when you have to deal with JSON lengths > 4000), so it would be great to get some feedback here.

rPraml, Jun 13 '23