Circe throws an exception when parsing a nested JSONB field that is null.
If a query returns a nested structure from a JSON(B) field, the nested field must be absent for the query to work. If the field is present but set to JSON null, circe throws an exception.
Example query:
sql"SELECT address->'coordinates' FROM profile"
Definition of the Coordinates and Address types:
case class Coordinates(latitude: Double, longitude: Double)
case class Address(coordinates: Option[Coordinates])
Definition of the custom mapping:
given Meta[Coordinates] = new Meta(pgDecoderGet, pgEncoderPut)
given Meta[Address] = new Meta(pgDecoderGet, pgEncoderPut)
The query works for the following contents of the address column:
NULL
{}
The query throws an exception for the following contents of the address column:
{"coordinates":null}
The exception message is: io.circe.DecodingFailure$DecodingFailureImpl: DecodingFailure at : Got value 'null' with wrong type, expecting object
In Postgres, a JSON null and a SQL NULL are different, so you probably need to cast it to a SQL NULL in the case of {"coordinates":null}. I think using ->> instead of -> may be what you need?
https://mbork.pl/2020-02-15_PostgreSQL_and_null_values_in_jsonb
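To make the distinction concrete, here is a plain-Scala sketch of why the two kinds of null take different code paths (no doobie or circe involved; `decodeCoordinates` is a hypothetical stand-in for the circe decoder, and String stands in for a raw JSON value):

```scala
// SQL NULL arrives from the driver as None and short-circuits before any
// decoding runs; a JSON null arrives as a *present* value and reaches the
// decoder, which rejects it with a "wrong type" failure.
def decodeCoordinates(raw: String): Either[String, (Double, Double)] =
  if (raw == "null") Left("Got value 'null' with wrong type, expecting object")
  else Right((10.0, 10.0)) // stand-in for a successful circe decode

val sqlNull: Option[String]  = None         // column is SQL NULL
val jsonNull: Option[String] = Some("null") // column holds a JSON null

sqlNull.map(decodeCoordinates)  // None: the decoder never runs
jsonNull.map(decodeCoordinates) // Some(Left(...)): the decoder runs and fails
```

The Option layer only handles the absent case, which is why a present JSON null slips past it and into the decoder.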
If it still doesn't work for you, can you help by providing a runnable minimal reproduction of the issue?
Thank you for your suggestions, @jatcwang.
Yep, I'm aware of the differences in arrows.
In this simple case, ->> would solve the issue for null, but it would create an issue for non-null values. For example, given a nested object like the following:
{"address": {"coordinates":{"latitude": 10, "longitude":10}}} the query address->>'coordinates' will fail to convert to Coordinates. The reason is that ->> converts the result to TEXT, so we no longer have JSON, leading to a different kind of exception.
If it still doesn't work for you, can you help by providing a runnable minimal reproduction of the issue?
Here we go
Thanks @slavaschmidt, can you try something like profile->>'address'->>'coordinates' :: jsonb? That way, doobie only sees a non-null JSONB or a SQL NULL.
Ideally the circe integration would be able to handle top-level JSON nulls like your examples, but I will need to think about it more, because Read[Option[A]] does not (and should not) know anything about the underlying implementation of Read[A].
Thanks @jatcwang, unfortunately, this won't work. The first double-arrow returns TEXT, so the next double-arrow can't be applied. I checked to be sure, and the exception is: org.postgresql.util.PSQLException: ERROR: operator does not exist: text ->> unknown
Ah, I see, sorry. Using the #>> operator seems to give the expected answer for all 4 of your examples.
e.g. I replaced line 53 in your example with
brokenNull <- sql"SELECT (profile #>> '{address,coordinates}') :: jsonb FROM customer".query[Option[Coordinates]].to[List]
and it yields List(Some(Coordinates(Some(1),Some(2))), None, None, None)
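For intuition, here is a toy model of the text-mode path extraction that #>> performs (plain Scala with a hypothetical `pathText` helper and a minimal JSON ADT, not PostgreSQL's actual implementation): both a missing key and a JSON null at the end of the path collapse to "no value", which the driver then reports as SQL NULL.

```scala
// Minimal JSON ADT for illustration purposes only.
sealed trait Json
case object JNull extends Json
final case class JObj(fields: Map[String, Json]) extends Json
final case class JStr(value: String) extends Json

// Walk a path and return the final value as text, like `#>>`.
def pathText(j: Json, path: List[String]): Option[String] = path match {
  case Nil =>
    j match {
      case JNull   => None // a JSON null at the end of the path -> no value
      case JStr(s) => Some(s)
      case other   => Some(other.toString)
    }
  case key :: rest =>
    j match {
      case JObj(fields) => fields.get(key).flatMap(pathText(_, rest)) // missing key -> no value
      case _            => None
    }
}

val profile = JObj(Map("address" -> JObj(Map("coordinates" -> JNull))))
pathText(profile, List("address", "coordinates")) // None, i.e. SQL NULL
```

Because every "empty" shape ends up as None, the outer Option decoder sees the same thing in all four cases.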
The workaround works. Thank you very much for your help! Still, I believe it would be great if a JSON null were interpreted the same way as a SQL NULL: as an empty Option, rather than throwing an exception.
@slavaschmidt,
Another quick workaround that allows using
SELECT profile->'address'->'coordinates'
is to split the data processing into two steps: query it as Option[Json] first, then decode it as Option[Coordinates] afterwards:
sql"SELECT profile->'address'->'coordinates' FROM customer WHERE id = <any ID from the example>"
  .query[Option[Json]]
  .unique
  .flatMap(_.flatTraverse {
    _.as[Option[Coordinates]].liftTo[ConnectionIO]
  })
This emits correct results regardless of whether the query returns SQL NULL or JSON NULL.
Hi @satorg, thanks for the hint! This is precisely where I'm coming from.
The issue is that with dozens of tables and lots of JSON columns, there are currently hundreds of lines of code dedicated to this manual parsing approach in our code base. Moreover, when a query returns multiple JSON columns, one must either work with tuples, which hurts readability, or create two different case classes: one with JSON fields and another with correctly typed ones.
Using custom mappings elegantly solves both of these deficiencies.
@jatcwang, the above workaround can be converted into a more generic solution.
To accomplish that, we need a constructor that builds a Read (instead of a Get) straight from a Decoder, i.e. something like the following:
def pgDecoderReadOption[A: Decoder]: Read[Option[A]] = {
  val getJson = Get[Json]
  new Read(
    // To be honest, I cannot understand why we need to store `Get` in `Read`.
    // Doobie does it everywhere but doesn't seem to use the stored value.
    List((getJson, Nullability.Nullable)),
    { (rs, n) =>
      getJson
        .unsafeGetNullable(rs, n)
        .flatTraverse(_.as[Option[A]])
        // Unfortunately, cannot avoid `throw` from `Read` ¯\_(ツ)_/¯
        .valueOr(throw _)
    }
  )
}
Having such a constructor, a user can create their own type-specific Read instance:
implicit val coordOptRead: Read[Option[Coordinates]] = pgDecoderReadOption[Coordinates]
And now the original query from the example works for any id as well:
sql"SELECT profile->'address'->'coordinates' FROM customer WHERE id = <any ID from the example>"
.query[Option[Coordinates]]
.unique
Unfortunately, we cannot make pgDecoderReadOption itself implicit, since doing so would create a lot of implicit-resolution collisions.