efcore icon indicating copy to clipboard operation
efcore copied to clipboard

SQL Server: Support large JSON string properties (>4000)

Open roji opened this issue 3 years ago • 4 comments

Member access with JSON documents is currently translated using JSON_VALUE. That function returns nvarchar(4000), and doesn't support larger values (it either returns null or errors, depending on lax vs. strict mode). The documented workaround on that doc page is to use OPENJSON, so we may want to think about switching to that.

roji avatar Nov 02 '22 21:11 roji

Looked into using OPENJSON as suggested by the SQL Server documentation.

Given a table with a JSON column as follows:

CREATE TABLE [Blogs] (
    [Id] int NOT NULL IDENTITY,
    [Json] nvarchar(max)
);
INSERT INTO [Blogs] ([Json]) VALUES ('
{
  "track": {
    "segments": [
      {
        "name": "Segment 1",
        "location":   [ 47.763, 13.4034 ],
        "start time": "2018-10-14 10:05:14",
        "HR": 73
      },
      {
        "name": "Segment 2",
        "location":   [ 47.706, 13.2635 ],
        "start time": "2018-10-14 10:39:21",
        "HR": 135
      }
    ]
  }
}
');

The name of a specific segment can be projected out as a table as follows:

SELECT * FROM OPENJSON((SELECT [Json] FROM [Blogs]), '$.track.segments[1]')
WITH (Name nvarchar(max) '$.name');

This allows defining the exact expected type - including nvarchar(max). This can be used as a scalar subquery to retrieve any single property from a JSON document:

SELECT * FROM [Blogs] AS b
WHERE (SELECT * FROM OPENJSON(b.Json, '$.track.segments[1]') WITH (SegmentName nvarchar(max) '$.name')) = 'Segment 2';

I doubt the above expression is indexable, so we probably wouldn't want to completely switch to it from JSON_VALUE. One option would be to have a SQL Server-specific metadata opt-in ("IsLargeString"), which makes us switch to this as the technique for fetching the value. Or we could switch to OPENJSON by default, and only use JSON_VALUE when the property is indexed.

Note that multiple properties can be projected in the same OPENJSON call, including properties from nested sub-documents (so it's a flexible projection API).

roji avatar Nov 19 '22 18:11 roji

Note from triage: consider using the size facet to determine whether to use OpenJson or not. If max, use it, if not, don't.

ajcvickers avatar Nov 24 '22 10:11 ajcvickers

Turns out I mis-read the docs - JSON_VALUE doesn't truncate values over 4000, it returns null (in lax mode) or throws (in strict mode, which we're not using) - docs.

We have #30981 for possibly switching to OPENJSON instead of JSON_VALUE; if we don't do that, we may want to switch to strict mode.

roji avatar Jun 08 '23 17:06 roji

An improvement to SQL Server's JSON_VALUE() may be coming, where a RETURNING clause would allow specifying the returned type, with the conversion happening inside JSON_VALUE(). This would be both more efficient and resolve the above issue.

roji avatar Apr 27 '24 13:04 roji