fast-json-stringify icon indicating copy to clipboard operation
fast-json-stringify copied to clipboard

Add support for ES6 Set

Open TommyDew42 opened this issue 3 years ago • 28 comments

What this PR does

An issue from #301.

We add support for Set in this PR only. A new Set([1, 2, 3, 4]) will be stringified to [1,2,3,4]

Unit tests:

  • a set of one type, for each primitive type, bigint & date-time.
  • a set of mixed types
  • an empty set
  • a nested set
  • a set as a constant
  • an object with toJSON and a set
  • if-then-else
  • array keyword like maxItems, contains, minContains

PR for Map

Working on Map now. Instead of having map and set feature in one PR, we have a separate PR for Map later?

Checklist

TommyDew42 avatar Sep 13 '22 16:09 TommyDew42

Please make sure that validation works with Map, Set. By validation I mean cases when we've got type property as an array, oneOf, anyOf, if/then/else schema.

ivan-tymoshenko avatar Sep 13 '22 16:09 ivan-tymoshenko

Thanks! Let me add unit tests for them tmr! @ivan-tymoshenko

TommyDew42 avatar Sep 13 '22 16:09 TommyDew42

Thanks! Let me add unit tests for them tmr! @ivan-tymoshenko

It wouldn't be just test cases. You will have to make Ajv works with Map and Set as an 'object' type. If you have questions ask me I will tell you where to loot at. But it's not so easy to add support of two new types. I'm not sure if it worth it.

ivan-tymoshenko avatar Sep 13 '22 16:09 ivan-tymoshenko

🤔 I see... we have to update build-schema-validator.js. How does the validation works in a high level understanding?

TommyDew42 avatar Sep 13 '22 16:09 TommyDew42

You need to make this test works.

test('should stringify Set as an array', (t) => {
  t.plan(2)

  const schema = {
    anyOf: [
      { type: 'array', items: { type: 'string' } },
      { type: 'object', properties: { foo: { type: 'string' } } }
    ]
  }

  const stringify = build(schema)
  t.equal(stringify(new Set(['foo', 'bar'])), '["foo","bar"]')
  t.equal(stringify({ foo: 'bar' }), '{"foo":"bar"}')
})

ivan-tymoshenko avatar Sep 13 '22 16:09 ivan-tymoshenko

I'm working on the validation and have a new idea. Maybe it's easier to change the type in the schema from array to set:

const schema = {
  title: 'a set of booleans',
  type: 'array',
  items: { type: 'boolean' }
},
const data = new Set([false, true]),

So we can internally convert the type in the validator to object. The conversion is similar to what we have been doing with fjs_type here.

Of course we can also have the conversion if we keep the type as what we are doing in this PR: the type is simply array. But the conversion would be to ['array', 'object'] since we don't know if it would be a set or array when the type being array.

TommyDew42 avatar Sep 17 '22 20:09 TommyDew42

@ivan-tymoshenko @mcollina @Eomm Wanna get you guys thoughts on using set as the type declaration on the schema instead of array

TommyDew42 avatar Sep 19 '22 16:09 TommyDew42

Going to implement this with set instead of array as the type declaration in the schema if no objection from you guys. @ivan-tymoshenko @mcollina @Eomm

TommyDew42 avatar Sep 24 '22 17:09 TommyDew42

Going to implement this with set instead of array as the type declaration in the schema if no objection from you guys.

@ivan-tymoshenko @mcollina @Eomm

I haven’t checked it yet. I will have some time tomorrow. We can discuss it.

ivan-tymoshenko avatar Sep 24 '22 17:09 ivan-tymoshenko

If i understand this correctly you want to implement a new type. Something like

{ 
  type: "set"
}

instead of

{
  type: "array"
}

Tbh i dont agree. type: "set" is not a valid json schema 7 type. The whole scope of this ES6 support is to implicitly serialize an ES6 Set as an Array. Keep in mind there is no way to deserialize a jsonstring to a Set. So there is information loss. But for that cases you would define a reviver for JSON.parse, which would revive the Set from the Array.

So again: I totally disagree with adding a non-standard json-schema type.

Uzlopak avatar Sep 24 '22 18:09 Uzlopak

So again: I totally disagree with adding a non-standard json-schema type.

Yes, I'm with you.

This would break everything:

  • swagger
  • open api
  • fluent-schema and every JSON schema builder

It would not add any value to the user

Eomm avatar Sep 25 '22 07:09 Eomm

Thanks for reminding me of json schema 7 type! As what Ivan said, it's not so easy to add support of two new types validation. The tricky thing is that the validator treats Set as an object while we want it to act like an array.

I take a look at json schema 7 type and noticed there's an uniqueItems keyword. I don't think we are using the keyword:

const stringify = build({ type: 'array', uniqueItems: true })
console.log(stringify([1, 1, 2])) // "[1,1,2]"

Would using { type: 'array', uniqueItems: true } to indicate Set be a better idea?

TommyDew42 avatar Sep 25 '22 16:09 TommyDew42

What is the purpose to "indicate" a Set when uniqueItems is set to true? What is there to indicate?

Uzlopak avatar Sep 25 '22 21:09 Uzlopak

the uniqueItems option has been added in Draf 8 version of the JSON Schema. So we can't relay on it.

I thought the issue here was:

When a user wants to serialize a Set in a type: Array field, the Set will be serialized as a primitive Array.

So, I don't get the point where the user must declare that it wants to support the Set. It should be a transparent feature/fix to me supporting the Set object.

Am I wrong?

Eomm avatar Sep 26 '22 06:09 Eomm

Thanks for @Uzlopak and @Eomm's input! The reason that I want the user to indicate they are stringifying a set is simple. I was struggling with the set validation and the indication helps with the implementation 😿.

But luckily I realised we can just follow how we are dealing with fjs_type for set validation. Two tests for anyOf and oneOf was added in 7276091b57828b715ec56710aceec50d1f64bcd7

There's one limitation tho. The if-else-then doesn't work on Set:

t.test('if/else with set', (t) => {
  t.plan(2)

  const schema = {
    type: 'array',
    if: { type: 'array', maxItems: 1 },
    then: { items: { type: 'string' } },
    else: { items: { type: 'number' } }
  }

  const stringify = build(schema)

  t.equal(stringify(new Set(['1'])), JSON.stringify(['1']))
  t.equal(stringify(new Set(['1', '2'])), JSON.stringify([1, 2])) // fail and stringified to '["1","2"]'
})

I am not sure the simple way to solve this... 🥵 Shall we simply write down the limitation?

TommyDew42 avatar Sep 27 '22 16:09 TommyDew42

@ivan-tymoshenko We need your skills here

Uzlopak avatar Sep 27 '22 17:09 Uzlopak

You need one more check

diff --git a/index.js b/index.js
index 8939e92..275e9ab 100644
--- a/index.js
+++ b/index.js
@@ -475,6 +475,9 @@ function addIfThenElse (location, input) {
   elseLocation.schema = merge(schema, elseSchema)
 
   return `
+    if (${input} instanceof Set) {
+      ${input} = Array.from(${input})
+    }
     if (validator.validate("${ifSchemaRef}", ${input})) {
       ${buildValue(thenLocation, input)}
     } else {

Eomm avatar Sep 27 '22 19:09 Eomm

Commit a400d84010a94288abe0f3397032ea2ab125fa64 will allow us to do all array validations on Set. Unit tests in 8c1d990c9e52e59153495d89933c663f43bda09b.

But there's one more limitation tho. It doesn't work when both object and array are in schema type: { type: ['object', 'array'] }

TommyDew42 avatar Sep 28 '22 16:09 TommyDew42

But I found a potential bug if I understand if-then-else logic correctly. It seems that error is because the anyof validator.validate receive { a:[] } instead of the object after modified with const value

test('anyOf with string and array', (t) => {
  t.plan(2)

  const schema = {
    anyOf: [
      { type: 'string' },
      {
        type: 'object',
        properties: {
          a: {
            type: 'array',
            if: { items: { type: 'number' }, minItems: 1 },
            then: { items: { type: 'number' } },
            else: { const: ['const item'] }
          }
        }
      }
    ]
  }

  const stringify = build(schema)

  t.equal(stringify('foo'), '"foo"') // ok
  t.equal(stringify({ a: [1, 2, 3] }), JSON.stringify({ a: [1, 2, 3] })) // ok
  t.equal(
    stringify({ a: [] }),
    JSON.stringify({ a: ['const item'] })
  ) // throw error: The value {"a":[]} does not match schema definition.
})

TommyDew42 avatar Sep 28 '22 16:09 TommyDew42

Shouldnt we just convert the Set to an Array and then the normal array validation takes over?

I have the feeling the current implementation is overcomplicated and does not what it wants to achieve. E.g. why do i need a keyword isSet (despite it should be named fjs_set or so) if I convert everything to an array anyway. Ajv can not iterate over the Set anyway.

Or what do you think?

Uzlopak avatar Sep 29 '22 08:09 Uzlopak

Do you mean turning all the set in the object passed in? That’s right… this should be a way simpler solution. Would it have performance issue? We are traversing the entire object.

I can try that.

TommyDew42 avatar Sep 29 '22 11:09 TommyDew42

if I convert everything to an array anyway

That is needed because we use ajv to apply the oneOf and anyOf validations. So that code is needed by ajv.

The other statements are needed to manage the code branches we generate through the building phase of the serialization function.

I have the feeling the current implementation is overcomplicated

The solution is not generic, but it is complete and simple. Unfortunately, we have many use cases (const, any/oneOf, plain schemas without root object)

I'm OK with this approach. Other refactors should be on us IMHO

Eomm avatar Sep 29 '22 19:09 Eomm

The solution is not generic, but it is complete and simple. Unfortunately, we have many use cases (const, any/oneOf, plain schemas without root object)

I didn't get this point. At the beginning of this issue, I said that adding a new unsupported type wouldn't be easy. I'm not ok with adding only the serialization part without validation.

ivan-tymoshenko avatar Sep 29 '22 19:09 ivan-tymoshenko

Do you mean turning all the set in the object passed in? That’s right… this should be a way simpler solution. Would it have performance issue? We are traversing the entire object.

I can try that.

If you want to make a simple solution, it will work. Of course, it will have some performance drop. Plus, remember that you shouldn't modify the original object. This means you will have to clone some part of it, which also takes some time.

ivan-tymoshenko avatar Sep 29 '22 19:09 ivan-tymoshenko

If you want to iterate over the Set etc.. I am also fine. Then implementing it needs to be a full solution and not something which works only partially.

Uzlopak avatar Sep 29 '22 20:09 Uzlopak

I guess what we are still missing in this PR is just Set validation with local reference. We have had all other features. Shall we try to get the validation with local reference first?

It seems to me turning all Set to array brings overhead to all cases where Set is not involved. And this might not be ideal if people don't have a frequent need to stringify Set. Maybe it's better to let people do the conversion from Set to array instead (?) But I'm not 100% sure about the overhead.

But feel free to let me know what your thoughts are. Even if you think the whole PR is unnecessary.

TommyDew42 avatar Oct 01 '22 02:10 TommyDew42

remember that you shouldn't modify the original object

Instead of replacing the value, we could add a symbol with the Array to access later

Eomm avatar Oct 01 '22 07:10 Eomm

Instead of replacing the value, we could add a symbol with the Array to access later

You still will have to make some checks during serialization/validation. If I understand your idea correctly, it won't help.

ivan-tymoshenko avatar Oct 01 '22 12:10 ivan-tymoshenko

I encountered some problems about cloning the object prototype methods when I tried to implement the convert-set-to-array solution. This seems to be a limitation/feature of our clone library (See davidmarkclements/rfdc#29).

I didn't expect this issue would be this complicated. Would like to take a break from it first. Feel free to tackle this when you are interested! 😸 Really appreciate the help from Ivan, Uzlopak & Manuel.

TommyDew42 avatar Oct 07 '22 15:10 TommyDew42