kin-openapi
kin-openapi copied to clipboard
Recursive definitions causes the parser to run indefinitely
We are parsing a bunch of 3rd party openapi specs, not all are perfectly valid.. We came across the following example today:
"ReplicationSnapshotInfo": {
"type": "object",
"required": [
"snappableId",
"snapshotId"
],
.....
"childSnapshotInfos": {
"type": "array",
"description": "An array of child snapshots information.",
"items": {
"$ref": "#/definitions/ReplicationSnapshotInfo"
}
}
}
},
Where ReplicationSnapshotInfo
is defined in terms of itself.
(See the original file here https://rubrikinc.github.io/api-doc-internal-6.0/openapi.json, line 28681, to reproduce in context)
When childSnapshotInfos
was removed from this section, the parser works as expected, when its there, it goes into an infinite recursion and never halts.
I think it shouldn't be difficult for the parser to recognize such recursion by keeping a "visited" set, and checking that we are not recursing into something we already recursed in higher up the stack. Otherwise, as a plain user of this client, I don't know how else to avoid this, short of making a simplified parser that only looks for these loopy definitions..
Thing is that this runs on its own on many such files, so we can't manually inspect each (we couldn't even if we wanted to), and it just freezes our process without any way to recover.
This may be related to https://github.com/getkin/kin-openapi/issues/542
@shlomi-dr how would you expect this to be handled? An error returned that recursion was detected, so at least you can skip these (instead of having your process hang) or it to be handled in a particular way so you can actually work with the file?
Hi @TristanSpeakEasy, thanks for taking this up! What I did in my own little "fix" for this (https://github.com/getkin/kin-openapi/pull/571/files) is to just return an error, which is sufficient for my use-case.
If we want to be more "optimistic" about this, we could perhaps drop these "bad segments" and try to parse the rest of the file, but that may cause other troubles or inconsistencies further down the line, so perhaps this could be an option to the parser ("ErrorOnDeepRecursion" or something)
@shlomi-dr yeah in the PR you will see linked above I took the error approach for now as at list it gives you a way of being able to recover from the issue.
Technically there are some legitimate use cases for recursion like this and the complexity comes in trying to detect those and then determining how to deal with them but I think that is probably a problem for another time
Thanks @TristanSpeakEasy, checked out your PR and it looks great. Only note I may add is to optionally make the recursion depth a settable option with a default of 3.
@shlomi-dr Oh do you need to set this to a value greater than 3
and if yes can you share that openapi document?
@fenollp I don't know that I do right now. In my silly PR, I made a "test case" which process the original file that caused the problem, so as long as that same test passes with this PR we should be good (I can try that at some point).
I made the comment about adding that knob just out of experience, where processing random swaggers can really leads to odd failure cases, and being able to deal with them without the hurdle of recompiling, adding replace directives etc is just much easier.
I truly appreciate how caring and responsive you guys are, I use lots of different libraries for my projects, and only a handful really take the user's input into consideration as you do. Thank you!! ❤️
Thanks for the kind words! :)
as long as that same test passes
They do. Also #607 has your OpenAPI document but minimized (thanks @TristanSpeakEasy for that!).