ts-json-schema-generator
Multiple Instances of a type name
Past issues indicate this is a recurring problem for others (and me!).
Any fix will have to be a bit intrusive and will involve some opinionated design choices. Hopefully this issue will serve to discuss possible solutions.
Occurrence of "multiple instances" in a schema
Multiple instances occur when two or more types that end up in the schema share the same exported name. This can happen when the same type name is exported from two different files. I have encountered three cases of this in valid TypeScript programs:
Case 1: Chained inheritance:
// base.ts
export interface MyObject {
    a: string;
}
// intermediate.ts
import * as Base from "./base";
export interface MyObject extends Base.MyObject {
    b: string;
}
// main.ts
import * as Intermediate from "./intermediate";
export interface MyObject extends Intermediate.MyObject {
    c: string;
}
Case 2: Composition:
// componentA.ts
export interface MyObject {
    a: string;
}
// componentB.ts
export interface MyObject {
    b: string;
}
// main.ts
import * as A from "./componentA";
import * as B from "./componentB";
export interface MyObject {
    a: A.MyObject;
    b: B.MyObject;
}
Case 3:
The duplicates test case. This is in principle the same as case 2, but a different example, so worth testing for.
Root cause
These are all valid TypeScript programs that should produce valid JSON schemas, but our favorite schema generator barfs. As best I can understand, this is because the generator identifies each Type only by its "name" within the file where it is defined, and loses the context of the file path. Within the TypeScript AST these are independent nodes, each bound to a sourceFile, which allows for disambiguation when necessary. Since the Type constructors do not store the node, we lose this ability at the point of generation.
Importantly, we only need the "fully qualified name" in case of a conflict. The "simple name" should suffice in the vast majority of cases.
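To make the "only on conflict" point concrete, here is a minimal sketch of how colliding names could be detected. The TypeEntry shape, the id strings, and the findCollidingNames helper are all hypothetical and not taken from the generator's code base; the only assumption is that each declaration has some identifier that is unique per source file.

```typescript
// Hypothetical shape: `id` stands for any identifier that is unique per
// declaration (for example, derived from the source file), `name` is the
// simple exported name.
interface TypeEntry {
    id: string;
    name: string;
}

// Returns only the names that are claimed by more than one distinct id.
// Names with a single id can keep their simple name.
function findCollidingNames(entries: TypeEntry[]): Map<string, string[]> {
    const idsByName = new Map<string, string[]>();
    for (const entry of entries) {
        const ids = idsByName.get(entry.name) ?? [];
        if (ids.indexOf(entry.id) === -1) {
            ids.push(entry.id);
        }
        idsByName.set(entry.name, ids);
    }

    const collisions = new Map<string, string[]>();
    idsByName.forEach((ids, name) => {
        if (ids.length > 1) {
            collisions.set(name, ids);
        }
    });
    return collisions;
}
```

Only the entries returned here would need a qualified name; everything else is left untouched.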
Possible Solution:
(References the POC implementation)
- Use getId() instead of getName() to generate all references initially (DefinitionTypeFormatter & ReferenceTypeFormatter).
- Build a schema using these, but also create an idNameMap, which maps the id to its unambiguousName.
- The unambiguousName is identical to getName() when there is no conflict, and otherwise uses the smallest possible prefix computed from sourceFileName deltas between all collisions. RootTypes grab the getName().
- The schema is constructed as before, removing undefined and unreachable definitions. Once done, a resolveIdRefs recursive walk uses the idNameMap to fix the schema up.
(if this sounds complicated - a proof-of-concept PR is coming right behind the issue being filed)
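Until the PR lands, the final fix-up step can be sketched roughly as follows. This is not the POC code: the names resolveIdRefs and idNameMap come from the description above, but the Json types, the rewriteRefs helper, and the assumption that all references have the form "#/definitions/<id>" (with no URI encoding) are mine.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

// Assumption: every reference produced in the first pass looks like "#/definitions/<id>".
const REF_PREFIX = "#/definitions/";

function isJsonObject(value: Json | undefined): value is JsonObject {
    return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively copies the schema, replacing the id in every "$ref" string
// with its unambiguous name. Ids missing from the map are left as they are.
function rewriteRefs(node: Json, idNameMap: Map<string, string>): Json {
    if (Array.isArray(node)) {
        return node.map((item) => rewriteRefs(item, idNameMap));
    }
    if (!isJsonObject(node)) {
        return node;
    }
    const result: JsonObject = {};
    for (const key of Object.keys(node)) {
        const value = node[key] as Json;
        if (key === "$ref" && typeof value === "string" && value.indexOf(REF_PREFIX) === 0) {
            const id = value.slice(REF_PREFIX.length);
            result[key] = REF_PREFIX + (idNameMap.get(id) ?? id);
        } else {
            result[key] = rewriteRefs(value, idNameMap);
        }
    }
    return result;
}

// Rewrites all references, then renames the keys of the top-level
// "definitions" object from ids to unambiguous names.
function resolveIdRefs(schema: JsonObject, idNameMap: Map<string, string>): JsonObject {
    const resolved = rewriteRefs(schema, idNameMap) as JsonObject;
    const definitions = resolved["definitions"];
    if (isJsonObject(definitions)) {
        const renamed: JsonObject = {};
        for (const id of Object.keys(definitions)) {
            renamed[idNameMap.get(id) ?? id] = definitions[id] as Json;
        }
        resolved["definitions"] = renamed;
    }
    return resolved;
}
```

The sketch returns a copy rather than mutating in place, and it does not special-case literal values (for example inside "const" or "default") that happen to contain a "$ref" key; a real implementation would need to decide how to treat those.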
Opinionated parts:
- Disambiguation segment: This should be the smallest possible string that allows for proper disambiguation and makes sense to the authors/users of the TypeScript code/schema. One option would be to use the import path that would be needed. However, this will often include a trailing index.ts, which is superfluous for our purpose. Given conflicting names, I'd like to propose removing the common prefixes and any trailing index.ts to arrive at the disambiguation string.
- Path separator: Since JSON Schema and all related tooling are built around JSON Pointer, using a "/" will cause all kinds of downstream trouble when using these schemas. I'd like to propose using "-", which is URL safe, easy on the humans, and doesn't conflict with TypeScript variable naming conventions.
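A rough sketch of how the two choices above could combine is shown below. The function names (toSegments, disambiguationPrefixes, qualifyName) are hypothetical, and the logic is a simplification: it strips the file extension and a trailing index, drops the leading path segments shared by all colliding files, and joins what remains with "-". It does not try to trim shared trailing segments, so the result is not always the smallest possible string.

```typescript
// Splits a source file path into segments, without the extension and
// without a trailing "index" segment.
function toSegments(sourceFile: string): string[] {
    const segments = sourceFile
        .replace(/\\/g, "/")
        .replace(/(\.d)?\.tsx?$/, "")
        .split("/")
        .filter((segment) => segment !== "" && segment !== ".");
    if (segments.length > 1 && segments[segments.length - 1] === "index") {
        segments.pop();
    }
    return segments;
}

// For a group of files that export the same type name, returns one prefix
// per file: the path segments left after removing the common leading part,
// joined with "-". A prefix may be empty if nothing is left.
function disambiguationPrefixes(sourceFiles: string[]): string[] {
    const segmentLists = sourceFiles.map((file) => toSegments(file));
    const first = segmentLists[0] ?? [];
    let common = 0;
    while (
        common < first.length &&
        segmentLists.every((segments) => segments[common] === first[common])
    ) {
        common++;
    }
    return segmentLists.map((segments) => segments.slice(common).join("-"));
}

// Combines a prefix with the simple type name, e.g. "componentA-MyObject".
function qualifyName(prefix: string, name: string): string {
    return prefix === "" ? name : `${prefix}-${name}`;
}
```

For example, under these assumptions the files src/componentA.ts and src/componentB.ts would give the prefixes componentA and componentB, matching the definition names in the composition example.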
Examples:
- Case #1 from above, duplicates-inheritance yields
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$ref": "#/definitions/MyObject",
  "definitions": {
    "MyObject": {
      "type": "object",
      "required": [
        "a",
        "b",
        "c"
      ],
      "properties": {
        "a": {
          "type": "string"
        },
        "b": {
          "type": "string"
        },
        "c": {
          "type": "string"
        }
      },
      "additionalProperties": false
    }
  }
}
- Case #2 from above, duplicates-composition yields
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$ref": "#/definitions/MyObject",
  "definitions": {
    "MyObject": {
      "type": "object",
      "required": [
        "a",
        "b"
      ],
      "properties": {
        "a": {
          "$ref": "#/definitions/componentA-MyObject"
        },
        "b": {
          "$ref": "#/definitions/componentB-MyObject"
        }
      },
      "additionalProperties": false
    },
    "componentA-MyObject": {
      "type": "object",
      "required": [
        "a"
      ],
      "properties": {
        "a": {
          "type": "string"
        }
      },
      "additionalProperties": false
    },
    "componentB-MyObject": {
      "type": "object",
      "required": [
        "b"
      ],
      "properties": {
        "b": {
          "type": "string"
        }
      },
      "additionalProperties": false
    }
  }
}
- Case #3 duplicates yields:
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$ref": "#/definitions/MyType",
  "definitions": {
    "MyType": {
      "anyOf": [
        {
          "$ref": "#/definitions/import1-A"
        },
        {
          "$ref": "#/definitions/import2-A"
        }
      ]
    },
    "import1-A": {
      "type": "number"
    },
    "import2-A": {
      "type": "string"
    }
  }
}
Cons
- This likely has a slight performance hit, since we walk the schema one more time as a post-processing step. But that is no different from the walk performed by removeUnreachable.
Pros
- We'll generate schemas for a larger subset of TypeScript programs.
- Since we bind to the filename, reuse of definitions should work irrespective of how they are aliased at the point of use in the TypeScript files.
Has there been any progress on something like this? The library is not usable for large projects because simple enums/aliases with common names like Data, Result, and other common strings are "taken".
Sometimes it's possible to consolidate, but not always.
What is required to make progress on this?
Saying it's unusable is not correct. I use it for Vega-Lite and mosaic. For the latter, the json schema is like 8mb so pretty huge.
But I agree that it would be nice to not choke on duplicates. Unfortunately, json schema has a flat namespace. So the only way I see would be to add some prefix/postfix to names when there are duplicates. I don't know exactly what that would entail and you'd need to dig into the code base yourself.
Yep, sorry for the hyperbole. It's unusable on my personal big repo :).
I will take a look at how I might add src information as a prefix.
Could be src but could also just be a counter if you are okay with implementing some kind of duplicate detection. Path prefixes will work without that but will make the schema ugly (which may be okay).
I actually thought we already had something like the former. Maybe I misremembered from another project or it was for anonymous types.