go-xml
xsdgen: type flattening routines are slow and incorrect
The xsdgen package flattens the XSD type hierarchy to produce nicer code.
a → b → c → anyType
becomes
a → anyType
b → anyType
c → anyType
Practically speaking, instead of generating code like this:
type MyString string
type AllCapitalString MyString
type ShortAllCapitalString AllCapitalString
It generates code like this:
type MyString string
type AllCapitalString string
type ShortAllCapitalString string
This is done in the Config.flatten method. There are so many problems with this function, I don't know where to start.
- It's slow -- it repeatedly recurses into the type hierarchy, even if all types have similar ancestry.
- It's expensive -- it builds up a big slice of xsd.Type items that usually contains lots of duplicates.
- It's complicated -- not only does it flatten types, it opportunistically elides types that it deems useless, filters types not in the whitelist, unpacks struct types when it can, and who knows what else.
- It's wrong -- see #14, where it does not correctly collect all the types needed for a given subset of types.
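To make the last point concrete: collecting every type needed for a subset amounts to taking a transitive closure over "refers to" relationships. The sketch below is only an illustration of that idea, not the xsdgen code. The `deps` map, the `closure` function and the type names in `main` are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// closure returns the selected type names plus every type reachable from
// them. deps is a hypothetical map from a type name to the names of the
// types it refers to directly (base type, element types, attribute types).
func closure(deps map[string][]string, selected ...string) []string {
	seen := make(map[string]bool)
	var visit func(name string)
	visit = func(name string) {
		if seen[name] {
			return // already collected; this also guards against cycles
		}
		seen[name] = true
		for _, dep := range deps[name] {
			visit(dep)
		}
	}
	for _, name := range selected {
		visit(name)
	}
	// Sort the result so the output is deterministic.
	out := make([]string, 0, len(seen))
	for name := range seen {
		out = append(out, name)
	}
	sort.Strings(out)
	return out
}

func main() {
	// Made-up schema: Order needs Customer and Item, which need more types.
	deps := map[string][]string{
		"Order":    {"Customer", "Item"},
		"Customer": {"Address"},
		"Item":     {"Price"},
		"Invoice":  {"Order"},
	}
	fmt.Println(closure(deps, "Order"))
}
```

Asking for only `Order` still pulls in `Address` and `Price`, which are reachable only indirectly. `Invoice` is left out because nothing selected refers to it.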
The cfg.flatten and cfg.flatten1 functions need a redo, with the following goals:
- Do one transformation at a time, rather than all at once.
- Use memoization to skip flattening type hierarchies we've seen before.
- If necessary, extend and use the internal/dependency package's Graph type to model dependencies between derived types and the types of their attributes, elements and ancestors, to avoid missing indirectly required types.
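A rough sketch of the memoization goal, assuming a simplified model where each type has a single named base. `typeDef`, `flattener` and `rootOf` are hypothetical names, not the real xsd or xsdgen API, and the sketch assumes the hierarchy has no cycles.

```go
package main

import "fmt"

// typeDef is a hypothetical stand-in for an xsd.Type: a name plus the name
// of the type it derives from.
type typeDef struct {
	Name string
	Base string // empty for a root type
}

// flattener resolves each type to its root ancestor and caches the answer.
type flattener struct {
	types map[string]typeDef
	root  map[string]string // memo: type name -> root ancestor name
	calls int               // uncached resolutions, counted for illustration
}

func newFlattener(defs []typeDef) *flattener {
	f := &flattener{
		types: make(map[string]typeDef),
		root:  make(map[string]string),
	}
	for _, d := range defs {
		f.types[d.Name] = d
	}
	return f
}

// rootOf walks up the hierarchy from name. Every type visited on the way
// is cached, so a shared ancestor chain is only walked once.
// It assumes the hierarchy is acyclic.
func (f *flattener) rootOf(name string) string {
	if r, ok := f.root[name]; ok {
		return r
	}
	f.calls++
	r := name // unknown types and types without a base are their own root
	if d, ok := f.types[name]; ok && d.Base != "" {
		r = f.rootOf(d.Base)
	}
	f.root[name] = r
	return r
}

func main() {
	f := newFlattener([]typeDef{
		{Name: "a", Base: "b"},
		{Name: "b", Base: "c"},
		{Name: "c", Base: "anyType"},
	})
	for _, name := range []string{"a", "b", "c"} {
		fmt.Printf("%s -> %s\n", name, f.rootOf(name))
	}
	fmt.Println("uncached resolutions:", f.calls)
}
```

Resolving `a` caches `b` and `c` as a side effect, so the later lookups are map hits rather than fresh recursions. Keeping this step separate from filtering and eliding types would also fit the first goal of doing one transformation at a time.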