go-xml

xsdgen: type flattening routines are slow and incorrect

Open droyo opened this issue 8 years ago • 0 comments

The xsdgen package flattens the XSD type hierarchy to produce nicer code.

a → b → c → anyType

becomes

a → anyType
b → anyType
c → anyType

Practically speaking, instead of generating code like this:

type MyString string
type AllCapitalString MyString
type ShortAllCapitalString AllCapitalString

It generates code like this:

type MyString string
type AllCapitalString string
type ShortAllCapitalString string

This is done in the Config.flatten method. There are so many problems with this function, I don't know where to start.

  • It's slow -- it repeatedly recurses into the type hierarchy, even if all types have similar ancestry.
  • It's expensive -- it builds up a big slice of xsd.Type items that usually contains lots of duplicates
  • It's complicated -- not only does it flatten types, it opportunistically elides types that it deems useless, filters types not in the whitelist, unpacks struct types when it can, and who knows what else.
  • It's wrong -- see #14, where it fails to collect all the types needed for a given subset of types.
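To make the last point concrete: collecting every type needed for a subset amounts to taking the transitive closure over "refers to" edges. Below is a minimal sketch of that idea, not the xsdgen code. The `typeDeps` map and the `requiredTypes` function are hypothetical names invented for illustration; the real package works on xsd.Type values, not strings.

```go
package main

import (
	"fmt"
	"sort"
)

// typeDeps is a hypothetical stand-in for a parsed schema: it maps a type
// name to the names of the types it refers to directly (its base type plus
// the types of its elements and attributes).
type typeDeps map[string][]string

// requiredTypes returns, in sorted order, every type reachable from the
// requested names, including the ones that are only required indirectly.
func requiredTypes(deps typeDeps, want ...string) []string {
	seen := make(map[string]bool)
	stack := append([]string(nil), want...)
	for len(stack) > 0 {
		name := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if seen[name] {
			continue // already collected; this also guards against cycles
		}
		seen[name] = true
		stack = append(stack, deps[name]...)
	}
	out := make([]string, 0, len(seen))
	for name := range seen {
		out = append(out, name)
	}
	sort.Strings(out)
	return out
}

func main() {
	deps := typeDeps{
		"Order":    {"Customer", "LineItem"},
		"Customer": {"Address"},
		"LineItem": {"SKU"},
		"Invoice":  {"Order"},
	}
	// Asking for Order alone must also pull in Address and SKU.
	fmt.Println(requiredTypes(deps, "Order"))
	// prints [Address Customer LineItem Order SKU]
}
```

A whitelist filter that only looks at direct references would drop Address and SKU here, which is the kind of omission described above.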

The cfg.flatten and cfg.flatten1 functions need a redo, with the following goals:

  • Do one transformation at a time, rather than all at once.
  • Use memoization to skip flattening type hierarchies we've seen before.
  • If necessary, extend and use the internal/dependency package's Graph type to model dependencies between derived types and the types of their attributes, elements and ancestors, to avoid missing indirectly required types.
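As a rough illustration of the memoization goal, here is a sketch that resolves each type to its top-most ancestor and caches the answer for every type visited along the way. It assumes a simplified, hypothetical `schemaType` with a single `Base` pointer and an acyclic derivation chain. It is not the xsd.Type API, and `flattener` and `rootOf` are made-up names.

```go
package main

import "fmt"

// schemaType is a hypothetical, simplified stand-in for xsd.Type: a named
// type with a pointer to the type it derives from. A nil Base marks a root,
// such as a builtin.
type schemaType struct {
	Name string
	Base *schemaType
}

// flattener caches the root of every type it has already resolved.
type flattener struct {
	root  map[*schemaType]*schemaType
	steps int // uncached lookups, kept only to show the effect of the cache
}

func newFlattener() *flattener {
	return &flattener{root: make(map[*schemaType]*schemaType)}
}

// rootOf returns the top-most ancestor of t. Every type visited on the way
// up is cached, so a shared ancestry is walked only once.
// It assumes the derivation chain has no cycles.
func (f *flattener) rootOf(t *schemaType) *schemaType {
	if r, ok := f.root[t]; ok {
		return r
	}
	f.steps++
	r := t
	if t.Base != nil {
		r = f.rootOf(t.Base)
	}
	f.root[t] = r
	return r
}

func main() {
	str := &schemaType{Name: "string"}
	my := &schemaType{Name: "MyString", Base: str}
	caps := &schemaType{Name: "AllCapitalString", Base: my}
	short := &schemaType{Name: "ShortAllCapitalString", Base: caps}

	f := newFlattener()
	for _, t := range []*schemaType{my, caps, short} {
		fmt.Printf("type %s %s\n", t.Name, f.rootOf(t).Name)
	}
	fmt.Println("uncached steps:", f.steps) // prints "uncached steps: 4"
}
```

Without the cache, the same three lookups would take 2 + 3 + 4 = 9 steps, and the gap widens as more types share an ancestry. Keeping this pass limited to "find the root" also leaves eliding, whitelist filtering and struct unpacking to separate passes, in line with the first goal.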

droyo, Nov 16 '17 03:11