
Validate VGSL spec before loading training data

Open mittagessen opened this issue 2 years ago • 2 comments

Add a method to lib.vgsl.TorchVGSLModel that validates a (partial) spec by feeding a 'dummy' line/image, shaped according to the input specification, into it. This would allow aborting training before the training dataset is loaded when an invalid/unworkable spec is given to the KrakenTrainer constructors.
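
To illustrate the idea, here is a minimal sketch that is not kraken's API. It assumes a small VGSL subset (the input block, C* convolutions with 'same' padding, and Mp max pooling) and, rather than running a real tensor through a network, propagates the shape of a dummy input. The names validate_spec and DUMMY_SIZE are hypothetical, as is the choice of 400 as the size used for a variable dimension written as 0.

```python
import re

# Hypothetical helper, not kraken's API. It handles only a small VGSL subset
# (input block, C* convolutions, Mp max pooling) and propagates the shape of
# a dummy input instead of pushing a real tensor through a network.
DUMMY_SIZE = 400  # assumed stand-in for a variable dimension (written as 0)

INPUT_RE = re.compile(r"(\d+),(\d+),(\d+),(\d+)")
CONV_RE = re.compile(r"C[strlm](\d+),(\d+),(\d+)")
POOL_RE = re.compile(r"Mp(\d+),(\d+)")


def validate_spec(spec):
    """Return (height, width, channels) after the last layer of `spec`.

    Raises ValueError for a malformed spec or a collapsed dimension.
    """
    spec = spec.strip()
    if not (spec.startswith("[") and spec.endswith("]")):
        raise ValueError("spec must be enclosed in square brackets")
    blocks = spec[1:-1].split()
    if not blocks:
        raise ValueError("spec is empty")

    match = INPUT_RE.fullmatch(blocks[0])
    if match is None:
        raise ValueError(f"invalid input block: {blocks[0]!r}")
    _, height, width, channels = (int(g) for g in match.groups())
    if channels < 1:
        raise ValueError("input needs at least one channel")
    # Build the dummy input shape: variable dimensions get a fixed size.
    height = height or DUMMY_SIZE
    width = width or DUMMY_SIZE

    for block in blocks[1:]:
        if (match := CONV_RE.fullmatch(block)):
            k_y, k_x, out = (int(g) for g in match.groups())
            if min(k_y, k_x, out) < 1:
                raise ValueError(f"invalid convolution: {block!r}")
            channels = out  # assumes 'same' padding, so height/width are kept
        elif (match := POOL_RE.fullmatch(block)):
            p_y, p_x = (int(g) for g in match.groups())
            if min(p_y, p_x) < 1:
                raise ValueError(f"invalid pooling: {block!r}")
            height, width = height // p_y, width // p_x
            if min(height, width) < 1:
                raise ValueError(f"{block!r} reduces the input to nothing")
        else:
            raise ValueError(f"layer not covered by this sketch: {block!r}")
    return height, width, channels
```

Under these assumptions, a spec that pools a 4-pixel-high line three times by a factor of 2 is rejected immediately, with no dataset involved. A real implementation would presumably build the layers and run an actual dummy tensor through them.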

mittagessen, Oct 29 '21 14:10

Why should this require adding a method to lib.vgsl.TorchVGSLModel? In the KrakenTrainer constructors, immediately after adding the first element to the dataset, it is possible to feed the dataset to the network inside a try...except block and raise a specific error. That would require only a couple of lines of code.
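
A rough sketch of that approach, with hypothetical names throughout (InvalidSpecError, check_first_sample, and toy_model are not part of kraken). The model is any callable; a toy function stands in for the network here.

```python
class InvalidSpecError(Exception):
    """Hypothetical error for a spec that cannot process the training data."""


def check_first_sample(model, samples):
    """Run `model` on the first sample and report failures as InvalidSpecError."""
    first = next(iter(samples), None)
    if first is None:
        raise ValueError("dataset contains no samples")
    try:
        model(first)
    except Exception as exc:  # e.g. PyTorch usually raises RuntimeError on shape mismatches
        raise InvalidSpecError(f"spec cannot process the first sample: {exc}") from exc


# Toy stand-in for a network that needs lines at least 8 pixels high.
def toy_model(line):
    if len(line) < 8:
        raise RuntimeError("input too small for pooling layers")
    return len(line)
```

Calling check_first_sample(toy_model, [[0] * 4]) raises InvalidSpecError with the original RuntimeError attached as its cause, while a 16-element sample passes silently.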

anutkk, Nov 01 '21 10:11

A separate validator is preferable, as it would allow tools that use the API, like escriptorium, to validate a user-provided spec. In the KrakenTrainer object we'd have to instantiate the model twice: once without the output layer to validate, and once with it after having loaded the complete dataset so we can determine the codec alphabet. I'd like to avoid that, as the constructors are already annoyingly large, complex pieces of code and should really be slimmed down.
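
The ordering this implies could look roughly like the sketch below. Everything in it is hypothetical: prepare_training and its injected callables are placeholders rather than kraken functions, and the dataset is reduced to a list of transcription strings so that the alphabet can be derived in one line.

```python
def prepare_training(spec, validate, load_dataset, build_model):
    """Validate the partial spec first, then load data, then build the model once.

    All four arguments are placeholders: `validate` raises on an unworkable
    spec, `load_dataset` returns transcription strings, and `build_model`
    receives the spec together with the alphabet found in the data.
    """
    validate(spec)            # cheap check, also usable on its own by other tools
    dataset = load_dataset()  # expensive step, reached only for a workable spec
    alphabet = sorted({char for text in dataset for char in text})
    return build_model(spec, alphabet)  # single instantiation, output size known
```

Because the validator is a standalone callable, a front end can run it on a user-provided spec without going through the trainer at all.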

mittagessen, Nov 02 '21 10:11