dscr
Revisit structure of output parsers
The way that output parsers are implemented, via a Sys.glob call that changes outputs to other outputs, is quite complicated. The logic of the execution engine is very convoluted, and it may continue to block the implementation of parallelized run_output_parsers or run_scores as well as a correct/safe reset_output_parsers.
It might be worth revisiting the goals and desired functionality of output parsers and to re-implement them in a more careful/rigorous way that gives the overall dsc workflow an execution path that is easier to introspect.
@stephens999 - one question about this: you mentioned that currently run_output_parsers can implement a whole pipeline for a single method and that it might be okay to limit ourselves to a single output parser. Do you, or any users you know of, actually use multi-step output parsers currently?
Multiple, yes, but not building on one another sequentially. I.e., we have examples where output a gets parsed a to b and a to c, but not a to b to c.
Mostly it has been useful so you can simply dump the whole output of an expensive method, and worry about which bits you actually need later.
Excellent, got it, thanks much. If we rule out the "a to b to c" case, we can have each method "own" a set of (zero or more) output parsers the way that each scenario "owns" a set of (one or more) seeds, even if those output parsers are configured in an indirect way (via your "type" system). I am currently trying to create a stricter output parser system where we can query the structure of the execution graph in advance (from the dsc object) while maintaining identical or similar behavior on existing use-cases.
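To make the "each method owns its parsers" idea concrete, here is a rough sketch of the shape I have in mind. It is written in Python purely for illustration (dscr itself is R), and every name in it (OutputParser, Method, parser_edges, run_parsers) is hypothetical rather than part of the current dscr API. The point is only that, if parsers may read nothing but the method's raw output, the full set of edges can be listed from the configuration alone, before anything runs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class OutputParser:
    """Hypothetical parser: maps a method's raw output to one named output type."""
    name: str
    output_type: str
    fn: Callable[[dict], dict]


@dataclass
class Method:
    """Hypothetical method that owns zero or more parsers of its own raw output."""
    name: str
    raw_type: str
    parsers: List[OutputParser] = field(default_factory=list)

    def add_parser(self, parser: OutputParser) -> None:
        # Parsers only ever read the raw output, so "a to b to c" chains
        # cannot be expressed; duplicate target types are rejected up front.
        if parser.output_type == self.raw_type:
            raise ValueError("a parser must produce a new output type")
        if any(p.output_type == parser.output_type for p in self.parsers):
            raise ValueError(f"duplicate output type: {parser.output_type}")
        self.parsers.append(parser)


def parser_edges(methods: List[Method]) -> List[Tuple[str, str, str, str]]:
    """List every (method, raw_type, parser, output_type) edge without running anything."""
    return [(m.name, m.raw_type, p.name, p.output_type)
            for m in methods for p in m.parsers]


def run_parsers(method: Method, raw_output: dict) -> Dict[str, dict]:
    """Apply each owned parser independently to the same raw output."""
    return {p.output_type: p.fn(raw_output) for p in method.parsers}


# Example: one expensive method whose output "a" is parsed a to b and a to c.
fit = Method("expensive_fit", raw_type="a")
fit.add_parser(OutputParser("a2b", "b", lambda out: {"mean": out["mean"]}))
fit.add_parser(OutputParser("a2c", "c", lambda out: {"sd": out["sd"]}))

edges = parser_edges([fit])  # known in advance, nothing has been executed
parsed = run_parsers(fit, {"mean": 1.0, "sd": 2.0, "extra": "ignored"})
```

Because each parser depends only on the raw output, the parsers for a given method are independent of one another, which is what would make a parallel run or a safe reset straightforward in a design like this.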