
Builders for engines and readers

Open reckart opened this issue 1 year ago • 0 comments

Is your feature request related to a problem? Please describe. We have the createEngineDescription and related factory methods in uimaFIT. However, their parameters can be a bit confusing. For simple cases, we pass a class followed by the parameter/value combinations as pairs. However, if we want to add a type system, type priorities, or other metadata, it either becomes fragile, because these items are easily intermixed with the parameters by accident, or it is plainly impossible because no createEngineDescription signature accepting the respective item exists.

Describe the solution you'd like It would be nice to have a builder which would allow stuff like this:

var engineDescription = AnalysisEngineDescription.builder(MyAnalysisEngine.class) //
    .withTypeSystem(TypeSystemDescriptionFactory.createTypeSystemDescription()) // can probably be omitted in most cases
    .withParameter(MyAnalysisEngine.PARAM_BLAH, "blub") // single parameter
    .withParameters( // multiples as pairs because it is convenient to not have to repeat "withParameter" all the time
         MyAnalysisEngine.PARAM_FOO, "foo", //
         MyAnalysisEngine.PARAM_BAR, "bar")
    .withTypePriorities(...) //
    .build();
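To make the idea a bit more concrete, here is a minimal, self-contained sketch of how such a builder could keep parameters apart from the other items. Everything in it is an assumption for illustration: SketchEngineDescription and SketchEngineDescriptionBuilder are hypothetical stand-ins, not existing UIMA or uimaFIT classes, and the type system is modelled as a plain Object. The sketch validates the pair-style arguments early and never scans for a type system on its own.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for an engine description; not a UIMA or uimaFIT class.
final class SketchEngineDescription {
    final Class<?> componentClass;
    final Map<String, Object> parameters;
    final Object typeSystem; // null means "none set"; nothing is scanned implicitly

    SketchEngineDescription(Class<?> componentClass, Map<String, Object> parameters,
            Object typeSystem) {
        this.componentClass = componentClass;
        // Defensive copy so later builder calls cannot change a built description.
        this.parameters = Collections.unmodifiableMap(new LinkedHashMap<>(parameters));
        this.typeSystem = typeSystem;
    }
}

// Hypothetical builder: parameters and the type system travel through separate methods.
final class SketchEngineDescriptionBuilder {
    private final Class<?> componentClass;
    private final Map<String, Object> parameters = new LinkedHashMap<>();
    private Object typeSystem;

    SketchEngineDescriptionBuilder(Class<?> componentClass) {
        this.componentClass = componentClass;
    }

    SketchEngineDescriptionBuilder withTypeSystem(Object typeSystem) {
        this.typeSystem = typeSystem;
        return this;
    }

    SketchEngineDescriptionBuilder withParameter(String name, Object value) {
        parameters.put(name, value);
        return this;
    }

    // Accepts name/value pairs and fails fast on malformed input.
    SketchEngineDescriptionBuilder withParameters(Object... pairs) {
        if (pairs.length % 2 != 0) {
            throw new IllegalArgumentException("Parameters must be given as name/value pairs");
        }
        for (int i = 0; i < pairs.length; i += 2) {
            if (!(pairs[i] instanceof String)) {
                throw new IllegalArgumentException("Parameter name at index " + i
                        + " is not a String");
            }
            withParameter((String) pairs[i], pairs[i + 1]);
        }
        return this;
    }

    SketchEngineDescription build() {
        return new SketchEngineDescription(componentClass, parameters, typeSystem);
    }
}
```

Since only withParameters takes varargs, a type system or type priorities object can no longer end up in the middle of the parameter pairs by accident; a non-String name is rejected right at the call site.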

Describe alternatives you've considered Instead of a normal builder pattern, a customizer pattern could also be used. That might make working with nested elements in the description more convenient. E.g.

var engineDescription = AnalysisEngineDescription.builder(MyAnalysisEngine.class)
    .metadata(md -> md
        .name("My Analysis Engine")
        .vendor("ACME")
        .typeSystem(TypeSystemDescriptionFactory.createTypeSystemDescription()))
    .parameters(params -> params
        .set(MyAnalysisEngine.PARAM_FOO, "foo")
        .set(MyAnalysisEngine.PARAM_BAR, "bar"))
    .build();
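A rough sketch of how the customizer variant could work internally, again purely as an assumption: the builder owns the nested objects and hands them to java.util.function.Consumer callbacks. All class names here (SketchMetadata, SketchParameters, SketchCustomizerBuilder, SketchCustomizedDescription) are hypothetical and do not exist in UIMA or uimaFIT.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical nested metadata element with fluent setters.
final class SketchMetadata {
    String name;
    String vendor;

    SketchMetadata name(String name) {
        this.name = name;
        return this;
    }

    SketchMetadata vendor(String vendor) {
        this.vendor = vendor;
        return this;
    }
}

// Hypothetical nested parameter container.
final class SketchParameters {
    final Map<String, Object> values = new LinkedHashMap<>();

    SketchParameters set(String name, Object value) {
        values.put(name, value);
        return this;
    }
}

// Hypothetical result object holding a snapshot of what was configured.
final class SketchCustomizedDescription {
    final String name;
    final String vendor;
    final Map<String, Object> parameters;

    SketchCustomizedDescription(SketchMetadata metadata, SketchParameters parameters) {
        this.name = metadata.name;
        this.vendor = metadata.vendor;
        this.parameters = Collections.unmodifiableMap(new LinkedHashMap<>(parameters.values));
    }
}

// The builder owns the nested elements and passes them to the customizers.
final class SketchCustomizerBuilder {
    private final SketchMetadata metadata = new SketchMetadata();
    private final SketchParameters parameters = new SketchParameters();

    SketchCustomizerBuilder metadata(Consumer<SketchMetadata> customizer) {
        customizer.accept(metadata);
        return this;
    }

    SketchCustomizerBuilder parameters(Consumer<SketchParameters> customizer) {
        customizer.accept(parameters);
        return this;
    }

    SketchCustomizedDescription build() {
        return new SketchCustomizedDescription(metadata, parameters);
    }
}
```

With this shape, a call such as new SketchCustomizerBuilder().metadata(md -> md.name("My Analysis Engine").vendor("ACME")) compiles because a method-invocation lambda is a valid Consumer body; the value returned by the fluent setters is simply discarded.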

Additional context Important: the new approach should not auto-scan for type system descriptions or similar metadata. Scanning can be slow in certain environments, and doing it for every analysis engine etc. is not necessary. If a CAS needs to be created with a scanned type system, CasFactory.createCas() should be used. It is sufficient if the CAS knows the type system. It should not be necessary for each and every component to know it (unless you build a pipeline from a bunch of components that each come with their own local partial type system, which then needs to be merged into the pipeline type system).

reckart, Dec 12 '24 11:12