flow-framework
flow-framework copied to clipboard
[REFACTOR] Eliminate duplication of required input / output List/Set
Is your feature request related to a problem?
The WorkflowValidatorTests class contains a lot of repetitive code that could be refactored to be more general.
This test class was originally written to validate the JSON file. Following #523 we now have an Enum and easier access to programmatically perform these tests, so it can be simplified.
What solution would you like?
The manual creation of a map here can be reduced to a single statement, which I've done as part of #530 .
The Input/Output size checks here could be improved.
Taking one entry as an example:
assertTrue(validator.getWorkflowStepValidators().keySet().contains("create_connector"));
assertEquals(7, validator.getWorkflowStepValidators().get("create_connector").getInputs().size());
assertEquals(1, validator.getWorkflowStepValidators().get("create_connector").getOutputs().size());
- The repeated string "create_connector" corresponds to the class name and should be replaced by
CreateConnectorStep.NAME. - There's no need to check the presence of the key in the keyset, the next line will throw an NPE when
get()returns null and we try to get inputs or outputs. - We should refactor to avoid duplication of the required inputs. Currently each workflow step defines its required inputs inline in the code (as a
Set<String>) and separately in theWorkflowStepValidatorclass (as aList<String>). This duplication should be removed by making the collection a class field with a getter similarly to howNAMEis declared. - We don't currently define the outputs in the
WorkflowStepimplementation classes, but we should do so to keep that declaration closer to where we actually populate theWorkflowDatawhen the step is complete.
What alternatives have you considered?
Leave the code as-is and keep adding to it.
Do you have any additional context?
https://en.wikipedia.org/wiki/Don%27t_repeat_yourself
@dbwiddis can i pick this issue?
@dbwiddis can i pick this issue?
i do see other strings are also repeated! and can be as well refactored.
Hey @ajay83 go for it!
@dbwiddis can you kindly elaborate more about point 3 , as per point 3 getInputs currently returns List.copyOf(inputs) and this should be class field as NAME? same for the output.
@dbwiddis can you kindly elaborate more about point 3 , as per point 3 getInputs currently returns List.copyOf(inputs) and this should be class field as NAME? same for the output.
The NAME field is an example: a public static field.
Currently we just do a temporary assignment of required inputs / keys with this assignment (and similar in every step): https://github.com/opensearch-project/flow-framework/blob/24bf51a69f413a0d6e43c3a9d9c8912e9890bd26/src/main/java/org/opensearch/flowframework/workflow/CreateConnectorStep.java#L124-L132
Elsewhere when we validate the template we are independently creating a list of exactly the same inputs: https://github.com/opensearch-project/flow-framework/blob/24bf51a69f413a0d6e43c3a9d9c8912e9890bd26/src/main/java/org/opensearch/flowframework/workflow/WorkflowStepFactory.java#L115-L117
My suggestion is to change the requiredKeys assignment to a (public static) class variable for the step, and access it from the WorkflowSteps enum instead of recreating it.
Point 4 would do the same thing for the outputs. They are not currently defined in each workflow step, but we should define them there, as the eventual assignment happens there and it's easier to make sure they are in alignment (we had a recent issue where we discovered a mismatch much later.) This may be a bit more complicated in the cases that we look it up from the resource enum. https://github.com/opensearch-project/flow-framework/blob/24bf51a69f413a0d6e43c3a9d9c8912e9890bd26/src/main/java/org/opensearch/flowframework/workflow/CreateConnectorStep.java#L98
Hey @ajayl83 how's it going with this issue? Need any help?
Hey @ajayl83 I'm going to unassign this to let someone else work on it. If you are around and want to re-claim it just let us know.