flow-framework icon indicating copy to clipboard operation
flow-framework copied to clipboard

[REFACTOR] Eliminate duplication of required input / output List/Set

Open dbwiddis opened this issue 1 year ago • 7 comments

Is your feature request related to a problem?

The WorkflowValidatorTests class contains a lot of repetitive code that could be refactored to be more general.

This test class was originally written to validate the JSON file. Following #523 we now have an Enum and easier access to programmatically perform these tests, so it can be simplified.

What solution would you like?

The manual creation of a map here can be reduced to a single statement, which I've done as part of #530 .

The Input/Output size checks here could be improved.

Taking one entry as an example:

assertTrue(validator.getWorkflowStepValidators().keySet().contains("create_connector"));
assertEquals(7, validator.getWorkflowStepValidators().get("create_connector").getInputs().size());
assertEquals(1, validator.getWorkflowStepValidators().get("create_connector").getOutputs().size());
  1. The repeated string "create_connector" corresponds to the class name and should be replaced by CreateConnectorStep.NAME.
  2. There's no need to check the presence of the key in the keyset, the next line will throw an NPE when get() returns null and we try to get inputs or outputs.
  3. We should refactor to avoid duplication of the required inputs. Currently each workflow step defines its required inputs inline in the code (as a Set<String>) and separately in the WorkflowStepValidator class (as a List<String>). This duplication should be removed by making the collection a class field with a getter similarly to how NAME is declared.
  4. We don't currently define the outputs in the WorkflowStep implementation classes, but we should do so to keep that declaration closer to where we actually populate the WorkflowData when the step is complete.

What alternatives have you considered?

Leave the code as-is and keep adding to it.

Do you have any additional context?

https://en.wikipedia.org/wiki/Don%27t_repeat_yourself

dbwiddis avatar Feb 22 '24 00:02 dbwiddis

@dbwiddis can i pick this issue?

ajayl83 avatar Feb 22 '24 09:02 ajayl83

@dbwiddis can i pick this issue?

i do see other strings are also repeated! and can be as well refactored.

ajayl83 avatar Feb 22 '24 09:02 ajayl83

Hey @ajay83 go for it!

dbwiddis avatar Feb 22 '24 15:02 dbwiddis

@dbwiddis can you kindly elaborate more about point 3 , as per point 3 getInputs currently returns List.copyOf(inputs) and this should be class field as NAME? same for the output.

ajayl83 avatar Feb 22 '24 16:02 ajayl83

@dbwiddis can you kindly elaborate more about point 3 , as per point 3 getInputs currently returns List.copyOf(inputs) and this should be class field as NAME? same for the output.

The NAME field is an example: a public static field.

Currently we just do a temporary assignment of required inputs / keys with this assignment (and similar in every step): https://github.com/opensearch-project/flow-framework/blob/24bf51a69f413a0d6e43c3a9d9c8912e9890bd26/src/main/java/org/opensearch/flowframework/workflow/CreateConnectorStep.java#L124-L132

Elsewhere when we validate the template we are independently creating a list of exactly the same inputs: https://github.com/opensearch-project/flow-framework/blob/24bf51a69f413a0d6e43c3a9d9c8912e9890bd26/src/main/java/org/opensearch/flowframework/workflow/WorkflowStepFactory.java#L115-L117

My suggestion is to change the requiredKeys assignment to a (public static) class variable for the step, and access it from the WorkflowSteps enum instead of recreating it.

Point 4 would do the same thing for the outputs. They are not currently defined in each workflow step, but we should define them there, as the eventual assignment happens there and it's easier to make sure they are in alignment (we had a recent issue where we discovered a mismatch much later.) This may be a bit more complicated in the cases that we look it up from the resource enum. https://github.com/opensearch-project/flow-framework/blob/24bf51a69f413a0d6e43c3a9d9c8912e9890bd26/src/main/java/org/opensearch/flowframework/workflow/CreateConnectorStep.java#L98

dbwiddis avatar Feb 22 '24 16:02 dbwiddis

Hey @ajayl83 how's it going with this issue? Need any help?

dbwiddis avatar Mar 08 '24 03:03 dbwiddis

Hey @ajayl83 I'm going to unassign this to let someone else work on it. If you are around and want to re-claim it just let us know.

dbwiddis avatar Apr 22 '24 03:04 dbwiddis