yavi icon indicating copy to clipboard operation
yavi copied to clipboard

failFast of nested validator not working

Open runeflobakk opened this issue 2 years ago • 12 comments

Hello,

I believe (given if I have understood the mechanics properly) that the failFast facility does not work if a validator with failFast==true is used in a nested way, e.g. either with .nest(..) or .forEach(..).

The following test demonstrates this:

record B(int x, String y) {}
record A(B b, List<B> bs) {}

Validator<B> bValidator = ValidatorBuilder.of(B.class)
    .constraint(B::x, "x", c -> c.greaterThan(0))
    .constraint(B::y, "y", c -> c.equalTo("y"))
    .failFast(true)
    .build();

var invalidB = new B(0, "not y");
var bViolations = bValidator.validate(invalidB).violations().stream().map(ConstraintViolation::messageKey).toList();
assertThat(bViolations, contains(NUMERIC_GREATER_THAN.messageKey()));


Validator<A> aValidator = ValidatorBuilder.of(A.class)
        .nest(A::b, "b", bValidator)
        .build();

var aWithInvalidBs = new A(invalidB, List.of(invalidB));
var aViolations = aValidator.validate(aWithInvalidBs).violations().stream().map(ConstraintViolation::messageKey).toList();
assertAll("should be only one violation, because greaterThan(0) fails and failFast==true",
        () -> assertThat(aViolations, not(hasItem(OBJECT_EQUAL_TO.messageKey()))),
        () -> assertThat(aViolations, contains(NUMERIC_GREATER_THAN.messageKey()))
    );
Complete JUnit 5 test source code
import am.ik.yavi.builder.ValidatorBuilder;
import am.ik.yavi.core.ConstraintViolation;
import am.ik.yavi.core.Validator;
import org.junit.jupiter.api.Test;
import java.util.List;
import static am.ik.yavi.core.ViolationMessage.Default.NUMERIC_GREATER_THAN;
import static am.ik.yavi.core.ViolationMessage.Default.OBJECT_EQUAL_TO;
import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.contains;
import static org.hamcrest.Matchers.hasItem;
import static org.hamcrest.Matchers.not;
import static org.junit.jupiter.api.Assertions.assertAll;

class NestedFailFastValidator {

@Test
void failFastOfNestedValidator() {
    record B(int x, String y) {}
    record A(B b, List<B> bs) {}

    Validator<B> bValidator = ValidatorBuilder.of(B.class)
        .constraint(B::x, "x", c -> c.greaterThan(0))
        .constraint(B::y, "y", c -> c.equalTo("y"))
        .failFast(true)
        .build();

    var invalidB = new B(0, "not y");
    var bViolations = bValidator.validate(invalidB).violations().stream().map(ConstraintViolation::messageKey).toList();
    assertThat(bViolations, contains(NUMERIC_GREATER_THAN.messageKey()));


    Validator<A> aValidator = ValidatorBuilder.of(A.class)
            .nest(A::b, "b", bValidator)
            .build();

    var aWithInvalidBs = new A(invalidB, List.of(invalidB));
    var aViolations = aValidator.validate(aWithInvalidBs).violations().stream().map(ConstraintViolation::messageKey).toList();
    assertAll("should be only one violation, because greaterThan(0) fails and failFast==true",
            () -> assertThat(aViolations, not(hasItem(OBJECT_EQUAL_TO.messageKey()))),
            () -> assertThat(aViolations, contains(NUMERIC_GREATER_THAN.messageKey()))
        );
}

}

The test passes the first assertions, demonstrating the failFast works when bValidator.validate(..) is used directly. The last assertion fails however, yielding the violation for equalTo("y") even though bValidator is set to failFast==true.

Assertion error:

org.opentest4j.MultipleFailuresError: should be only one violation, because greaterThan(0) fails and failFast==true (2 failures) java.lang.AssertionError: Expected: not a collection containing "object.equalTo" but: was <[numeric.greaterThan, object.equalTo]>

Changing the building of aValidator from .nest to .forEach(A::bs, "bs", bValidator) yields the same result.

YAVI version: 0.11.2

runeflobakk avatar May 16 '22 10:05 runeflobakk

@runeflobakk I think the behavior is as expected. Implementationally, failFast is determined by the flag of the parent Validator.

How does the below work?

Validator<A> aValidator = ValidatorBuilder.of(A.class)
        .nest(A::b, "b", bValidator)
        .failFast(true)
        .build();

making avatar May 16 '22 11:05 making

Oh, I was thinking that the failFast was determined on each validator. Thank you for your answer, the code you provide would work, but in general I want to collect all violations, but for certain more heavy validations, I would like to use some lightweight "guard" validations to be able to short-circuit to avoid trying to validate if I already know that the more heavy validation will fail. I guess I thought I could use failFast for more fine grained control on the constraint checking than it is designed to.

My case for using failfast is that I want some more computational heavy validations to occur only if certain preceeding light-weight validations passes.

It makes sense though, that if setting failFast on a parent validator, nested validators should also be failFast. But at the same time, it is sort of a surprise that specifying a trait (failFast=true) of a specific validator instance, to have it overridden when used by a parent validator. Or it may make sense for some cases, and not for others. In my case, I use failFast=true to achieve a certain behavior of my specific validator. Maybe the failFast property is a candidate for something more expressive than a simple boolean flag (as in true/false/unspecified)? Or maybe what I am trying to do is not applicable for the intended use of the failFast property.

What are your thoughts? Maybe there are some facility I have overlooked to combine constraints in a chain, where if one fails, the remaining one are not executed? I guess I could implement this in one CustomConstraint, but I am not sure if I will be able to produce different kinds of ViolationMessage depending on which "step" fails validation? From the top of my head, I could maybe do this with a kind of wrapper CustomConstraint which delegates to a chain of other constraints, and orchestrates this short-circuiting behaviour.

runeflobakk avatar May 16 '22 12:05 runeflobakk

Feel free to move this to discussions to avoid cluttering the issue management, or let me know, and I can move it.

runeflobakk avatar May 16 '22 12:05 runeflobakk

Discussing here is fine for me. This feature is requested by a user. https://github.com/making/yavi/issues/140 Feedback is important as I don't know how this feature will be used.

From the implementation point of view, it's simple and easy to respect failFast set in the collection validator. The collection validator is purely delegated, and the current implementation explicitly propagates the parent failFast property. https://github.com/making/yavi/blob/0.11.2/src/main/java/am/ik/yavi/core/Validator.java#L220 The fix would only erase this line.

On the other hand, the nested validator only "flattens" the constrains in the Validator Builder, which could require major code changes to use the failedFast property of the nested validator.

making avatar May 16 '22 13:05 making

Fail-fast is specifying how a single validator is behaving, and when set to true for a specific validator, it may be surprising to have this behavior cancelled depending on how the validator is used (i.e. directly or invoked by a parent validator). If the intended use of failFast is to control if one is interested in only one violation at a time, then perhaps the correct place to express it is in the actual invocation of Validator.validate(..). This way a validator may be used in various contexts where in some cases one is interested in only the first encountered violation, and other cases where one wants all violations which the validator collects.

With failFast={true|false} specified on the actual validate-invocation, one should be free to use failFast on Validator-instances for deciding how a single validator-instance should behave.

Pseudo-code examples:

  • validator.validate(.., failFast=true) effectively sets the invocation to terminate on the first constraint-violation and return it
  • validator.validate(.., failFast=false) (default) collects all constraint violations yielded by the validator and any sub-validators. Each validator in use then decides themselves if they behave as fastFast or not, the invoked validator-instance does not need to know anything about this.

There is still a choice that needs to be made on how the flag should be passed down to nested validators. I think the semantics of the failFast flag (either get first or collect all) makes it like a candidate for passing downstream only when true, because if set to true at the top level, one is only interested in the first violation regardless on how the downstream validators is defined wrt failFast or not. But if failFast=false on the top level, the downstream validators should still be free to decide if the output from themselves is a maximum of one violation, or several, and that is not a concern of the validator invoking the downstream validator.

The more I think of this, I think it makes sense to require the top-level validator to have set failFast=true if one wants to guarantee that max one violation is yielded (short-circuit on first violation). And I would expect this to be how such behavior is already set up. And if the top level (or validator on any level above another validator) has set failFast=false, it should not care if any downstream validator it invokes is configured as failFast or not.

This means I am back to propagate the parent failFast only if it set to true, and this should happen recursively when traversing a tree of validators. So specifying failFast on invocation, as I initially thought, is not be necessary at all. If one needs a fail-fast validator, one can simply invoke Validator.failFast(true).validate(..) as it is done internally by the Validator instance itself.

It should be allowed for a parent validator to decide that it wants maximum one constraint violation, but the parent should not be concerned about whether a nested validator is internally set to yield a maximum of one violation, if the parent itself is set to failFast=false. A parent validator should do Validator.failFast(true).validate(..) only if itself has failFast=true.

This was a lot of emptying my head on how I think about this. Does this make sense to you? :)

runeflobakk avatar May 16 '22 14:05 runeflobakk

From an API design perspective, it may make more sense if the Validator.failFast method is solely to turn on failFast. I.e. the invocation on Validator should read validator.failFast(), and return a new instance (as it is now) which will return maximum one constraint violation. Internally, this is most likely implemented (as it is now) to immediately yield when the first failing constraint is encountered. This also would make sense internally in Validator to propagate failFast to downstream validators, i.e. it should be invoked only in the case of the calling validator is failFast==true, and never as the current API enables as .failFast(false).

This would also make sense in a hypothetical parallel implementation of Validator. A parent validator may delegate to several sub-validators in parallel, and if failFast is turned on it will yield the first failing constraint which is encountered, even though multiple constraints may have had time to be checked. (This is just a thought experiment for rationalizing the design I am proposing. A parallel validator will probably have to deal with other challenges as well, e.g. cancelling started validator "branches" in the case of fail-fast)

The ValidatorBuilder.failFast(boolean)-method makes sense to keep as-is, because this is used at "configuration time" of a validator, and it enables parameterization if it should be fail-fast or not. If desired, a ValidatorBuilder.failFast()-method which invokes this.failFast(true) can be added, but this is not strictly necessary; it would just be a convenience.

runeflobakk avatar May 16 '22 17:05 runeflobakk

@lonre Sorry to throw this wall-of-text at you (I should probably apologize to @making as well 😅), but reading my propositions above, would this break any of your use of YAVI with failFast=true, as originally requested in #140?

In my understanding, for failFast=true to be effective in the current version of YAVI, it must be set at the top level validator, and this behavior will be preserved in the above proposed change. The proposed change is essentially to disable the ability of any parent validator to turn off failFast of a delegated validator which is already configured as failFast==true.

runeflobakk avatar May 16 '22 17:05 runeflobakk

Hi @runeflobakk It seems works fine for me, I just use it with the builder.

lonre avatar May 16 '22 19:05 lonre

I will consider redesign when developing 0.12. As I said earlier, it's unclear if this feature will be included, as it probably requires a lot of internal changes.

making avatar May 22 '22 15:05 making

I agree some internal changes is required for this to be implemented.

As you say, nested validators are flattened/"deconstructed" into constraint predicates, and thus loosing the nested validator's original failFast-setting. In my opinion, a validator should probably maintain a tree (or trees) of actual delegated validators to be able to properly compose the behavior or multiple validators. I think this will have the potential of a cleaner design. Perhaps each individual constraint predicate should really be a tiny delegated validator? Or something along that line of thinking? I think it may be a bit problematic if the API accepts Validator instances, giving the impression that validators can be properly composed, but really only some "traits" of the given Validator is preserved. Also, properly composing tree(s) of Validators I think has the potential for easier further development of behavioral features of validators, where each validators' behavior would be preserved through traversing, and does not necessarily has to be taken into account when nesting or other compositional operations.

I had a look at the internals of the Validator code, and I also realize that the evaluation of the conditional validators is sort of "cheating" the failfast-property, as it is currently implemented. If one sets constraintOnCondition(ConstraintCondition<T>, Validator<T>) and also using failFast(true), then still all of the constraints of the conditional validator will be evaluated internally, though just the first violation is actually returned. The reason for this, I presume, is that conditional validators are collected as Validatable, which does not have any method to specify failFast on them. https://github.com/making/yavi/blob/0.11.2/src/main/java/am/ik/yavi/core/Validator.java#L245-L255

Is this an issue you are aware of? Do you want me to register a separate issue for it?

runeflobakk avatar May 22 '22 19:05 runeflobakk

Yeah, I noticed it when I read the code for this issue. However, I'm not sure if I want to fix it right now as it is not determined what the behavior of failFast should be. Are you having problems with this?

making avatar May 23 '22 02:05 making

No, it is not a problem for me at this point. Just wanted to mention it :)

runeflobakk avatar May 23 '22 08:05 runeflobakk

@runeflobakk FYI, this is (finally) supported in YAVI 0.13.0

making avatar May 16 '23 08:05 making

@making That's great! Thank you!

runeflobakk avatar May 16 '23 09:05 runeflobakk