fluentassertions icon indicating copy to clipboard operation
fluentassertions copied to clipboard

`BeUpperCased()` and `BeLowerCased()` semantics are confusing

Open drewjcooper opened this issue 1 year ago • 24 comments

Description

I came across this issue recently and the discussion in #1768 highlights the confusion as well.

The names BeUpperCased() and BeLowerCased() imply to me an assertion that the subject string is in a form producible by ToUpper() or ToLower(), respectively. The example below illustrates the semantics I expect.

I'm going to suggest a breaking change to include in the next major version of FluentAssertions. I'm happy to work on this if accepted.

Can we change the implementation of BeUpperCased() and BeLowerCased() to match the semantics suggested by their names?

NotBeUpperCased()/NotBeLowerCased() would then change to assert that there are some lower-case/upper-case letters in the string.

We could add new assertions, maybe BeAllLowerCase()/BeAllUpperCase(), with the current semantics of the existing methods.

Complete minimal example reproducing the issue

var someString = /* any string value */;

var result = someString.ToLower();

result.Should().BeLowerCased();

Expected behavior:

I would expect this assertion to pass for any value of someString.

Actual behavior:

The assertion fails if there are non-letter characters in the string.

drewjcooper avatar Oct 19 '22 21:10 drewjcooper

Hi Andrew,

Do I understand it correctly that you want to introduce BeAllLowerCased() and BeAllUpperCased(), but with the current semantics of BeLowerCased() and BeUpperCased()? And then what do you consider to be the correct semantics for BeLowerCased() and BeUpperCased(), the example you provided at the bottom?

And lastly, you want that NotBeLowerCased() and NotBeUpperCased() asserts that any character in a string is not in lower casing or upper casing respectively?

Cheers!

rubenschild avatar Oct 25 '23 14:10 rubenschild

Ruben,

Essentially, yes. But the new methods should be present-tense - BeAllLowerCase() and BeAllUpperCase() (without the 'd'). It's an assertion about the contents of the string with no other context.

The existing past-tense methods (BeLowerCased(), BeUpperCased()) imply an assertion that the string is the result of a previous operation, namely ToLower() or ToUpper().

I would be happy to supply tests and an implementation in a PR if the project is happy to make this change.

drewjcooper avatar Oct 25 '23 22:10 drewjcooper

I think the original intent was to change the behavior of the existing methods so they use ToLower() under the hood. But adding two additional methods to ensure all characters are lower/upper cased could be useful.

dennisdoomen avatar Oct 26 '23 05:10 dennisdoomen

Regarding the NotBeLowerCased() and NotBeUpperCased() changes suggested by @drewjcooper, wouldn't it be a better idea to also introduce new assertions for these, that asserts whether any character in a string is either lower case or upper case, instead of changing the current behaviour? Otherwise that would mean major changes for developers using those assertions in their work. Those new assertions would then be named NotContainLowerCasing() and NotContainUpperCasing(), or some other name of course.

rubenschild avatar Oct 30 '23 08:10 rubenschild

Those new assertions would then be named NotContainLowerCasing() and NotContainUpperCasing(), or some other name of course

That would be much more self-explanatory. Although I suspect, it's not proper English.

dennisdoomen avatar Oct 30 '23 09:10 dennisdoomen

Although I suspect, it's not proper English.

I agree. What about NotHaveLowerCasing() and NotHaveUpperCasing(), or NotHaveAnyLowerCasing() and NotHaveAnyUpperCasing()?

Also curious what @drewjcooper thinks about this, and the naming.

rubenschild avatar Oct 30 '23 10:10 rubenschild

Regarding the NotBeLowerCased() and NotBeUpperCased() changes suggested by @drewjcooper, wouldn't it be a better idea to also introduce new assertions for these, that asserts whether any character in a string is either lower case or upper case, instead of changing the current behaviour?

The whole point of this issue is to change current behaviour, because the current behaviour does not (to my mind) match the semantics of the method names. I recognise that this is a breaking change, but that's what semver and release notes are for.

I first stumbled across this issue when writing a unit test that asserted that a user-provided string had been passed through .ToLower() before being persisted. In this particular case the string was an email address. I expected result.Should().BeLowerCased() to do what I wanted, but it failed unexpectedly because the string contained '@' and '.' which are not lower-case alpha chars.

The current implementation of .BeLowerCased() asserts that every character in the subject string is a lower-case alpha char, but that's not what I expect based on the method name.

The changes I'm proposing are:

method assertion semantics
BeLowerCased() subject is the result of a .ToLower() operation, or subject does not contain upper-case alpha chars
BeUpperCased() subject is the result of a .ToUpper() operation, or subject does not contain lower-case alpha chars
BeAllLowerCase() subject contains only lower-case alpha chars
BeAllUpperCase() subject contains only upper-case alpha chars
NotBeAllLowerCase() subject contains at least one char that is not a lower-case alpha
NotBeAllUpperCase() subject contains at least one char that is not an upper-case alpha

I'm not sure that the negative assertions, NotBeLowerCased() and NotBeUpperCased() make sense with these changes. In this case NotBeLowerCased() would mean "subject is not the result of a .ToLower() operation", but how do you assert that? If you're checking for at least one upper-case char then a subject string like "abcd" would fail the assertion even though the implementation does not call ToLower(), and is therefore correct.

drewjcooper avatar Oct 31 '23 01:10 drewjcooper

@drewjcooper I like this. What do you think @jnyrup ?

dennisdoomen avatar Oct 31 '23 08:10 dennisdoomen

@dennisdoomen It was good to meet you at NDC Sydney. Any update on this?

drew-cooper avatar Mar 05 '24 23:03 drew-cooper

@jnyrup any thoughts?

dennisdoomen avatar Mar 06 '24 07:03 dennisdoomen

I agree in changing the semantics of the current methods.

Before brain storming about all the methods we could add, I'd like to flip the question to: what is the least change we could do to fix this?

E.g. would this suffice?

Method Semantics
BeLowerCased subject is equal to subject.ToLowerInvariant()
BeUpperCased subject is equal to subject.ToUpperInvariant()
NotBeLowerCased Same as BeUpperCased
NotBeUpperCased Same as BeLowerCased

jnyrup avatar Mar 06 '24 16:03 jnyrup

would this suffice?

I don't think so.

The new semantics suggested are as you said for the first two methods:

Method Semantics
BeLowerCased subject is equal to subject.ToLowerInvariant()
BeUpperCased subject is equal to subject.ToUpperInvariant()

There may be people who are currently using BeLowerCased/BeUpperCased/NotBeLowerCased/NotBeUpperCased for their current semantics. To ease this transition I think we need methods they can switch to with names more closely matching the semantics. so:

Method Semantics
BeAllLowerCase() subject contains only lower-case alpha chars
BeAllUpperCase() subject contains only upper-case alpha chars
NotBeAllLowerCase() subject contains at least one char that is not a lower-case alpha
NotBeAllUpperCase() subject contains at least one char that is not an upper-case alpha

The semantics you list for NotBeLowerCased/NotBeUpperCased don't work because we want mixed case strings to succeed. I'm not sure these assertions make sense with these changes. In this case NotBeLowerCased would mean "subject is not the result of a .ToLowerInvariant() operation", but how do you assert that? If you're checking for at least one upper-case char then a subject string like "abcd" would fail the assertion even though the implementation does not call ToLowerInvariant(), and is therefore correct. Of course could leave it to the developer to use meaningful data in their tests, and then the semantic should be:

Method Semantics
NotBeLowerCased subject is not equal to subject.ToLowerInvariant()
NotBeUpperCased subject is not equal to subject.ToUpperInvariant()

drewjcooper avatar Mar 07 '24 03:03 drewjcooper

but how do you assert that?

Am I missing something?

Execute.Assertion
            .ForCondition(Subject != Subject.ToLowerInvariant())
            .BecauseOf(because, becauseArgs)
            .FailWith("...");

ITaluone avatar Mar 13 '24 10:03 ITaluone

To ease this transition I think we need methods they can switch to with names more closely matching the semantics.

The core library doesn't need every method imaginable. The few (I suspect) people who need the old semantics are free to copy over the current implementation.

I think it might easier at this point to fill out a truth table. Here's my take of what we've discussed.

  • * is what we seem to agree on changing
  • (T) is my interpretation of how the negative assertion should work on mixed casing string
  • ? is where I'm unsure.
"a" "A" "1" "aA"
BeUpperCased F T T* F
BeLowerCased T F T* F
NotBeUpperCased T F ? (T)
NotBeLowerCased F T ? (T)

jnyrup avatar Mar 13 '24 17:03 jnyrup

I agree with all of that @jnyrup. I do think that 1 is neither uppercased, nor lowercased, so my conclusion is that those should never cause a failure. In your table, that would mean a T.

dennisdoomen avatar Mar 13 '24 19:03 dennisdoomen

So, will 漢字 return false to all?

eNeRGy164 avatar Mar 13 '24 20:03 eNeRGy164

I agree with @jnyrup's truth table and @dennisdoomen's additions.

So, will 漢字 return false to all?

No. This is a string that doesn't change under To(Lower|Upper)Invariant() operations so, like "1", it would succeed for all four assertions. The full truth table for the planned semantics is:

  "a" "A" "1" "aA" "漢字"
BeUpperCased F T T F T
BeLowerCased T F T F T
NotBeUpperCased T F T T T
NotBeLowerCased F T T T T

drewjcooper avatar Mar 15 '24 03:03 drewjcooper

Agreed

dennisdoomen avatar Mar 15 '24 18:03 dennisdoomen

@jnyrup you agree to move this issue to "api-approved"?

dennisdoomen avatar Mar 15 '24 18:03 dennisdoomen

@eNeRGy164 I don't know what you're hinting at and I don't know any Asian languages, but I do know you usually have good points. Care to elaborate?

jnyrup avatar Mar 18 '24 20:03 jnyrup

Most Asian languages have no concept of upper or lowercase. So, you could argue that returning true is not correct.

I know most programming is very US/English centric, so we might ignore it. But it makes these methods quite problematic to get right where the naming makes sense and is correct. As these are judging the string contents, which could be any language, or no language at all.

eNeRGy164 avatar Mar 18 '24 20:03 eNeRGy164

Most Asian languages have no concept of upper or lowercase. So, you could argue that returning true is not correct.

TIL - thanks!

I think the viable path from here is to have sufficient documentation how these methods asserts the subject string is equal to ToUpperInvariant()/ToLowerInvariant().

jnyrup avatar Mar 18 '24 21:03 jnyrup

@dennisdoomen @jnyrup Is it okay if I start on a PR for this?

drewjcooper avatar May 03 '24 11:05 drewjcooper

Sure. Go ahead.

dennisdoomen avatar May 03 '24 13:05 dennisdoomen