RepoSense [#2141] Use optionals for return values that may be null to enforce a client check

Part of #2141

Proposed commit message

Currently, `null`s are used within some parts of the code. While it is
not a bad practice to use `null`s within the code, it can be
problematic when `null` values are present in places where `null` is
not allowed.

This PR aims to begin the implementation of `Failable` monad,
unit tests for `Failable` monad, as well as the refactoring of
some areas of the code where `null` is explicitly used as predicates or
as a flag for starting other processes.

Other information

Main Rationale Behind FailableOptional Monad

Currently, there are chunks of code whereby nulls are used as a flag or are part of a process whereby exceptions are thrown or messages are logged. The built-in Java Optional class (in Java 8) does not support some of the operations or require lengthy workarounds to replicate the behaviour we want directly using Optional.

For example, consider CommitInfoAnalyzer::analyzeCommit:

public CommitResult analyzeCommit(CommitInfo commitInfo, RepoConfiguration config) {
        // truncated
        ZonedDateTime date = null;
        try {
            date = ZonedDateTime.parse(elements[DATE_INDEX], GIT_STRICT_ISO_DATE_FORMAT);
        } catch (DateTimeParseException pe) {
            logger.log(Level.WARNING, "Unable to parse the date from git log result for commit.", pe);
        }

        // Commit date may be in a timezone different from the one given in the config.
        LocalDateTime adjustedDate = date.withZoneSameInstant(config.getZoneId()).toLocalDateTime();

        // truncated

        return new CommitResult(author, hash, isMergeCommit, adjustedDate, messageTitle, messageBody, tags,
                fileTypeAndContributionMap);
    }

Using null in this manner is dangerous if DateTimeParseException is thrown, since date would have the value null, and the subsequent invocation of date.withZoneSameInstant would fail due to a NullPointerException. Optional could be used here, but it is rather clunky because Optional's interface does not handle the production of values that might throw exceptions. Moreover, the other mapping and filtering methods do not allow lambdas or methods which throw exceptions.

The use of FailableOptional can help solve this, while mitigating the risk of invoking methods on null, by wrapping up the function call that potentially throws an exception, and working with it as if it had not thrown, shielding the user from the complexity of handling the null values.

Consider this rewritten chunk of code:

public CommitResult analyzeCommit(CommitInfo commitInfo, RepoConfiguration config) {
        // truncated
        // safe map since ZonedDateTime::now returns non-null
        Failable<LocalDateTime> date = Failable
                .ofNullable(() -> ZonedDateTime.parse(elements[DATE_INDEX], GIT_STRICT_ISO_DATE_FORMAT))
                .ifFail(x ->
                        logger.log(Level.WARNING, "Unable to parse the date from git log result for commit.", x))
                .resolve(x -> {}, ZonedDateTime.now())
                .map(x -> x.withZoneSameInstant(config.getZoneId()).toLocalDateTime());

        // truncated

        return new CommitResult(author, hash, isMergeCommit, date.get(), messageTitle, messageBody, tags,
                fileTypeAndContributionMap);
    }

This following implementation would not be subjected to the above buggy behaviour, as the potentially throwing method is wrapped into a Fail-ed instance of Failable, and subsequently recovered into a valid default LocalDateTime value.

So far, Failable supports the usual operations that Optional can do (identity (of, ofNullable), mapping (map, flatmap) and filtering (filter)), and support new operations that might be useful in our codebase (handling methods that might throw exceptions when executed but otherwise returns a value, etc.).

So far, the following methods have been implemented:

Identity

of
ofNullable
success
empty
fail
isPresent
isAbsent
isFailed

Mapping

map
unfailableMap
flatMap
unfailableFlatMap

Filtering

filter

Getters

get
orElse

Misc. methods

ifPresent
ifAbsent
ifFail
resolve
ifFailThenThrow
failWith

Here are some of the main processes that we need to complete to close out this Issue/PR. The relevant PRs which fulfil the tasks are included in the parenthesis following the task name below:

[X] Implement Failable monad (#2144)
[x] Implement test cases for Failable (#2144)
[ ] [IN PROGRESS] Identify areas where Failable can be used (#2144)
[ ] [IN PROGRESS] Implement Failable in key areas identified (#2144)
[ ] [IN PROGRESS] Update overall test cases (#2144)
[ ] Bring code coverage back up to acceptable levels

Mar 04 '24 08:03 georgetayqy

In line with keeping PRs small and working on issues incrementally, I have decided to stop working on the implementation of FailableOptional for now and limit this PR to a proof of concept / potential implementation of the Optional alternatives we can use.

Future PRs will look into the implementation of FailableOptional in other places in the codebase wherever needed.

Mar 09 '24 14:03 georgetayqy

Just an initial thought: It would be nice if exceptions were part of the Failable optional type (e.g FailableOptional<LocalDateTime, DateTimeParseException>) which indicates that there exists the possibility of a throw which is still unhandled. Composing FailableOptionals with different unhandled exceptions should produce a FailableOptional with the sum type of these exceptions (this may be hard to do with Java 8/11). Handling the error, then gets rid of the exception from the type.

Inspiration for monads of this kind is from Scala Zio and effect-ts.

Mar 18 '24 06:03 gok99

@gok99 Instead of compositing types like that, perhaps we could store exceptions thrown in a list in a failed instance instead? Every time the FailableOptional is mapped, the exceptions are carried over into the mapped instance instead? A special map function could also be used to add on new possibly unhandled exceptions too:

e.g.

@Override
public <U> FailableOptional<U> tryWith(ThrowingFunction<T, U, E> function, Exception ex) {
    // if failed, add ex to thrown exception list, replicate it on new instance of fail?
    return new Fail<>(this.thrownExceptionsList, ex);
}

One of the problems with this is that I cannot retreive Class information from Generic types, so I would need users to specify the exception class that can be thrown explicitly in order to save it.

Mar 18 '24 12:03 georgetayqy

@asdfghjkxd I suppose this allows for a runtime check that all exceptions are handled, but doing checks like that at runtime doesn't seem like something we'd want to do.

Exception handling checks at runtime seen a lot less useful, so given the awkwardness of having to separately pass in exceptions, may not be worth doing. Let me know if there's a use case you think will be useful.

Mar 19 '24 01:03 gok99

@gok99 I see, so far, I don't really think that there is much use passing in exceptions for checks as of yet; most of our functions that can benefit from FailableOptional appears to only throw once, rendering the saving of Exceptions rather meaningless.

I will look into the above recommendations that you have provided on the Scala ZIO monad, and update this PR with new code/findings!

Mar 19 '24 03:03 georgetayqy

I will look into the above recommendations that you have provided on the Scala ZIO monad, and update this PR with new code/findings!

Sure, you can give it a shot, but no pressure to make it work - Java does handicap us quite a bit

Mar 19 '24 04:03 gok99

@gok99 I have implemented a new Failable monad which is largely based on Scala's ZIO.

There are some compromises that need to be made:

"Reflective" access is required to assert the type of exception thrown by any of the Throwable* interface is indeed of the correct type (though I'm not so sure about this, perhaps you could advise me more on how I can achieve this? I could do an unchecked cast from the Throwable type catch to the generic type, which should be safe enough in principle since the type is already declared by the Throwable* interfaces)

private static <T, E extends Throwable> Failable<T, E> ifFailElseThrow(Throwable throwable) {
    if (new Fail<>(throwable).getExceptionClass().isInstance(throwable)) {
        @SuppressWarnings("unchecked")
        Failable<T, E> failed = (Failable<T, E>) new Fail<>(throwable);
        return failed;
    }

    throw new ClassCastException("Exception class does not match specifications");
}

When a failed instance is flatMap-ed, there is an implicit assumption that the user knows that the type of exception stored within the instance is of the original exception type and that the flatmap should not change the exception type into something that is not a subclass of the original exception type. We could restrict the type of exceptions thrown by functions used for flatmap-ping to exceptions that are subclasses of the original exception type, and permit exception type changes only with the failWith method (which requires users to specify the exception instance to replace the one stored in the class), but I would like to hear your opinion on this.

public <U, Z extends Throwable> Failable<U, Z> flatMap(
        ThrowableFunction<? super T, ? extends Failable<? extends U, Z>, Z> throwableFunction) {
    return Failable.ifFailElseThrow(this.exception);
}

There is a non-throwing version of flatMap that uses Java's functional interface Function, but I am not sure if this alone would suffice.

Mar 21 '24 15:03 georgetayqy

"Reflective" access is required to assert the type of exception thrown by any of the Throwable* interface is indeed of the correct type

Could you describe why the reflective access is required? Specifically, is there a situation in which it is possible for new Fail<>(throwable).getExceptionClass().isInstance(throwable) to be false?

there is an implicit assumption that the user knows that the type of exception stored within the instance is of the original exception type and that the flatmap should not change the exception type into something that is not a subclass of the original exception type

Do you mean that we aren't currently checking for this but we expect that the user maintains this discipline? I think this is not in general very realistic and in most cases would require big refactors of exceptions to make sure they can compose. Scala ZIO has the expressivity of sum types so the handling of composed exceptions and exception handling is made much more natural. Maybe a compromise is to assert that flatMaps are done only with CannotFailExceptions, so exceptions need to be handled promptly. Java technically also has tuple types that we might be able to use to patch together exception composition, but that would get very messy quickly. What do you think?

Mar 24 '24 15:03 gok99

@gok99 Reflective access was previously required to ensure that the functions that users define can only throw the declared type or subtype of exceptions. However, upon further testing, I realised that the function isn't working as intended; all exceptions are instances of Throwable and hence no exceptions are thrown.

For now, flatmap requires users to not throw incompatible types of exception in the methods. Union types are not available in Java 11 (union types are only available in Java 17 through the sealed interface feature, but even then it does not offer the same kind of type unionisation that we need for our use case), and using tuple types seems viable for maintaining type safety, but it does seem rather clunky and does not appear to offer type safety if you wish to obtain a stored value. I could explore it if you wish, and report back with any findings that I may have.

But in the meantime, perhaps we could restrict flatmap to only map to the same exception type, and require users to explicitly define a new exception type if necessary?

Mar 25 '24 04:03 georgetayqy

As discussed, Java is probably not well equipped for these kinds of type gymnastics. Let's instead (of paramaterizing with exceptions) require that throwing optionals are given default handlers (that can be dynamically rebound later) as a compromise.

Mar 30 '24 18:03 gok99

Hi, We are going to mark this PR as stale because it has been inactive for the past 30 days. If no further activity occurs within the following 7 days, it will be automatically closed so that others can take up the issue. If you are still working on this PR, please make a follow-up commit within 7 days and leave a comment to remove the stale label. Do let us know if you are stuck so that we can help you!'

May 05 '24 00:05 github-actions[bot]

Hi, We are going to mark this PR as stale because it has been inactive for the past 30 days. If no further activity occurs within the following 7 days, it will be automatically closed so that others can take up the issue. If you are still working on this PR, please make a follow-up commit within 7 days and leave a comment to remove the stale label. Do let us know if you are stuck so that we can help you!'

Jun 06 '24 00:06 github-actions[bot]

Hi, We are going to mark this PR as stale because it has been inactive for the past 30 days. If no further activity occurs within the following 7 days, it will be automatically closed so that others can take up the issue. If you are still working on this PR, please make a follow-up commit within 7 days and leave a comment to remove the stale label. Do let us know if you are stuck so that we can help you!'

Jul 10 '24 00:07 github-actions[bot]

This PR was closed because it has been marked as stale for 7 days with no activity. Feel free to reopen this PR if you would like to continue working on it.

Jul 17 '24 00:07 github-actions[bot]

RepoSense RepoSense copied to clipboard

[#2141] Use optionals for return values that may be null to enforce a client check

Proposed commit message

Other information

RepoSense
RepoSense copied to clipboard