nifi icon indicating copy to clipboard operation
nifi copied to clipboard

NIFI-15077 : Added conflict resolution strategies for file move operations in FetchFileTransfer

Open ravinarayansingh opened this issue 3 months ago • 7 comments

Summary

NIFI-15077

  • Introduced MOVE_CONFLICT_RESOLUTION property to handle filename collisions during file move
  • Implemented logic for conflict resolution strategies (REPLACE, RENAME, IGNORE, etc.)
  • Refactored filename normalization and path handling utilities
  • Updated FetchFTP and FetchSFTP processors for enhanced conflict resolution support
  • Added corresponding unit tests and utility methods for unique filename generation in conflicts

Tracking

Please complete the following tracking steps prior to pull request creation.

Issue Tracking

Pull Request Tracking

  • [ ] Pull Request title starts with Apache NiFi Jira issue number, such as NIFI-00000
  • [ ] Pull Request commit message starts with Apache NiFi Jira issue number, as such NIFI-00000

Pull Request Formatting

  • [ ] Pull Request based on current revision of the main branch
  • [ ] Pull Request refers to a feature branch with one commit containing changes

Verification

Please indicate the verification steps performed prior to pull request creation.

Build

  • [ ] Build completed using ./mvnw clean install -P contrib-check
    • [ ] JDK 21
    • [ ] JDK 25

Licensing

  • [ ] New dependencies are compatible with the Apache License 2.0 according to the License Policy
  • [ ] New dependencies are documented in applicable LICENSE and NOTICE files

Documentation

  • [ ] Documentation formatting appears as expected in rendered files

ravinarayansingh avatar Oct 09 '25 03:10 ravinarayansingh

Thanks for proposing this improvement @ravinarayansingh. The concept makes sense, and the basic approach looks good. I noted some recommendations on a few implementation and logging details.

Thanks for review, @exceptionfactory. I’ve implemented the suggested changes. Please take another look when you have a chance.

ravinarayansingh avatar Oct 23 '25 04:10 ravinarayansingh

Thanks for the updates @ravinarayansingh. This looks better, but handling IGNORE, REJECT, and FAIL with the same case does not seem correct. The IGNORE and NONE options should not log a warning as previously mentioned, and the other two options require routing to a different location.

Thanks for the feedback, @exceptionfactory. Just to confirm, for the IGNORE and NONE conflict resolution strategies, I’ll keep the log level as INFO, like this:

case FileTransfer.CONFLICT_RESOLUTION_IGNORE:
case FileTransfer.CONFLICT_RESOLUTION_NONE:
    getLogger().info("Configured to {} on move conflict for {}. Original remote file will be left in place.", strategy, flowFile);
    return;
                            

For the other two options — REJECT and FAIL — should these also log a warning, or should they additionally route the FlowFile to a different relationship (for example, failure or reject)?

ravinarayansingh avatar Oct 29 '25 07:10 ravinarayansingh

Thanks for the updates @ravinarayansingh. This looks better, but handling IGNORE, REJECT, and FAIL with the same case does not seem correct. The IGNORE and NONE options should not log a warning as previously mentioned, and the other two options require routing to a different location.

Thanks for the feedback, @exceptionfactory. Just to confirm, for the IGNORE and NONE conflict resolution strategies, I’ll keep the log level as INFO, like this:

case FileTransfer.CONFLICT_RESOLUTION_IGNORE:
case FileTransfer.CONFLICT_RESOLUTION_NONE:
    getLogger().info("Configured to {} on move conflict for {}. Original remote file will be left in place.", strategy, flowFile);
    return;
                            

Yes, for IGNORE and NONE, and INFO log looks good.

For the other two options — REJECT and FAIL — should these also log a warning, or should they additionally route the FlowFile to a different relationship (for example, failure or reject)?

Those options should log a warning, and route to the appropriate relationship, based on the description for each value.

exceptionfactory avatar Oct 29 '25 13:10 exceptionfactory

Hi @exceptionfactory I have refactored the FetchFileTransfer processor to improve post-commit and pre-commit handling logic. Centralized MOVE and DELETE completion strategies, introduced detailed failure reasons for better debugging, and added new REL_REJECT and REL_FAILURE relationships for more robust error management. Updated corresponding unit tests to align with the revised flow. please have a look

ravinarayansingh avatar Oct 30 '25 22:10 ravinarayansingh

Thanks for the updates @ravinarayansingh.

It looks like the changes resulted in some unit test failures:

TestFTP.testFetchFtpUnicodeFileName:350 expected: <1> but was: <0>
TestFTP.testFetchFtp:257 » IndexOutOfBounds Index 0 out of bounds for length 0

I should have considered this in my previous reply, but the changes surface the fact that the FetchFTP and FetchSFTP Processors do not have failure and reject relationships. They have more specific types of failure relationships, (not found, permission denied, and comms failure), but those do not immediately align with the other failure conditions.

Introducing new relationships creates potential migration problems if not handled programmatically, such as auto-terminating or migrating. That could be done, but on balance, adding new relationships seems to introduce additional complexity for a conditional scenario, which is not ideal.

Taking another look at the options, I think the reject and fail completion strategies should be removed and not supported in this Processor. Instead, limiting support to ignore or rename would avoid introducing new relationships, and still provide more flexibility.

How does that sound?

Hi @exceptionfactory That makes perfect sense, I agree with you. Limiting the completion strategies to ignore and rename sounds like the right approach. It keeps the design simpler and avoids unnecessary migration or compatibility issues.

ravinarayansingh avatar Oct 31 '25 18:10 ravinarayansingh

Hi @exceptionfactory

I’ve updated the code with the following changes to simplify the FetchFileTransfer completion strategy logic:

  • Removed redundant conflict resolution strategies (REJECT, FAIL) and relationships (REL_REJECT, REL_FAILURE).
  • Refactored MOVE conflict handling to streamline conditions and improve clarity.
  • Introduced getSimpleFilename utility for filename parsing.
  • Updated logging to prioritize successful processing while handling edge cases gracefully.

please have look

ravinarayansingh avatar Oct 31 '25 21:10 ravinarayansingh

Thanks for the updates @ravinarayansingh.

Unfortunately this change appears to be running into to some fundamental limitations of the current design of FetchFileTransfer. Pushing down transfer relationship handling to separate methods, introducing short-circuit returns, and passing around large numbers of method arguments highlight some of the challenges.

It is noteworthy that the Completion Strategy property description indicates that if the strategy fails, a warning will be logged. With that in mind, it seems like the change should much more localized, avoiding FlowFile transfer and other operations in short-circuit methods.

Thanks for the feedback @exceptionfactory

I’ve refactored the implementation to make the handling more localized and aligned with the intended design of FetchFileTransfer. Specifically:

  • Moved DELETE and MOVE strategies to a centralized pre-commit phase for consistent completion handling.
  • Introduced detailed failure reason constants to clearly identify each failure scenario.
  • Updated FlowFile routing to use REL_PERMISSION_DENIED and REL_COMMS_FAILURE based on the specific failure type, improving diagnostic clarity.
  • Enhanced unit tests to cover the updated completion strategy logic and verify the new failure conditions.

This refactoring keeps the behavior consistent with the Completion Strategy description while minimizing side effects and improving maintainability.

ravinarayansingh avatar Nov 06 '25 08:11 ravinarayansingh