NIFI-15077 : Added conflict resolution strategies for file move operations in FetchFileTransfer
Summary
- Introduced MOVE_CONFLICT_RESOLUTION property to handle filename collisions during file move
- Implemented logic for conflict resolution strategies (REPLACE, RENAME, IGNORE, etc.)
- Refactored filename normalization and path handling utilities
- Updated FetchFTP and FetchSFTP processors for enhanced conflict resolution support
- Added corresponding unit tests and utility methods for unique filename generation in conflicts
Tracking
Please complete the following tracking steps prior to pull request creation.
Issue Tracking
- [ ] Apache NiFi Jira issue created
Pull Request Tracking
- [ ] Pull Request title starts with Apache NiFi Jira issue number, such as
NIFI-00000 - [ ] Pull Request commit message starts with Apache NiFi Jira issue number, as such
NIFI-00000
Pull Request Formatting
- [ ] Pull Request based on current revision of the
mainbranch - [ ] Pull Request refers to a feature branch with one commit containing changes
Verification
Please indicate the verification steps performed prior to pull request creation.
Build
- [ ] Build completed using
./mvnw clean install -P contrib-check- [ ] JDK 21
- [ ] JDK 25
Licensing
- [ ] New dependencies are compatible with the Apache License 2.0 according to the License Policy
- [ ] New dependencies are documented in applicable
LICENSEandNOTICEfiles
Documentation
- [ ] Documentation formatting appears as expected in rendered files
Thanks for proposing this improvement @ravinarayansingh. The concept makes sense, and the basic approach looks good. I noted some recommendations on a few implementation and logging details.
Thanks for review, @exceptionfactory. I’ve implemented the suggested changes. Please take another look when you have a chance.
Thanks for the updates @ravinarayansingh. This looks better, but handling
IGNORE,REJECT, andFAILwith the same case does not seem correct. The IGNORE and NONE options should not log a warning as previously mentioned, and the other two options require routing to a different location.
Thanks for the feedback, @exceptionfactory. Just to confirm, for the IGNORE and NONE conflict resolution strategies, I’ll keep the log level as INFO, like this:
case FileTransfer.CONFLICT_RESOLUTION_IGNORE:
case FileTransfer.CONFLICT_RESOLUTION_NONE:
getLogger().info("Configured to {} on move conflict for {}. Original remote file will be left in place.", strategy, flowFile);
return;
For the other two options — REJECT and FAIL — should these also log a warning, or should they additionally route the FlowFile to a different relationship (for example, failure or reject)?
Thanks for the updates @ravinarayansingh. This looks better, but handling
IGNORE,REJECT, andFAILwith the same case does not seem correct. The IGNORE and NONE options should not log a warning as previously mentioned, and the other two options require routing to a different location.Thanks for the feedback, @exceptionfactory. Just to confirm, for the IGNORE and NONE conflict resolution strategies, I’ll keep the log level as INFO, like this:
case FileTransfer.CONFLICT_RESOLUTION_IGNORE: case FileTransfer.CONFLICT_RESOLUTION_NONE: getLogger().info("Configured to {} on move conflict for {}. Original remote file will be left in place.", strategy, flowFile); return;
Yes, for IGNORE and NONE, and INFO log looks good.
For the other two options — REJECT and FAIL — should these also log a warning, or should they additionally route the FlowFile to a different relationship (for example, failure or reject)?
Those options should log a warning, and route to the appropriate relationship, based on the description for each value.
Hi @exceptionfactory I have refactored the FetchFileTransfer processor to improve post-commit and pre-commit handling logic. Centralized MOVE and DELETE completion strategies, introduced detailed failure reasons for better debugging, and added new REL_REJECT and REL_FAILURE relationships for more robust error management. Updated corresponding unit tests to align with the revised flow. please have a look
Thanks for the updates @ravinarayansingh.
It looks like the changes resulted in some unit test failures:
TestFTP.testFetchFtpUnicodeFileName:350 expected: <1> but was: <0> TestFTP.testFetchFtp:257 » IndexOutOfBounds Index 0 out of bounds for length 0I should have considered this in my previous reply, but the changes surface the fact that the
FetchFTPandFetchSFTPProcessors do not havefailureandrejectrelationships. They have more specific types of failure relationships, (not found, permission denied, and comms failure), but those do not immediately align with the other failure conditions.Introducing new relationships creates potential migration problems if not handled programmatically, such as auto-terminating or migrating. That could be done, but on balance, adding new relationships seems to introduce additional complexity for a conditional scenario, which is not ideal.
Taking another look at the options, I think the
rejectandfailcompletion strategies should be removed and not supported in this Processor. Instead, limiting support toignoreorrenamewould avoid introducing new relationships, and still provide more flexibility.How does that sound?
Hi @exceptionfactory That makes perfect sense, I agree with you. Limiting the completion strategies to ignore and rename sounds like the right approach. It keeps the design simpler and avoids unnecessary migration or compatibility issues.
Hi @exceptionfactory
I’ve updated the code with the following changes to simplify the FetchFileTransfer completion strategy logic:
- Removed redundant conflict resolution strategies (REJECT, FAIL) and relationships (REL_REJECT, REL_FAILURE).
- Refactored MOVE conflict handling to streamline conditions and improve clarity.
- Introduced
getSimpleFilenameutility for filename parsing. - Updated logging to prioritize successful processing while handling edge cases gracefully.
please have look
Thanks for the updates @ravinarayansingh.
Unfortunately this change appears to be running into to some fundamental limitations of the current design of
FetchFileTransfer. Pushing down transfer relationship handling to separate methods, introducing short-circuit returns, and passing around large numbers of method arguments highlight some of the challenges.It is noteworthy that the
Completion Strategyproperty description indicates that if the strategy fails, a warning will be logged. With that in mind, it seems like the change should much more localized, avoiding FlowFile transfer and other operations in short-circuit methods.
Thanks for the feedback @exceptionfactory
I’ve refactored the implementation to make the handling more localized and aligned with the intended design of FetchFileTransfer. Specifically:
- Moved DELETE and MOVE strategies to a centralized pre-commit phase for consistent completion handling.
- Introduced detailed failure reason constants to clearly identify each failure scenario.
- Updated FlowFile routing to use
REL_PERMISSION_DENIEDandREL_COMMS_FAILUREbased on the specific failure type, improving diagnostic clarity. - Enhanced unit tests to cover the updated completion strategy logic and verify the new failure conditions.
This refactoring keeps the behavior consistent with the Completion Strategy description while minimizing side effects and improving maintainability.