[FSTORE-862][APPEND] Don't provide the current offsets by default when starting job
This PR adds/fixes/changes...
- please summarize your changes to the code
- and make sure to include all changes to user-facing APIs
JIRA Issue: -
Priority for Review: -
Related PRs: -
How Has This Been Tested?
- [ ] Unit Tests
- [ ] Integration Tests
- [ ] Manual Tests on VM
Checklist For The Assigned Reviewer:
- [ ] Checked if merge conflicts with master exist
- [ ] Checked if stylechecks for Java and Python pass
- [ ] Checked if all docstrings were added and/or updated appropriately
- [ ] Ran spellcheck on docstring
- [ ] Checked if guides & concepts need to be updated
- [ ] Checked if naming conventions for parameters and variables were followed
- [ ] Checked if private methods are properly declared and used
- [ ] Checked if hard-to-understand areas of code are commented
- [ ] Checked if tests are effective
- [ ] Built and deployed changes on dev VM and tested manually
- [x] (Checked if all type annotations were added and/or updated appropriately)
Are you sure this PR is correct? Looks to me that we always end up in the first branch of the if statement as the initial_check_point is always not empty string as we set it here:
https://github.com/logicalclocks/feature-store-api/blame/25cfcd57ad792a3b6a732570943692c49b406fbc/python/hsfs/engine/python.py#L956
It doens't seem there is a way of controlling the skip_offset parameter in the first branch and the offsets are always skipped.
Additionally the skip_offset parameter is not documented anywhere in the APIs. Please add the proper documentation in the insert method.
I think everything should be correct.
initial_check_point will be empty if topic doesn't exists (for example after upgrade).
In the first if statement (here: https://github.com/logicalclocks/feature-store-api/blame/25cfcd57ad792a3b6a732570943692c49b406fbc/python/hsfs/engine/python.py#L1016) we always run materialization job setting the initial offset to 0 for all partitions of topic since the topic didn't exist and job should start from the beginning (done here: https://github.com/logicalclocks/feature-store-api/blame/25cfcd57ad792a3b6a732570943692c49b406fbc/python/hsfs/engine/python.py#L1028)
@SirOibaf I also change the skip_offset parameter name to use_current_offsets as i think its more descriptive.