HPCC-Platform
HPCC-Platform copied to clipboard
HPCC-28757 New StSizePeakEphemeral and StSizePeakSubgraphTemp for hash dedup
Type of change:
- [ ] This change is a bug fix (non-breaking change which fixes an issue).
- [x] This change is a new feature (non-breaking change which adds functionality).
- [ ] This change improves the code (refactor or other change that does not change the functionality)
- [ ] This change fixes warnings (the fix does not alter the functionality or the generated code)
- [ ] This change is a breaking change (fix or feature that will cause existing behavior to change).
- [ ] This change alters the query API (existing queries will have to be recompiled)
Checklist:
- [x] My code follows the code style of this project.
- [x] My code does not create any new warnings from compiler, build system, or lint.
- [x] The commit message is properly formatted and free of typos.
- [ ] The commit message title makes sense in a changelog, by itself.
- [ ] The commit is signed.
- [ ] My change requires a change to the documentation.
- [ ] I have updated the documentation accordingly, or...
- [ ] I have created a JIRA ticket to update the documentation.
- [ ] Any new interfaces or exported functions are appropriately commented.
- [x] I have read the CONTRIBUTORS document.
- [x] The change has been fully tested:
- [ ] I have added tests to cover my changes.
- [ ] All new and existing tests passed.
- [ ] I have checked that this change does not introduce memory leaks.
- [ ] I have used Valgrind or similar tools to check for potential issues.
- [ ] I have given due consideration to all of the following potential concerns:
- [ ] Scalability
- [ ] Performance
- [ ] Security
- [ ] Thread-safety
- [ ] Cloud-compatibility
- [ ] Premature optimization
- [ ] Existing deployed queries will not be broken
- [ ] This change fixes the problem, not just the symptom
- [ ] The target branch of this pull request is appropriate for such a change.
- [ ] There are no similar instances of the same problem that should be addressed
- [ ] I have addressed them here
- [ ] I have raised JIRA issues to address them separately
- [ ] This is a user interface / front-end modification
- [ ] I have tested my changes in multiple modern browsers
- [ ] The component(s) render as expected
Smoketest:
- [x] Send notifications about my Pull Request position in Smoketest queue.
- [ ] Test my draft Pull Request.
Testing:
Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-28757
Jirabot Action Result: Workflow Transition: Merge Pending Updated PR
@shamser - looks fine, but see comment re. offset_t's in CFileSizeTracker
Sorry, forgot to push the last batch of changes.
Re. naming, I'd go renaming the 3 stats to:
StSizePeakInterSubgraphDisk StSizePeakInternalTempDisk StSizePeakEphemeralDisk
@ghalliday ? And what do StSizeSpillFile and StSizeGraphSpill become? (I'm not quite sure what they currently represent).
You want them to be consistent with the other names. Some more suggestions
StSizeInterSubgraphSpill[Write?] StSizePeakInterSubgraphSpill StSizePeakInternalTemp - peak size StSizeInternalTemp - current size StSizeInternalTempWrite - total size StSizePeakEphemeralDiskUsage
It is picky, but I think it will help us to have some clear terminology for the different aspects.
StSizeInternalTemp
@ghalliday If I have understood correctly, shouldn't StSizeInternalTemp be zero at the end of each activity?
StSizeInternalTemp
@ghalliday If I have understood correctly, shouldn't StSizeInternalTemp be zero at the end of each activity?
Yes. I'm not sure it is useful - but as Jake says, it might be useful as the job is running. I suspect as long as you have the peak the active doesn't provide much value.
@shamser - please squash.