
feat: add keep_id parameter to DocumentCleaner

Open CarlosFerLo opened this issue 9 months ago • 1 comments

Related Issues

  • fixes #7557

Proposed Changes:

The DocumentCleaner now has an optional parameter, keep_id, that preserves the original id of each input document when set.
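To illustrate the intended behaviour, here is a minimal self-contained sketch. It does not use Haystack itself: `Doc` and `clean_documents` are hypothetical stand-ins, and the sketch assumes that a document created without an explicit id gets one derived from a hash of its content, so cleaning the text would otherwise produce a different id.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass
class Doc:
    """Hypothetical stand-in for a document; not the Haystack Document class."""
    content: str
    id: str = ""

    def __post_init__(self):
        # Assumption: when no id is given, derive one from the content.
        if not self.id:
            self.id = hashlib.sha256(self.content.encode("utf-8")).hexdigest()


def clean_documents(docs: List[Doc], keep_id: bool = False) -> List[Doc]:
    """Collapse whitespace in each document, optionally keeping the original id."""
    cleaned = []
    for doc in docs:
        text = " ".join(doc.content.split())
        # With keep_id=True the original id is carried over;
        # otherwise a new id is derived from the cleaned content.
        cleaned.append(Doc(content=text, id=doc.id if keep_id else ""))
    return cleaned


original = Doc(content="Hello   world \n")
regenerated = clean_documents([original])[0]
preserved = clean_documents([original], keep_id=True)[0]
```

In this sketch `preserved.id` equals `original.id`, while `regenerated.id` differs because the cleaned content hashes to a new value. Keeping the id is useful when downstream components need to map cleaned documents back to their sources.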

How did you test it?

Added one extra unit test and edited the one that checks correct initialisation of the object.

Notes for the reviewer

Checklist

  • I have read the contributors guidelines and the code of conduct
  • I have updated the related issue with new insights and changes ✅
  • I added unit tests and updated the docstrings ✅
  • I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test:. ✅
  • I documented my code ✅
  • I ran pre-commit hooks and fixed any issue ✅

CarlosFerLo, Apr 29 '24 18:04

Pull Request Test Coverage Report for Build 9004490329

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage increased (+0.004%) to 90.41%

Files with Coverage Reduction | New Missed Lines | %
components/preprocessors/document_cleaner.py | 1 | 98.85%
Total: 1

Totals
Change from base Build 9004274756: 0.004%
Covered Lines: 6524
Relevant Lines: 7216

💛 - Coveralls

coveralls, Apr 29 '24 18:04

Sorry for the delay, @CarlosFerLo. I'm taking a look at this one now.

vblagoje, May 08 '24 12:05