chore: clean up partition params

Open • Coniferish opened this issue 3 months ago • 1 comment

Best to review commit by commit.

This is the first PR to clean up the partition params. Changes in this PR:

  • Move non-partitioner modules to unstructured/partition/utils/
    • Note that json.py was not moved, but it could be if deemed appropriate.
  • Remove the chunking_strategy param from all partitioner signatures, since every partitioner accesses it through **kwargs.
  • Add * to the partitioner params, which makes the params after it keyword-only.
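
To illustrate the last two bullets, here is a minimal sketch of the pattern. It is not the actual unstructured code: partition_example, include_page_breaks, and the string "elements" are hypothetical stand-ins. It assumes the partitioner reads chunking_strategy out of **kwargs and that the bare * is there to force keyword-only arguments.

```python
from typing import Any, List, Optional


def partition_example(
    filename: Optional[str] = None,
    *,  # everything after the bare * must be passed by keyword
    include_page_breaks: bool = False,
    **kwargs: Any,
) -> List[str]:
    """Hypothetical partitioner used only to illustrate the signature style."""
    # chunking_strategy is no longer a named param; it arrives via **kwargs.
    chunking_strategy = kwargs.get("chunking_strategy")

    elements = [f"text from {filename}"]
    if include_page_breaks:
        elements.append("<page break>")
    if chunking_strategy is not None:
        # Stand-in for real chunking: tag each element with the strategy name.
        elements = [f"[{chunking_strategy}] {element}" for element in elements]
    return elements
```

With this signature, partition_example("a.txt", chunking_strategy="basic") works, while partition_example("a.txt", True) raises TypeError because include_page_breaks can no longer be passed positionally.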

Coniferish, May 07 '24 16:05

Is there something that needs to be fixed by this PR? What would it enable in the future if it is not fixing anything?

As it stands, imo this harms the readability of the codebase: 1/ the * pattern is just weird, and 2/ a reader of the source code of a partition_ function is less clued into the fact that this critical chunking_strategy param is available.
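
The second point can be made concrete with a small sketch. The two functions below are hypothetical, not taken from the library; they only contrast a signature that names chunking_strategy with one that leaves it in **kwargs, using the standard-library inspect module to list what a reader or an IDE would see.

```python
import inspect
from typing import Any, Callable, List, Optional


def partition_explicit(
    filename: Optional[str] = None,
    *,
    chunking_strategy: Optional[str] = None,
    **kwargs: Any,
) -> None:
    """Hypothetical partitioner that names chunking_strategy in its signature."""


def partition_implicit(filename: Optional[str] = None, **kwargs: Any) -> None:
    """Hypothetical partitioner that only sees chunking_strategy via **kwargs."""


def visible_params(func: Callable[..., Any]) -> List[str]:
    # Parameter names exactly as they appear in the function signature.
    return list(inspect.signature(func).parameters)
```

Here visible_params(partition_explicit) includes "chunking_strategy", while visible_params(partition_implicit) lists only "filename" and "kwargs", so the param has to be discovered from documentation instead.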

cragwolfe, May 07 '24 17:05