WangJiaxin-x

Results 2 issues of WangJiaxin-x

Hi, when I use `partition_type(file=io.BytesIO(file.file.read()), languages=["chi_sim"])` to parse Chinese PDF documents, I found that the paragraph text is split so that each line becomes a separate element. And another problem...

bug
needs follow up

Hi, in your code `window_step`=3 and `scale_factor`=3. I don't think that's a coincidence; I think these values must be equal. Otherwise `lower_bound` and `upper_bound` of 3 small chunks within...