Vik Paruchuri
Note: this will not work 100% properly until this PyTorch bugfix PR is merged: https://github.com/pytorch/pytorch/pull/97110
@pacman100 Do you have any suggestions on torch versioning? There are some conditionals in the FSDP plugin that branch on it. In this case, my patch here won't work until...
@pacman100 The PyTorch PR has now been merged. Let me know how you want me to handle versioning, and I can finalize this.
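For illustration, a conditional that branches on the torch version could look roughly like the sketch below. This is not the FSDP plugin's actual code. The names `MIN_FIXED_RELEASE`, `release_tuple`, and `has_upstream_fix` are hypothetical, and the threshold is a placeholder, since the thread does not say which release ships the fix.

```python
import re

# Hypothetical placeholder: the thread does not state which PyTorch
# release contains the merged bugfix, so adjust this before relying on it.
MIN_FIXED_RELEASE = (2, 1, 0)


def release_tuple(version):
    """Return the leading numeric release of a version string as ints.

    Suffixes such as ".dev20230320" or "+cu117" are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def has_upstream_fix(version, minimum=MIN_FIXED_RELEASE):
    """True if the numeric release is at least the assumed minimum."""
    return release_tuple(version) >= minimum
```

In practice the argument would be `torch.__version__`. One caveat of this sketch: dev and nightly builds compare equal to their final release, so a nightly built before the fix landed would still pass the check.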
That's fine by me!
Tables are not always recognized 100% properly. Table recognition happens in `segmentation.py`, and reformatting happens in `cleaners/table.py`. I'm working on improving it.
This should be fixed in the dev branch. Need to test before merging to master.
Should be fixed in https://github.com/VikParuchuri/marker/pull/116
Some OCR engines annoyingly put spaces between characters, probably because of their expected character spacing heuristics. I suspect that is what is happening here. I'm going to try to...
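As a rough illustration of one possible post-processing heuristic (not necessarily what marker does), runs of single characters separated by single spaces can be joined back together. The function name `collapse_spaced_letters` and the `min_run` parameter are hypothetical.

```python
import re


def collapse_spaced_letters(text, min_run=4):
    """Join runs of at least `min_run` single characters split by single spaces.

    Example of the targeted artifact: "T h i s" instead of "This".
    """
    # (?<!\S) and (?!\S) require whitespace or a string edge around the run,
    # so ordinary multi-character words are never matched.
    pattern = re.compile(
        r"(?<!\S)(?:\S ){" + str(min_run - 1) + r",}\S(?!\S)"
    )
    return pattern.sub(lambda m: m.group(0).replace(" ", ""), text)
```

This only works when the gap between words differs from the gap between characters (for example, two spaces versus one). If every gap is a single space, adjacent spaced-out words are merged into one, so a real fix would likely need character position data from the OCR engine.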
Thanks for the example. It looks like the equations aren't recognized properly. I'll test this more and see if I can fix it.
Sure, those will help me improve it.