Shahrokh Daijavad
Hi, @1337stn. Thank you for your contribution. Now that you have added the JSON output option, can I ask you one (actually two!) favor(s) for the sake of the completeness...
@cpendus @agoyal26 I don't think #1045, which is about a tool to create a distribution of quality metrics, overlaps with this. In fact, #1045 can use the outcome of the...
Thanks, @cpendus. OK, now I have read the README carefully and I understand all the columns that are being added. Your notebook makes it even clearer! Question about the language...
@touma-I I cleaned up the notebook a little bit and tested it again after the clean-up. I am ready to approve if you want to move it from the Draft.
@swith005: I tested this using an internal version of the Numinamath data, which I added to the "INPUT_FOLDER". The external dataset (internal version is a subset) from which the internal version...
@ShiroYasha18 I just went through this issue and understood what this is about and how it can be fixed (using the steps by @dolfim-ibm). The reason @touma-I assigned it to...
Thanks, @ShiroYasha18. Sounds good. What is PR #1199 that you are referring to? You must mean a different PR.
Thanks for the clarification, @ShiroYasha18!
Hi, @ShiroYasha18. Sorry that I haven't responded so far. Now that the pdf2parquet => docling2parquet transition has completed, can you please experiment with the `do_ocr` parameter set to `true` on your...
Sorry, @ShiroYasha18. I had not seen PR #1235!