Andy Friedman
Thanks for all this. My big question is: do we want to make calculating tolerances dynamic? Right now my approach is basically just using the size of the first character...
Are we sure we need `y_tolerance_ratio`? Off the top of my head, it feels like line spacing is much less dependent on font size... I'm going to implement `x_tolerance` first and we can go...
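For reference, here's a rough sketch of what "dynamic" could mean for the horizontal case: scale the tolerance by the size of the first character instead of using a fixed value. Everything here is an assumption for illustration, not the actual implementation: the helper names, the `x_tolerance_ratio` parameter, the fallback default, and the char dicts with `text`, `x0`, `x1`, and `size` keys are all hypothetical.

```python
def x_tolerance_from_first_char(chars, x_tolerance_ratio, default=3.0):
    """Scale the horizontal tolerance by the first character's font size.

    Hypothetical helper: falls back to a fixed default when there are
    no characters to measure.
    """
    if not chars:
        return default
    return x_tolerance_ratio * chars[0]["size"]


def split_words(chars, x_tolerance_ratio):
    """Group left-to-right chars into words, splitting on gaps wider than the tolerance."""
    tolerance = x_tolerance_from_first_char(chars, x_tolerance_ratio)
    words, current = [], []
    for char in chars:
        # Start a new word when the gap to the previous char exceeds the tolerance.
        if current and char["x0"] - current[-1]["x1"] > tolerance:
            words.append("".join(c["text"] for c in current))
            current = []
        current.append(char)
    if current:
        words.append("".join(c["text"] for c in current))
    return words


def make_chars(size):
    """Three sample chars with gaps of 0.5 and 3.0 between them (made-up data)."""
    return [
        {"text": "a", "x0": 0.0, "x1": 5.0, "size": size},
        {"text": "b", "x0": 5.5, "x1": 10.0, "size": size},
        {"text": "c", "x0": 13.0, "x1": 18.0, "size": size},
    ]


small_font_words = split_words(make_chars(10.0), x_tolerance_ratio=0.25)
large_font_words = split_words(make_chars(40.0), x_tolerance_ratio=0.25)
```

With the same gaps, the 3.0 gap splits a word at size 10 (tolerance 2.5) but not at size 40 (tolerance 10.0), which is the behavior a font-size-relative tolerance is meant to give. Using only the first character is the simplification mentioned above; it would misjudge lines that mix font sizes.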
I'm working on it.
Can you provide the PDF?
Took a look at this at the 2024 Hackathon. There are ~35M rows in the data, going back to 2010. For the sake of keeping the data as fresh as...