William Palin

Results 481 comments of William Palin

Focus on newlines and paragraphs: turn what was once unreadable, or at least difficult to parse, into a beauty. We can now clearly see (and hopefully parse out) the block text...

Furthermore, we should now be able to reprocess these and identify footnotes with relative confidence, based on the length of the line. They stand out as smaller font because...
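A rough sketch of the line-length idea, not the actual implementation: it assumes that lines set in a smaller font fit more characters into the same width, so a line noticeably longer than the page's median line is a footnote candidate. The function name `flag_footnote_candidates` and the `ratio` cutoff are hypothetical and would need tuning against real output.

```python
from statistics import median


def flag_footnote_candidates(lines, ratio=1.2):
    """Return indices of lines that look like footnote text.

    Assumption (not from the PR): a line counts as a candidate when its
    character count exceeds `ratio` times the median character count of
    the non-empty lines on the page.
    """
    lengths = [len(line.strip()) for line in lines]
    non_empty = [n for n in lengths if n > 0]
    if not non_empty:
        return []
    cutoff = ratio * median(non_empty)
    return [i for i, n in enumerate(lengths) if n > cutoff]


# Five body-text lines of 60 characters, then two denser 85-character lines.
page = ["a" * 60] * 5 + ["b" * 85] * 2
candidates = flag_footnote_candidates(page)
```

With this sample page, only the two longer lines at the end are flagged. Short lines (headings, paragraph endings) fall below the cutoff and are left alone.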

Lastly, we implemented square white boxes where necessary: when we think something is not an artifact but we can't get above 10% confidence for what it is. In this case...
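The placeholder rule can be sketched roughly as follows. This is an illustration under assumptions, not the code from the PR: `render_token`, its parameters, and the use of the Unicode white square (U+25A1) are hypothetical stand-ins, and confidence is assumed to be on a 0-100 scale.

```python
PLACEHOLDER = "\u25a1"  # white square; the actual glyph used is an assumption


def render_token(text, confidence, is_artifact, min_confidence=10.0):
    """Decide what to emit for one recognized token.

    - Artifacts (specks, scan noise) are dropped entirely.
    - Real content we cannot read with more than `min_confidence`
      percent confidence becomes a placeholder box.
    - Everything else is emitted as recognized.
    """
    if is_artifact:
        return ""
    if confidence <= min_confidence:
        return PLACEHOLDER
    return text


tokens = [
    ("Plaintiff", 96.0, False),  # confident read, kept
    ("x~j", 4.0, False),         # real mark, unreadable: box
    (".", 2.0, True),            # scan speck: dropped
]
rendered = [render_token(t, c, a) for t, c, a in tokens]
```

Keeping a box for unreadable but real content preserves the spacing of the line and signals to a reader that something was there, rather than silently dropping it.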

@mlissner here are some of the improvements for you.

One more push coming momentarily with the last few changes.

Lastly, I finished implementing smoothing of the case caption lines on the first page, which I think produces professional-looking OCR. It only works...

> Cool. I made a few comments, but none that I think is too crazy. My one remaining doubt is what the output looks like compared to the old output....

I heavily simplified the code and created a new PR for it, so I'm closing this PR.

I'm actively working on the tests here in courts-db, as I'm putting in some medium-sized changes and thinking about how it should work and could work better.

@bbernicker are you in our slack channel yet?