Jeremy Singer-Vine

105 comments by Jeremy Singer-Vine

Hi @pseudomonas, and thanks for the intriguing suggestion. Do you have any interest in developing a PR for this feature? If so, I'd be happy to discuss a general strategy...

Thanks, @pseudomonas! Given the niche-ness of this feature, I'm reluctant to add another *required* dependency, but I could see adding an optional dependency for this — something like: ```python def...

@afriedman412 I'm not aware of anyone actively working on this, thanks for checking — and thanks for offering! Would be wonderful if you took a crack at it.

@afriedman412 How about something like this?: [issue-987-test.pdf](https://github.com/jsvine/pdfplumber/files/13170962/issue-987-test.pdf)

```python
import pdfplumber

pdf = pdfplumber.open("issue-987-test.pdf")
page = pdf.pages[0]

for x in [0, 3, 10]:
    print(f"--- x_tolerance = {x} ---")
    print(page.extract_text(x_tolerance=x))
```

...

Ah, my apologies for not being more explicit. Ideally, the proportional tolerance feature would make it possible to get this back: ``` Big Text Small Text ``` The examples above...

Are you looking at `char` objects, or something else (which I might infer from the variable being named `text`)? If `char` objects:

- `char["height"]` already gets you that calculation (without...

Thanks, @afriedman412! I'll address your specific questions below, but first this seems like a good opportunity for me to sketch out a bit more about how I see this working:...

> my big question is do we want to make calculating tolerances dynamic?
> like right now my approach is basically just using the size of first character available to...

I think that's a reasonable (and smartly constrained) place to start. I think `y_tolerance_ratio` would still be helpful, but `x_tolerance_ratio` would certainly still be useful on its own.
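
To make the proportional-tolerance idea concrete, here is a minimal, self-contained sketch. It is not pdfplumber's implementation: `group_chars` and `layout` are hypothetical helpers, the character dicts only mimic the `text`, `x0`, `x1`, and `size` keys of pdfplumber `char` objects, and the glyph widths and gaps are invented for illustration. The assumption is that an `x_tolerance_ratio` would scale the allowed gap by the size of the preceding character.

```python
def group_chars(chars, x_tolerance=3, x_tolerance_ratio=None):
    """Join chars into words, splitting wherever the horizontal gap is too wide.

    With x_tolerance_ratio set, the allowed gap is ratio * size of the
    previous character; otherwise the fixed x_tolerance applies.
    """
    words, current, prev = [], "", None
    for char in sorted(chars, key=lambda c: c["x0"]):
        if prev is not None:
            if x_tolerance_ratio is None:
                tolerance = x_tolerance
            else:
                tolerance = x_tolerance_ratio * prev["size"]
            if char["x0"] - prev["x1"] > tolerance:
                words.append(current)
                current = ""
        current += char["text"]
        prev = char
    if current:
        words.append(current)
    return words


def layout(text, size):
    """Hypothetical helper: fake char dicts for `text` set at font `size`.

    Assumed geometry: glyph width 0.5 * size, letter gap 0.1 * size,
    and an extra 0.4 * size of space between words.
    """
    chars, x = [], 0.0
    for ch in text:
        if ch == " ":
            x += 0.4 * size  # spaces widen the gap but emit no char
            continue
        chars.append({"text": ch, "x0": x, "x1": x + 0.5 * size, "size": size})
        x += 0.6 * size  # glyph width plus letter gap
    return chars


big = layout("Big Text", size=40)
small = layout("Small Text", size=8)

for label, kwargs in [
    ("x_tolerance=3", {"x_tolerance": 3}),
    ("x_tolerance=5", {"x_tolerance": 5}),
    ("x_tolerance_ratio=0.15", {"x_tolerance_ratio": 0.15}),
]:
    print(label, group_chars(big, **kwargs), group_chars(small, **kwargs))
```

Under these assumed metrics, a fixed tolerance of 3 breaks the large text into single letters, a fixed tolerance of 5 merges the two small words, and only the ratio-based tolerance recovers both lines as two words each.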

I don't think `waybackpack` currently supports this, but would be open to a PR that adds it. One tricky bit might be defining a criterion for "successful", particularly if the...