
Allow reading order detectors to support any class that has a bounding box/PdfRectangle

Open davebrokit opened this issue 1 year ago • 1 comments

Currently the interface IReadingOrderDetector relies on TextBlock as a parameter. This limits its use to the TextBlock class.

I propose adding an IBoundingBox interface

public interface IBoundingBox
{
    PdfRectangle BoundingBox { get; }
}

Then changing the IReadingOrderDetector interface and its implementing classes to use IBoundingBox as the parameter type.

Adding an overload that takes a Func<T, PdfRectangle> would allow the caller to specify any bounding box making the interface more useful.
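To make the proposal concrete, here is a rough sketch of what the generic interface and the Func<T, PdfRectangle> overload could look like. This is only an illustration under assumptions: PdfRectangle below is a reduced stand-in for the PdfPig type so the snippet compiles on its own, the method name Get and both signatures are guesses at the final shape, and TopDownReadingOrderDetector is a hypothetical, deliberately naive detector (sort top to bottom, then left to right), not one of the existing PdfPig detectors.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Reduced stand-in for PdfPig's PdfRectangle (assumption: only the edges matter here).
public readonly struct PdfRectangle
{
    public PdfRectangle(double left, double bottom, double right, double top)
    {
        Left = left; Bottom = bottom; Right = right; Top = top;
    }

    public double Left { get; }
    public double Bottom { get; }
    public double Right { get; }
    public double Top { get; }
}

public interface IBoundingBox
{
    PdfRectangle BoundingBox { get; }
}

// Hypothetical generic shape of the detector interface.
public interface IReadingOrderDetector
{
    // For types that implement IBoundingBox.
    IReadOnlyList<T> Get<T>(IReadOnlyList<T> blocks) where T : IBoundingBox;

    // For any type: the caller supplies the bounding box selector.
    IReadOnlyList<T> Get<T>(IReadOnlyList<T> blocks, Func<T, PdfRectangle> boundingBox);
}

// Hypothetical, naive detector used only to exercise the interface.
public sealed class TopDownReadingOrderDetector : IReadingOrderDetector
{
    public IReadOnlyList<T> Get<T>(IReadOnlyList<T> blocks) where T : IBoundingBox
        => Get(blocks, b => b.BoundingBox);

    public IReadOnlyList<T> Get<T>(IReadOnlyList<T> blocks, Func<T, PdfRectangle> boundingBox)
        => blocks
            .OrderByDescending(b => boundingBox(b).Top) // PDF y axis points up, so a larger Top is higher on the page
            .ThenBy(b => boundingBox(b).Left)
            .ToList();
}
```

With the selector overload, a caller could order elements that do not implement IBoundingBox at all, for example a list of tuples: `detector.Get(items, i => i.Box)`.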

Breaking changes: IReadingOrderDetector will instead return an IReadOnlyList<T> containing the ordered results. This means TextBlock.ReadingOrder would no longer be set, which is a breaking change. However, code can be added so that ReadingOrder is still set when T is TextBlock.
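A possible shape for that compatibility step is sketched below. It is an assumption, not existing code: the TextBlock class here is a reduced stand-in with only the reading-order member (it assumes a SetReadingOrder method is available), and ReadingOrderCompat.ApplyReadingOrder is a hypothetical helper name.

```csharp
using System.Collections.Generic;

// Reduced stand-in for TextBlock (assumption: only the reading-order member is needed here).
public class TextBlock
{
    public int ReadingOrder { get; private set; } = -1;

    public void SetReadingOrder(int readingOrder) => ReadingOrder = readingOrder;
}

public static class ReadingOrderCompat
{
    // Hypothetical helper: after a detector has ordered the elements,
    // write the index back onto any element that happens to be a TextBlock.
    public static IReadOnlyList<T> ApplyReadingOrder<T>(IReadOnlyList<T> ordered)
    {
        for (var i = 0; i < ordered.Count; i++)
        {
            if (ordered[i] is TextBlock block)
            {
                block.SetReadingOrder(i);
            }
        }

        return ordered;
    }
}
```

Elements of any other type pass through untouched, so existing callers that read TextBlock.ReadingOrder would keep working while new callers just use the returned list.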

Happy to make the changes

davebrokit avatar Jun 25 '24 11:06 davebrokit

@davebrokit I was thinking of doing similar, please go ahead and implement your idea.

I wrote a similar interface for my project: https://github.com/BobLd/Caly/blob/master/Caly.Pdf/Models/IPdfTextElement.cs. Feel free to reuse it or not.

I think the Letter class has a method instead of a property to get the bounding box. This might be a good opportunity to change that too (in my mind, the letters, text lines and text blocks should implement your interface, but please let me know what you think).

BobLd avatar Jun 25 '24 11:06 BobLd

Closing in line with #1095; will try to work out what the state of the PR is.

EliotJones avatar Jul 20 '25 01:07 EliotJones