amazon-textract-textractor

[Doc] Documentation of Linearizable and its methods, e.g. get_text(config)

Open oonisim opened this issue 1 year ago • 1 comments

The Document class has a get_text(config: TextLinearizationConfig) method, as used in cell 19 of the example notebook Using Layout Analysis for Text Linearization.

from textractor.data.text_linearization_config import TextLinearizationConfig

config = TextLinearizationConfig(
    hide_figure_layout=True,
    title_prefix="# ",
    section_header_prefix="## "
)
print(document.get_text(config=config))    # <--- get_text() method

However, it looks like the documentation only covers the get_text_and_words method. It does not cover get_text, which is a method of the parent class Linearizable(ABC).

It would be helpful to have a clear definition and explanation of what Linearizable is and which methods it provides, since it is used in the sample code. Right now the only way to verify this is to read through the source on GitHub.
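As a stopgap until the docs cover this, inherited methods can be listed by introspection instead of reading the source. The sketch below is only an illustration: the Linearizable and Document classes in it are simplified stand-ins written for this example (not the real textractor classes), and inherited_public_methods is a hypothetical helper name.

```python
import inspect
from abc import ABC, abstractmethod


def inherited_public_methods(cls):
    """Map each public callable on cls to the name of the class that defines it."""
    owners = {}
    for name, _member in inspect.getmembers(cls, callable):
        if name.startswith("_"):
            continue  # skip private and dunder members
        # Walk the method resolution order; the first class whose own
        # namespace contains the name is the one that defines it.
        for base in cls.__mro__:
            if name in vars(base):
                owners[name] = base.__name__
                break
    return owners


# Stand-in classes for illustration only (assumed shapes, NOT the real library code).
class Linearizable(ABC):
    def get_text(self, config=None):
        """Return only the text part of get_text_and_words()."""
        text, _words = self.get_text_and_words(config)
        return text

    @abstractmethod
    def get_text_and_words(self, config=None):
        """Return a (text, words) tuple."""


class Document(Linearizable):
    def get_text_and_words(self, config=None):
        return "hello", []


methods = inherited_public_methods(Document)
for name, owner in sorted(methods.items()):
    print(f"{name}: defined on {owner}")
```

With the real library, the same helper could presumably be pointed at the actual Document class, and help(Document.get_text) would print whatever docstring the method carries.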

oonisim, Mar 02 '24 07:03