amazon-textract-textractor
[Doc] Documentation of Linearizable and its methods, e.g. get_text(config)
The Document class has a get_text(config: TextLinearizationConfig) method, as used in cell 19 of the example "Using Layout Analysis for Text Linearization":
from textractor.data.text_linearization_config import TextLinearizationConfig
config = TextLinearizationConfig(
    hide_figure_layout=True,
    title_prefix="# ",
    section_header_prefix="## ",
)
print(document.get_text(config=config)) # <--- get_text() method
However, it looks like the documentation only covers the get_text_and_words method. It does not cover get_text, which is a method of the parent class, Linearizable(ABC).
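To illustrate the pattern I mean, here is a minimal sketch of an abstract base class that provides a concrete get_text on top of an abstract get_text_and_words. Everything in it (SketchConfig, SketchLinearizable, SketchDocument, defining_class) is a hypothetical stand-in written for this issue, not the library's actual code, and the real classes may well differ.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SketchConfig:
    """Hypothetical stand-in for TextLinearizationConfig (two fields only)."""
    title_prefix: str = ""
    section_header_prefix: str = ""


class SketchLinearizable(ABC):
    """Hypothetical stand-in for a Linearizable-style base class."""

    def get_text(self, config: Optional[SketchConfig] = None) -> str:
        # Concrete method: delegate to the abstract method, keep only the text.
        text, _ = self.get_text_and_words(config or SketchConfig())
        return text

    @abstractmethod
    def get_text_and_words(self, config: SketchConfig) -> Tuple[str, List[str]]:
        """Subclasses return the linearized text and its words."""


class SketchDocument(SketchLinearizable):
    """Hypothetical subclass that only implements get_text_and_words."""

    def __init__(self, title: str, header: str, body: str) -> None:
        self.title = title
        self.header = header
        self.body = body

    def get_text_and_words(self, config: SketchConfig) -> Tuple[str, List[str]]:
        lines = [
            config.title_prefix + self.title,
            config.section_header_prefix + self.header,
            self.body,
        ]
        text = "\n".join(lines)
        return text, text.split()


def defining_class(obj: object, method_name: str) -> Optional[str]:
    """Return the name of the first class in the MRO that defines method_name."""
    for cls in type(obj).__mro__:
        if method_name in vars(cls):
            return cls.__name__
    return None
```

With this layout, get_text is callable on the subclass but defined only on the base class, so documentation that lists only the subclass's own methods would omit it. A helper like defining_class (or Python's built-in help()) shows where an inherited method actually lives.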
It would be helpful to have a clear definition of what Linearizable is and which methods it provides, since it is used in the sample code. As it stands, readers have to go through the GitHub source to find out.