unilm
Inference script for DiT text detection
I am using DiT for text detection and am having trouble finding a way to run inference on my own documents. Has anyone successfully written an inference script for this model on this task? Valentin
I am using it in my project; see BoxProcessorUlimDit.
Example:
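from marie.boxes import BoxProcessorUlimDit
from marie.boxes.box_processor import PSMode

# `image` (the document image) and `visualize_bboxes` (a drawing helper)
# are not defined in this snippet; both are assumed to exist already.
box = BoxProcessorUlimDit(
    models_dir="../../model_zoo/unilm/dit/text_detection",
    cuda=True,
)

(
    boxes,
    fragments,
    lines,
    _,
    lines_bboxes,
) = box.extract_bounding_boxes("gradio", "field", image, PSMode.SPARSE)

# Draw the detected boxes and the line-level boxes (both in xywh format).
bboxes_img = visualize_bboxes(image, boxes, format="xywh")
lines_img = visualize_bboxes(image, lines_bboxes, format="xywh")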
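
The snippet above calls visualize_bboxes without showing where it comes from. In case it helps, here is a rough stand-in for that drawing step. It is only a sketch: the names xywh_to_xyxy and draw_xywh_boxes are made up for illustration and are not part of marie or unilm, and it assumes each box is an (x, y, width, height) sequence in pixels and that the image is a Pillow image.

```python
from typing import Iterable, Sequence, Tuple


def xywh_to_xyxy(box: Sequence[float]) -> Tuple[float, float, float, float]:
    """Convert (x, y, width, height) to (x0, y0, x1, y1) corner coordinates."""
    x, y, w, h = box
    return (x, y, x + w, y + h)


def draw_xywh_boxes(image, boxes: Iterable[Sequence[float]], color="red", width=2):
    """Return a copy of a Pillow image with each xywh box outlined.

    Hypothetical helper, not the visualize_bboxes used in the reply above.
    """
    # Imported here so the conversion helper works without Pillow installed.
    from PIL import ImageDraw

    canvas = image.copy()  # leave the caller's image untouched
    draw = ImageDraw.Draw(canvas)
    for box in boxes:
        draw.rectangle(xywh_to_xyxy(box), outline=color, width=width)
    return canvas
```

With something like this, the last two lines of the example would become draw_xywh_boxes(image, boxes) and draw_xywh_boxes(image, lines_bboxes), again assuming the returned boxes really are plain xywh sequences.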