
bug/<Only element from partition is "Please enable JS and disable any ad blocker">

Open StatsAI opened this issue 4 months ago • 0 comments

Describe the bug

The only element returned from partition is (unstructured.documents.html.HTMLTitle, 'Please enable JS and disable any ad blocker').

To Reproduce

!pip install "unstructured[all-docs]"

url = 'https://www.nytimes.com/2024/02/19/world/europe/navalny-letters-russia.html'

from unstructured.partition.auto import partition
elements = partition(url=url, strategy='hi_res', html_assemble_articles=True)

display(*[(type(element), element.text) for element in elements])
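To confirm the symptom programmatically instead of reading the output by eye, a small check along these lines may help. This is only a sketch: `is_block_page` and `BLOCK_PAGE_TEXT` are hypothetical names that are not part of unstructured, and the check assumes the placeholder string is exactly the one quoted in this report.

```python
# Hypothetical helper, not part of unstructured: flags the case described in
# this issue, where the only text returned is the "enable JS" placeholder.
BLOCK_PAGE_TEXT = "Please enable JS and disable any ad blocker"


def is_block_page(texts):
    """Return True when the only non-empty text is the placeholder message."""
    # Drop None and whitespace-only entries before counting.
    cleaned = [t.strip() for t in texts if t and t.strip()]
    # Case-insensitive match, in case the capitalization differs.
    return len(cleaned) == 1 and BLOCK_PAGE_TEXT.lower() in cleaned[0].lower()
```

With the reproduction above, `is_block_page([element.text for element in elements])` would be expected to return True if the article content was not retrieved.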

Expected behavior

Partition results (Title, Narrative Text, etc.) should be returned.

Environment Info

Google Colab

StatsAI • Feb 20 '24 06:02