[Bug]: markdown property is string instead of DefaultMarkdownGenerator

Open BrunoCQ opened this issue 7 months ago • 0 comments

crawl4ai version

0.6.0

Expected Behavior

result.markdown should expose the fit_markdown property.

Current Behavior

I recently upgraded from 0.4.xx to 0.6.0 and noticed that markdown_v2 is deprecated and that markdown should be used instead. However, result.markdown is a string instead of a DefaultMarkdownGenerator object, which contained the fit_markdown property that I was using.
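
To help narrow this down, the value of result.markdown can be inspected with a small diagnostic like the one below. This is only a sketch: describe_markdown and get_fit_markdown are hypothetical helper names, not part of crawl4ai, and the code makes no assumption about what the library actually returns. It reports the runtime type and whether a fit_markdown attribute is reachable.

```python
from types import SimpleNamespace


def describe_markdown(markdown):
    """Summarize what kind of object a crawl result's markdown value is.

    Hypothetical helper for debugging; not part of crawl4ai.
    """
    return {
        "type": type(markdown).__name__,
        "is_str": isinstance(markdown, str),
        "has_fit_markdown": hasattr(markdown, "fit_markdown"),
    }


def get_fit_markdown(markdown):
    """Return fit_markdown when the object exposes a non-empty one, else None."""
    fit = getattr(markdown, "fit_markdown", None)
    return fit if isinstance(fit, str) and fit else None


if __name__ == "__main__":
    # Stand-ins that simulate the two cases; real code would pass result.markdown.
    plain_text = "# Title\n\nBody"
    rich_object = SimpleNamespace(raw_markdown="# Title\n\nBody", fit_markdown="Body")

    print(describe_markdown(plain_text))
    print(describe_markdown(rich_object))
    print(get_fit_markdown(plain_text), get_fit_markdown(rich_object))
```

Running describe_markdown(result.markdown) after a crawl and attaching the output to this issue would show whether the value is a bare string or an object that still carries fit_markdown.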

Is this reproducible?

Yes

Inputs Causing the Bug

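# Imports this snippet relies on
from crawl4ai import CacheMode, CrawlerRunConfig, GeolocationConfig
from crawl4ai.content_filter_strategy import PruningContentFilter
from crawl4ai.content_scraping_strategy import LXMLWebScrapingStrategy
from crawl4ai.markdown_generation_strategy import DefaultMarkdownGenerator

run_config = CrawlerRunConfig(
    cache_mode=CacheMode.BYPASS,  # always fetch fresh content
    # deep_crawl_strategy=BFSDeepCrawlStrategy(
    #     max_depth=2,
    #     include_external=False,
    # ),
    scraping_strategy=LXMLWebScrapingStrategy(),
    process_iframes=True,
    # Markdown generator with a pruning content filter attached
    markdown_generator=DefaultMarkdownGenerator(
        content_filter=PruningContentFilter(
            threshold=0.5, threshold_type="fixed"
        )
    ),
    remove_overlay_elements=True,
    verbose=True,
    # Browser locale, timezone, and geolocation settings
    locale="en-AU",
    timezone_id="Australia/Sydney",
    geolocation=GeolocationConfig(
        latitude=-33.867,
        longitude=151.200,
        accuracy=10.0,
    ),
)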

Steps to Reproduce


OS

Windows

Python version

3.12

Browser

No response

Browser version

No response

Error logs & Screenshots (if applicable)

No response

BrunoCQ, Apr 23 '25 21:04