
fix: tests to run under pytest

Open stevenh opened this issue 7 months ago • 12 comments
Fix tests to run fully under pytest. This includes:

  • Fixing filenames: removing dots and correcting typos
  • Removing init methods, which are not supported by pytest
  • Implementing parametrisation so tests can be run individually
  • Adding timeouts so tests can't run forever
  • Replacing print and logging with assertions to prevent false successes
  • Removing unused imports and adding missing ones
  • Marking tests with @pytest.mark.asyncio where appropriate
  • Using HTTP constants to avoid magic numbers
  • Adding type hints to improve linting and identify issues
  • Using a local server for API tests to improve debugging and eliminate the docker dependency
  • Calling pytest in main to allow running tests from the command line
  • Skipping broken tests
  • Fixing out-of-date logic and invalid method parameters
  • Re-enabling disabled and commented-out tests after fixing them
  • Adding missing test data
  • Updating tests that depend on altered external CSS or HTML structure
  • Automatically skipping tests if the API key is not set
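
Several of the items above can be sketched in one small test module. This is a minimal illustration rather than code from the PR: `is_success`, the test names, and the `OPENAI_API_KEY` variable name are assumptions made for the example, and the standard library's `http.HTTPStatus` stands in for whichever HTTP constants the real tests use.

```python
import asyncio
import os
from http import HTTPStatus

import pytest


def is_success(status: int) -> bool:
    """Hypothetical helper under test: True for 2xx status codes."""
    return HTTPStatus.OK <= status < HTTPStatus.MULTIPLE_CHOICES


# Parametrisation: each case is collected as its own test, so a single
# case can be selected from the command line or an IDE.
@pytest.mark.parametrize(
    ("status", "expected"),
    [
        (HTTPStatus.OK, True),
        (HTTPStatus.NO_CONTENT, True),
        (HTTPStatus.NOT_FOUND, False),
    ],
)
def test_is_success(status: int, expected: bool) -> None:
    # An assertion fails the test; a print would let a wrong result pass.
    assert is_success(status) is expected


# Async tests need an explicit marker when pytest-asyncio runs in strict mode.
@pytest.mark.asyncio
async def test_async_fetch() -> None:
    await asyncio.sleep(0)  # stand-in for an awaited crawl
    assert is_success(HTTPStatus.OK)


# Skip automatically when the (assumed) API key variable is not set.
@pytest.mark.skipif(
    not os.environ.get("OPENAI_API_KEY"),
    reason="OPENAI_API_KEY is not set",
)
def test_needs_api_key() -> None:
    assert os.environ["OPENAI_API_KEY"]
```

With a layout like this, `pytest -k test_is_success` runs only the parametrised cases, and the key-dependent test is reported as skipped rather than failed when the variable is missing.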

Debugging a test takes time, so if you need to debug one you will need to comment out the default timeout in pyproject.toml under [tool.pytest.ini_options].
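
A possible alternative to editing pyproject.toml, assuming the pytest-timeout plugin is the one enforcing the limit: that plugin also accepts a per-test marker, and its documentation describes a timeout of 0 as disabling the limit. The test names and bodies below are hypothetical.

```python
import time

import pytest


# Raise the limit for one slow test instead of changing the project default.
@pytest.mark.timeout(120)
def test_slow_crawl() -> None:
    start = time.monotonic()
    time.sleep(0.01)  # stand-in for a long-running crawl
    assert time.monotonic() - start < 120


# A timeout of 0 disables the limit for this test only, which is convenient
# while stepping through it in a debugger.
@pytest.mark.timeout(0)
def test_under_debugger() -> None:
    result = 1 + 1  # place a breakpoint here
    assert result == 2
```

The marker keeps the override next to the test being debugged, so the project-wide default stays in force for everything else.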

Checklist:

  • [x] My code follows the style guidelines of this project
  • [x] I have performed a self-review of my own code
  • [x] I have commented my code, particularly in hard-to-understand areas
  • [x] I have made corresponding changes to the documentation
  • [x] I have added/updated unit tests that prove my fix is effective or that my feature works
  • [x] New and existing unit tests pass locally with my changes

stevenh avatar Mar 26 '25 10:03 stevenh

Here's what the tests look like now in VS Code: [image] [image]

stevenh avatar Mar 27 '25 00:03 stevenh

@stevenh Thanks for this PR. This is really going to be a life saver when we do new releases. I've requested @unclecode for a quick review. We'll keep you posted.

aravindkarnam avatar Mar 28 '25 08:03 aravindkarnam

Thanks @aravindkarnam. I've tried to minimise the differences over the past few days and it's now in a pretty good place.

As per this issue https://github.com/unclecode/crawl4ai/issues/893 I was going to break it out into separate PRs, which will take quite a bit of time. If you're happy to review this as a single item, that would avoid a lot of overhead, as well as the challenge that the tests won't pass until all the fixes are merged.

I can obviously update this PR with the details of all the fixes, which would be much quicker.

Let me know which approach you would like me to take.

In the meantime I'll continue to refine. The main thing left is ensuring that all tests use assertions, as some were just printing.

stevenh avatar Mar 28 '25 10:03 stevenh

Oh, as a follow-on, it would be great to have the tests trigger in CI; happy to look at adding that too. The obvious challenge is the resources needed, so some tests might not be possible, but we could have them skipped automatically if GitHub Actions is detected.
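
One way the automatic skip could look, as a rough sketch: GitHub Actions sets the `GITHUB_ACTIONS` environment variable to `true` in workflow runs, so a reusable marker can key off it. The marker name `skip_in_ci` and the test itself are hypothetical.

```python
import os

import pytest

# GitHub Actions sets GITHUB_ACTIONS=true in every workflow run.
IN_GITHUB_ACTIONS = os.environ.get("GITHUB_ACTIONS") == "true"

# Reusable marker (hypothetical name) for tests assumed to need more
# resources than a hosted runner provides.
skip_in_ci = pytest.mark.skipif(
    IN_GITHUB_ACTIONS,
    reason="too resource intensive for GitHub Actions",
)


@skip_in_ci
def test_many_parallel_pages() -> None:
    pages = [f"page-{i}" for i in range(100)]  # stand-in for real browser work
    assert len(pages) == 100
```

Locally the variable is normally unset, so the test runs as usual; in CI it shows up as skipped with the stated reason.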

stevenh avatar Mar 28 '25 10:03 stevenh

These changes are now in a reviewable state, with all tests passing locally.

Let me know if you want the separate PR breakdown.

Here's an example of the output from the CLI:


========================================= test session starts ==========================================
platform darwin -- Python 3.12.9, pytest-8.3.5, pluggy-1.5.0
rootdir: /Users/steve/code/github.com/unclecode/crawl4ai
configfile: pyproject.toml
plugins: cov-6.0.0, anyio-4.9.0, asyncio-0.25.3, timeout-2.3.1, pytest_httpserver-1.1.2
asyncio: mode=Mode.STRICT, asyncio_default_fixture_loop_scope=function
timeout: 20.0s
timeout method: signal
timeout func_only: True
collected 509 items

tests/20241401/test_advanced_deep_crawl.py .                                                      [  0%]
tests/20241401/test_async_crawl_with_http_crawler_strategy.py ...                                 [  0%]
tests/20241401/test_async_crawler_strategy.py ...................                                 [  4%]
tests/20241401/test_async_markdown_generator.py ................                                  [  7%]
tests/20241401/test_async_webcrawler.py ..........                                                [  9%]
tests/20241401/test_cache_context.py .                                                            [  9%]
tests/20241401/test_deep_crawl.py ..                                                              [ 10%]
tests/20241401/test_deep_crawl_filters.py ....................................................... [ 21%]
tests/20241401/test_deep_crawl_scorers.py ........................                                [ 25%]
tests/20241401/test_http_crawler_strategy.py ...........                                          [ 27%]
tests/20241401/test_llm_filter.py .                                                               [ 28%]
tests/20241401/test_robot.py ........                                                             [ 29%]
tests/20241401/test_robot_parser.py .                                                             [ 29%]
tests/20241401/test_schema_builder.py ....                                                        [ 30%]
tests/20241401/test_stream.py .                                                                   [ 30%]
tests/20241401/test_stream_dispatch.py .                                                          [ 31%]
tests/async/test_0_4_2_browser_manager.py ......                                                  [ 32%]
tests/async/test_0_4_2_config_params.py .........                                                 [ 33%]
tests/async/test_async_downloader.py .........                                                    [ 35%]
tests/async/test_basic_crawling.py .....                                                          [ 36%]
tests/async/test_caching.py ....                                                                  [ 37%]
tests/async/test_chunking_and_extraction_strategies.py ....                                       [ 38%]
tests/async/test_content_extraction.py ......                                                     [ 39%]
tests/async/test_content_filter_bm25.py ...............                                           [ 42%]
tests/async/test_content_filter_prune.py ............                                             [ 44%]
tests/async/test_content_scraper_strategy.py ..................                                   [ 48%]
tests/async/test_crawler_strategy.py .....                                                        [ 49%]
tests/async/test_database_operations.py .....                                                     [ 50%]
tests/async/test_dispatchers.py ..........                                                        [ 52%]
tests/async/test_edge_cases.py .......                                                            [ 53%]
tests/async/test_error_handling.py ..s.ss                                                         [ 54%]
tests/async/test_evaluation_scraping_methods_performance_configs.py ......................        [ 59%]
tests/async/test_markdown_genertor.py ......                                                      [ 60%]
tests/async/test_parameters_and_options.py s......                                                [ 61%]
tests/async/test_performance.py ..s                                                               [ 62%]
tests/async/test_screenshot.py .....                                                              [ 63%]
tests/cli/test_cli.py ............                                                                [ 65%]
tests/docker/test_config_object.py .                                                              [ 65%]
tests/docker/test_core.py ........                                                                [ 67%]
tests/docker/test_crawl_task.py ssssssss                                                          [ 68%]
tests/docker/test_docker.py ......                                                                [ 70%]
tests/docker/test_dockerclient.py ..                                                              [ 70%]
tests/docker/test_serialization.py ...                                                            [ 71%]
tests/docker/test_server.py ................................ssssssss....                          [ 79%]
tests/docker/test_server_token.py ......................................ssss...                   [ 88%]
tests/hub/test_simple.py .s                                                                       [ 88%]
tests/legacy/test_cli_docs.py .                                                                   [ 89%]
tests/loggers/test_logger.py .                                                                    [ 89%]
tests/test_crawl_result_container.py ..........................................                   [ 97%]
tests/test_llmtxt.py .                                                                            [ 97%]
tests/test_scraping_strategy.py .                                                                 [ 98%]
tests/test_web_crawler.py ..s.......                                                              [100%]

======================== 482 passed, 27 skipped, 1 warning in 818.78s (0:13:38) ========================

stevenh avatar Mar 31 '25 21:03 stevenh

I've re-reviewed all the changes and identified 50 potential individual PRs. Most are relatively simple bug fixes, with a few that stand out as a bit larger in scope / impact:

fix: config serialisation

Fix config serialisation by creating a new Serialisable type and adding missing module imports for ScoringStats and Logger.

This allows the config to be serialised and deserialised correctly.

Add missing initialisation for ScoringStats.

Add missing stats parameter to URLScorer and all its subclasses to ensure that the stats are serialisable.

fix: download handling

Fix the handling of file downloads in AsyncPlaywrightCrawlerStrategy, which wasn't waiting for the download to complete before returning, resulting in race conditions and incomplete or missing downloads.
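
The kind of race described here can be illustrated with plain asyncio. This is not Playwright code and not the actual fix: `fake_download` and both crawl functions are hypothetical stand-ins that only show why returning before the download task finishes yields missing files.

```python
import asyncio
from typing import List


async def fake_download(store: List[str], name: str) -> None:
    """Stand-in for a browser download that takes time to complete."""
    await asyncio.sleep(0.01)
    store.append(name)


async def crawl_without_waiting(names: List[str]) -> List[str]:
    store: List[str] = []
    tasks = [asyncio.create_task(fake_download(store, n)) for n in names]
    snapshot = list(store)  # result captured before any download finished
    await asyncio.gather(*tasks)  # only so the demo leaves no pending tasks
    return snapshot


async def crawl_and_wait(names: List[str]) -> List[str]:
    store: List[str] = []
    tasks = [asyncio.create_task(fake_download(store, n)) for n in names]
    await asyncio.gather(*tasks)  # wait for completion before reporting
    return list(store)


missing = asyncio.run(crawl_without_waiting(["a.pdf", "b.pdf"]))
complete = asyncio.run(crawl_and_wait(["a.pdf", "b.pdf"]))
print(missing, sorted(complete))
```

The first call reports no files even though both downloads were started, which is the symptom a test with real assertions would catch.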

fix: markdown caching

Fix the caching of the markdown field in DB / files, which was only storing a single value, causing failures when using cached results.

Export the markdown field in StringCompatibleMarkdown, so we don't need to use a private field to ensure that the value is serialised correctly.

fix: crawl result handling

Fix the handling of crawl results, which were using inconsistent types. This now uses CrawlResultContainer for all crawl results, unwrapping as needed when performing deep crawls.

This moves CrawlResultContainer into models ensuring it can be imported where needed, avoiding circular imports.

Refactor CrawlResultContainer to subclass CrawlResult to provide type hinting in the single result case and ensure consistent handling of both synchronous and asynchronous results.
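
To make the idea concrete, here is a deliberately simplified sketch of a container that subclasses its result type. The field names and the behaviour shown are assumptions for illustration only; the real crawl4ai classes have many more fields and may be structured differently.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class CrawlResult:
    # Hypothetical, reduced set of fields.
    url: str
    success: bool = True


class CrawlResultContainer(CrawlResult):
    """Behaves like its first result and like a sequence of results."""

    def __init__(self, results: List[CrawlResult]) -> None:
        if not results:
            raise ValueError("at least one result is required")
        first = results[0]
        # Initialising the base class gives type checkers real attributes
        # for the single-result case.
        super().__init__(url=first.url, success=first.success)
        self._results = list(results)

    def __iter__(self) -> Iterator[CrawlResult]:
        return iter(self._results)

    def __len__(self) -> int:
        return len(self._results)

    def __getitem__(self, index: int) -> CrawlResult:
        return self._results[index]


single = CrawlResultContainer([CrawlResult("https://example.com")])
deep = CrawlResultContainer(
    [CrawlResult("https://example.com"), CrawlResult("https://example.com/a")]
)
print(single.url, len(deep), [r.url for r in deep])
```

Callers that expect one result can read `single.url` directly, while deep-crawl callers iterate or index, so both paths share one return type.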

fix: BM25Okapi idf calculation

Fix the idf calculation in BM25Okapi to use the correct formula. This prevents missing results when using BM25Okapi caused by zero idf values.
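
The description does not say which formula the fix adopts, so the following is background only. The classic Okapi idf becomes zero (or negative) for terms that appear in half or more of the documents, while a commonly used variant adds 1 inside the logarithm to keep it positive. Both function names are hypothetical.

```python
import math


def classic_idf(num_docs: int, doc_freq: int) -> float:
    """Classic Okapi idf; zero or negative for very common terms."""
    return math.log((num_docs - doc_freq + 0.5) / (doc_freq + 0.5))


def smoothed_idf(num_docs: int, doc_freq: int) -> float:
    """Variant with +1 inside the log, which stays positive."""
    return math.log(1.0 + (num_docs - doc_freq + 0.5) / (doc_freq + 0.5))


# A term that appears in exactly half of a 10-document corpus.
print(classic_idf(10, 5))   # 0.0, so the term contributes nothing to the score
print(smoothed_idf(10, 5))  # log(2), roughly 0.693
```

A zero idf multiplies the whole term score to zero, which is how a relevant chunk can drop out of the filtered results.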

Removed commented out code to improve readability.

fix: links, media and metadata caching

Fix the storage of links, media and metadata to ensure that the correct values are stored and returned. This prevents incorrect results when using the cached results.

Use Field for default values in Media, Links and ScrapingResult pydantic models to prevent invalid results.

fix: test suite

Fix the test suite to ensure that all tests are run and that validation is performed correctly using asserts.

Parameterise tests so that individual tests can be run from either the CLI or an IDE.

Standardise the main wrapper to allow calling test files directly using python, including passing pytest flags.
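
A main wrapper of this kind might look roughly like the sketch below. The helper names are hypothetical and the PR's actual wrapper may differ; `pytest.main` is the documented entry point for invoking pytest from Python.

```python
import sys
from typing import List


def build_pytest_args(test_file: str, cli_args: List[str]) -> List[str]:
    """Run only this file, forwarding any extra flags from the command line."""
    return [test_file, *cli_args]


def run_as_script(test_file: str, cli_args: List[str]) -> int:
    """Invoke pytest programmatically and return its exit code."""
    import pytest  # imported lazily so the helper above has no dependencies

    return int(pytest.main(build_pytest_args(test_file, cli_args)))


# At the bottom of a test module this would be wired up as:
#
#     if __name__ == "__main__":
#         sys.exit(run_as_script(__file__, sys.argv[1:]))
```

With that guard in place, something like `python tests/test_example.py -k download -x` would run just that file with the given pytest flags (the path is illustrative).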

Use a local server where applicable to ensure test validation and avoid external dependencies, such as docker, which improves test speed and the ability to debug issues.
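
As an illustration of the local-server approach, the sketch below uses only the standard library rather than the pytest_httpserver plugin that appears in the plugin list of the test runs. The handler, helper, and test names are hypothetical.

```python
import threading
import urllib.request
from http import HTTPStatus
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        body = b"<html><body><h1>hello</h1></body></html>"
        self.send_response(HTTPStatus.OK)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args: object) -> None:
        pass  # keep test output quiet


def fetch_from_local_server() -> Tuple[int, bytes]:
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0 = any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/"
        # Bypass any proxy settings so the request stays on this machine.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(url, timeout=5) as response:
            return response.status, response.read()
    finally:
        server.shutdown()
        server.server_close()


def test_local_page() -> None:
    status, body = fetch_from_local_server()
    assert status == HTTPStatus.OK
    assert b"hello" in body
```

Because the page content is controlled by the test, the assertions cannot be broken by changes to an external site, and no container needs to be running.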

Add type hints to improve linting validation and IDE support.

Re-enable tests which were previously disabled due to failures, which have now been fixed.

Use constants from httpx.codes for status codes to avoid magic numbers and improve comprehension.

Limit long running tests to avoid excessive run times.

All tests are now runnable using pytest.


The question is, should I split or not?

Happy to do that if it will help get all the fixes in, but obviously raising 50 individual PRs is a decent undertaking, so it would be great to confirm first.

stevenh avatar Apr 01 '25 12:04 stevenh

@stevenh Wow ** 10! Amazing, really appreciate such collaboration. Tbh this kind of support means a lot for any open source library. We really need help with testing and with getting closer to a stable release.

Regarding splits, I suggest we split them into 3 PRs:

  1. Core & Domain Fixes

    • Group fixes for BM25 idf calculation, config serialization, crawl result handling, and any core model refactoring (like the CrawlResultContainer update).
    • Tests to focus on here would be those in tests/async/test_content_filter_bm25.py and tests/test_crawl_result_container.py.
  2. Caching & Download Improvements

    • Separate out fixes for markdown caching and handling of downloads in AsyncPlaywrightCrawlerStrategy, plus links/media caching improvements.
    • Related tests might include tests/20241401/test_async_markdown_generator.py and tests/20241401/test_http_crawler_strategy.py.
  3. Test Suite Enhancements & CI Setup

    • Isolate changes that modernize and standardize the test suite, such as converting prints to assertions, parameterizing tests, and ensuring local server usage.
    • This can also cover Docker-specific tests (from tests/docker/*) and CLI improvements in tests/cli/test_cli.py.

Splitting this way gives meaningful, cohesive sets of changes, easing review and rollback if needed, while covering distinct areas that have minimal interdependencies.

I think this level of specialisation is good enough; more than this is unnecessary.

Again, thanks a million, I appreciate it. We are building a small group of collaborators to be part of the main circle I am trying to build for Crawl4ai, and you are welcome to join. Let me know if this is something you are interested in, then we will talk more about it. Anyway, happy to have members like you in our community.

unclecode avatar Apr 08 '25 15:04 unclecode

Thanks @unclecode, I'll look to get the breakdown done soon, as there are already some conflicts creeping in.

Would be happy to join the team, if you'll have me, so let's sync on that when you have some time.

stevenh avatar Apr 08 '25 15:04 stevenh

The first of the three PRs is up: https://github.com/unclecode/crawl4ai/pull/969.

I ended up expanding the scope to include a few other dependent fixes, in particular the one to the browser manager, which was causing test failures due to use of a previously closed browser; see the comments on that PR.

stevenh avatar Apr 10 '25 14:04 stevenh

All PRs are now ready @unclecode. They are based on each other, so they need to be reviewed and merged in order.

  1. https://github.com/unclecode/crawl4ai/pull/969
  2. https://github.com/unclecode/crawl4ai/pull/970
  3. https://github.com/unclecode/crawl4ai/pull/891

I've left the second two in draft in case changes are needed on the first.

The second two are individual commits, so if you want to get a sense for them you can just look at the last commit on each.

stevenh avatar Apr 11 '25 15:04 stevenh

Latest set of results:

platform darwin -- Python 3.12.9, pytest-8.3.5, pluggy-1.5.0
rootdir: /Users/steve/code/github.com/unclecode/crawl4ai
configfile: pyproject.toml
plugins: cov-6.0.0, anyio-4.9.0, asyncio-0.25.3, timeout-2.3.1, pytest_httpserver-1.1.2, aiohttp-1.1.0
asyncio: mode=Mode.STRICT, asyncio_default_fixture_loop_scope=function
collected 552 items

tests/20241401/test_advanced_deep_crawl.py .                             [  0%]
tests/20241401/test_async_crawl_with_http_crawler_strategy.py ...        [  0%]
tests/20241401/test_async_crawler_strategy.py ...................        [  4%]
tests/20241401/test_async_markdown_generator.py ................         [  7%]
tests/20241401/test_async_webcrawler.py ..........                       [  8%]
tests/20241401/test_cache_context.py .                                   [  9%]
tests/20241401/test_deep_crawl.py ..                                     [  9%]
tests/20241401/test_deep_crawl_filters.py .............................. [ 14%]
.........................                                                [ 19%]
tests/20241401/test_deep_crawl_scorers.py .......................        [ 23%]
tests/20241401/test_http_crawler_strategy.py ...........                 [ 25%]
tests/20241401/test_llm_filter.py .                                      [ 25%]
tests/20241401/test_robot.py ........                                    [ 27%]
tests/20241401/test_robot_parser.py .                                    [ 27%]
tests/20241401/test_schema_builder.py .s.s                               [ 28%]
tests/20241401/test_stream.py .                                          [ 28%]
tests/20241401/test_stream_dispatch.py .                                 [ 28%]
tests/async/test_0_4_2_browser_manager.py ......                         [ 29%]
tests/async/test_0_4_2_config_params.py .........                        [ 31%]
tests/async/test_async_downloader.py .........                           [ 32%]
tests/async/test_basic_crawling.py .....                                 [ 33%]
tests/async/test_caching.py ....                                         [ 34%]
tests/async/test_chunking_and_extraction_strategies.py ....              [ 35%]
tests/async/test_content_extraction.py ......                            [ 36%]
tests/async/test_content_filter_bm25.py ...............                  [ 38%]
tests/async/test_content_filter_prune.py ............                    [ 41%]
tests/async/test_content_scraper_strategy.py ..................          [ 44%]
tests/async/test_crawler_strategy.py .....                               [ 45%]
tests/async/test_database_operations.py .....                            [ 46%]
tests/async/test_dispatchers.py ....s....s                               [ 48%]
tests/async/test_edge_cases.py .......                                   [ 49%]
tests/async/test_error_handling.py ..s.ss                                [ 50%]
tests/async/test_evaluation_scraping_methods_performance_configs.py .... [ 51%]
..................                                                       [ 54%]
tests/async/test_markdown_genertor.py ......                             [ 55%]
tests/async/test_parameters_and_options.py s......                       [ 56%]
tests/async/test_performance.py ..s                                      [ 57%]
tests/async/test_screenshot.py .....                                     [ 58%]
tests/browser/docker/test_docker_browser.py ..s..sss....s                [ 60%]
tests/browser/test_browser_manager.py ....                               [ 61%]
tests/browser/test_builtin_browser.py ..........                         [ 63%]
tests/browser/test_builtin_strategy.py ..                                [ 63%]
tests/browser/test_cdp_strategy.py ...                                   [ 63%]
tests/browser/test_launch_standalone.py s                                [ 64%]
tests/browser/test_parallel_crawling.py ...                              [ 64%]
tests/browser/test_playwright_strategy.py ....                           [ 65%]
tests/browser/test_profiles.py ..                                        [ 65%]
tests/cli/test_cli.py ............                                       [ 67%]
tests/docker/test_config_object.py .                                     [ 68%]
tests/docker/test_core.py ........                                       [ 69%]
tests/docker/test_crawl_task.py ssssssss                                 [ 71%]
tests/docker/test_docker.py ......                                       [ 72%]
tests/docker/test_dockerclient.py ..                                     [ 72%]
tests/docker/test_serialization.py ...                                   [ 73%]
tests/docker/test_server.py ................................ssssssss.... [ 80%]
                                                                         [ 80%]
tests/docker/test_server_token.py ...................................... [ 87%]
ssss...                                                                  [ 89%]
tests/hub/test_simple.py .s                                              [ 89%]
tests/legacy/test_cli_docs.py .                                          [ 89%]
tests/loggers/test_logger.py .                                           [ 89%]
tests/memory/test_crawler_monitor.py .                                   [ 90%]
tests/memory/test_dispatcher_stress.py .                                 [ 90%]
tests/test_crawl_result_container.py ................................... [ 96%]
.......                                                                  [ 97%]
tests/test_llmtxt.py .                                                   [ 98%]
tests/test_scraping_strategy.py .                                        [ 98%]
tests/test_web_crawler.py ..s.......                                     [100%]

================= 515 passed, 37 skipped in 942.35s (0:15:42) ==================
Finished running tests!

stevenh avatar Apr 11 '25 23:04 stevenh

@aravindkarnam @unclecode I see there have been quite a few changes merged recently which have caused a large number of conflicts. The last rebase took a large amount of time, so just checking in to see what the next steps might be?

stevenh avatar Apr 23 '25 10:04 stevenh

Closing as this never got any traction, so we've moved away from crawl4ai.

If someone wants to pick up the branch and reuse, feel free.

stevenh avatar Aug 18 '25 10:08 stevenh