core icon indicating copy to clipboard operation
core copied to clipboard

Refactor network module - prepare it for easier sub-module testing

Open MehmedGIT opened this issue 1 year ago • 1 comments

Please feel free to suggest more changes that would lead to easier testing or improve the readability of the code. The Scrutinizer analysis already reports some good improvements so far.

Implemented refactoring changes:

  • Makefile extension to support ocrd network test commands: network-module-test, network-integration-test
  • Harmonize code - spacing between method parameters to maintain readability while still reducing the lines of the files. More variable typing wherever it was missing. Extensive use of " instead of ' for strings to remove the need of escaping the ' symbol when the string includes '.
  • Improved error messages to include extra variable data are available for easier debugging
  • Replaced repetitive constants with extensive use of Enums such as: AgentType, DeployType, JobState, NetworkLoggingDirs, ServerApiTags
  • Env variable refactor to be generally used for all RabbitMQ clients and not just workers OCRD_NETWORK_WORKER_QUEUE_CONNECT_ATTEMPTS -> OCRD_NETWORK_RABBITMQ_CLIENT_CONNECT_ATTEMPTS
  • The default job state is now set to UNSET when not provided
  • StateEnum -> JobState for better clarification
  • All occurrences of status -> state to maintain the same variable name everywhere
  • The logging module to avoid potential name collisions: src/ocrd_network/logging.py -> src/ocrd_network/logging_utils.py
  • Combined all RabbitMQ-related data into a single dictionary variable instead of 5 separate variables (processing server and processing worker). The validators also return a dictionary of validated variables.
  • Deployer module - increased readability and testability of separate deployments.
  • Run-time data modules - created separate files for different concepts. Check here.
  • Provided config parser wrapper methods for the different run-time data (for easier testing). Check here.
  • Processing server improvements (~20% less lines):
    • Introduced raise_http_exception method to wrap repetitive lines
    • Created wrapper methods for separate concepts in big methods for easier concept testing, e.g., parse_workflow_tasks, validate_first_task_input_file_groups_existence and moved these to utils or server_utils depending on whether a server response is involved or not. Moreover, moved some other full methods which were better fitting inside utils or server_utils such as create_processing_message
    • Overall improved naming conventions for methods, e.g., push_processor_job -> validate_and_forward_job_to_network_agent where network agent is of kind AgentType (check Enums above)
    • Refactored server cache implementation + provided sync versions of the async methods to directly reuse these in the tests.
  • Refactored rabbitmq_utils:
    • Moved the create_queue method from processing worker to rabbitmq_utils module
    • Provided extra 4 wrapper methods to remove code duplication in the processing server and processing worker. Check here.
  • Extended the OcrdProcessingMessage schema to support agent_type. Check here.
  • Fixed the OcrdResultMessage schema - the schema required either path_to_mets or workspace_id. However, setting workspace_id to None was triggering an error that it's not of type string according to the schema. On the other hand, assigning an empty string to workspace_id was triggering an error that either path_to_mets or workspace_id could be set but not both. So there was no way to create a result message without a crash. By default, the fields path_to_mets or workspace_id are set to empty strings when nothing is assigned. Check here.

Implemented tests:

  • The naming conventions are test_integration_* or test_modules_* - the integration tests require access to either the MongoDB or RabbitMQ, but the module tests do not. Hence, the module tests can be executed without starting any extra services (faster for testing).
  • Refactor test_db.py -> test_integration_1_db.py and replaced magic values with constants
  • Refactor test_rabbitmq.py -> test_integration_2_rabbitmq.py and replaced magic values with constants
  • test_integration_3_server_cache_requests.py - tests of the server cache for processing requests
  • test_integration_4_processing_worker.py - tests of the processing worker
  • test_integration_5_processing_server.py - tests of the processing server
  • test_modules_logging_utils.py
  • test_modules_param_validators.py
  • test_modules_process_helpers.py - Testing invocation of pythonic and bashlib processor
  • test_modules_server_cache_pages.py - Unlike the server cache requests, the pages cache does not require database access.

Open issues:

  • The bashlib processor test is still not working and is disabled currently - _test_invoke_processor_bash() - Shall be fixed in this PR.
  • There is still the deploy_agent_native_get_pid_hack method, but now isolated away from the deployer. An eye should be kept on that in another PR.

Note: I have not merged the master branch to this PR yet, since v2.63.3 broke some mets caching. I will potentially do that before this PR can be finally merged.

Note2: The macOS tests are disabled since they fail due to a reason not related to this PR itself.

MehmedGIT avatar Feb 17 '24 05:02 MehmedGIT

Arh, I hate Github Actions' false positives such as fb38c56 and 588b017 tests! There are literally failing processing endpoint tests.

MehmedGIT avatar Feb 17 '24 09:02 MehmedGIT

For me this PR is hard to review, because this are really a lot of changes.

Understandable.

I have tried to look into all the files but especially the changes in the processing_server where hard to follow.

For the processing server specifically, it is better to check the result not the changes. That one file got a lot of modifications and had to become fewer lines (~20% less).

I run the tests though and I tried to run the changes in a self build Dockercontainer together with a pythonic and bashlib worker. This tests all passed on my machine.

Have you also tried the _test_invoke_processor_bash() test by removing the underscore? That is the one I am not sure how to run yet to test the spawning of a bashlib processor separately from a worker.

The good thing is this changes nearly only affect the network package

Yes, tried to avoid moving stuff inside other modules (shall be the last thing to do) for now since this PR is already big enough.

MehmedGIT avatar Mar 26 '24 08:03 MehmedGIT

Have you also tried the _test_invoke_processor_bash() test by removing the underscore? That is the one I am not sure how to run yet to test the spawning of a bashlib processor separately from a worker.

No, I didn't try that.

joschrew avatar Mar 27 '24 07:03 joschrew