[Submission]: pyDHIS2 – A Modern Python SDK for DHIS2 with Async I/O, Pandas Integration, and WHO-DQR Validation

Open HzaCode opened this issue 2 months ago • 0 comments

  • Submitting Author: (@HzaCode)
  • All current maintainers: (@HzaCode)
  • Package Name: pyDHIS2 (PyPI: pydhis2)
  • One-Line Description of Package: A modern, resilient DHIS2 Python SDK with async I/O, Pandas integration, WHO-DQR data-quality review, and reproducible CLI workflows for LMIC scenarios.
  • Repository Link: https://github.com/HzaCode/pyDHIS2
  • Version submitted: 0.2.0
  • EiC: @yeelauren
  • Editor: TBD
  • Reviewer 1: TBD
  • Reviewer 2: TBD
  • Archive: TBD
  • JOSS DOI: TBD
  • Version accepted: TBD
  • Date accepted (month/day/year): TBD


Code of Conduct & Commitment to Maintain Package

  • [x] I agree to abide by pyOpenSci's Code of Conduct.
  • [x] I have read and will commit to package maintenance as per the pyOpenSci Policies Guidelines.

Description

pyDHIS2 (PyPI: pydhis2) is a modern DHIS2 client SDK designed for public-health and research workflows. It provides asynchronous and synchronous clients, streaming pagination, rate-limited retries, Pandas DataFrame outputs, and CLI commands for common tasks such as analytics extraction, tracker event management, and WHO-DQR data-quality review with HTML/JSON reporting. The package also includes cookiecutter-style project templates for reproducible pipelines (configure → fetch → validate → export).


Scope

Selected categories

  • [x] Data retrieval
  • [x] Data extraction
  • [x] Data processing/munging
  • [x] Data validation and testing
  • [x] Workflow automation
  • [x] Database interoperability
  • [ ] Data deposition
  • [ ] Data visualization
  • [ ] Citation management and bibliometrics
  • [ ] Scientific software wrappers

Explanation

  • Target audience & applications: Public-health informatics teams, ministries of health, NGO/donor M&E analysts, and researchers who use DHIS2 for analytics, event tracking, or national indicator pipelines. The SDK enables reproducible data extraction, validation, and reporting for large-scale scientific and operational workflows.
  • Related packages & differentiation: Existing Python libraries such as dhis2.py are thin HTTP wrappers, while R packages such as khisr and PEPFAR's datimutils focus on tidyverse-style analytics. pyDHIS2 goes further by combining async/streaming I/O, structured DataFrame outputs, a command-line interface, and built-in WHO-DQR checks to support end-to-end reproducibility.
  • Pre-submission inquiry: None yet (to be opened if required).

Domain Specific

  • [ ] Geospatial
  • [ ] Education

Community Partnerships

  • [ ] Astropy
  • [ ] Pangeo

Category justification (1–2 sentences each)

  • Data retrieval / extraction: Provides typed clients for DHIS2 endpoints (Analytics, DataValueSets, Tracker events, Metadata) with pagination and resilient retries.
  • Data processing/munging: Outputs Pandas DataFrames and supports efficient export to Parquet, CSV, and other analysis formats.
  • Data validation and testing: Implements WHO-DQR data-quality evaluation as reusable CLI tools generating structured JSON and HTML summaries.
  • Workflow automation: CLI and project templates orchestrate complete data pipelines for periodic DHIS2 synchronization and validation.
  • Database interoperability: Offers read/write helpers to manage data-value and tracker payloads, ensuring interoperability with national DHIS2 databases.
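
As a rough illustration of the kind of check the data-validation category refers to, the sketch below computes a reporting-completeness ratio and flags outliers by distance from the mean. It is an assumption-laden example, not pyDHIS2's implementation: the function names `reporting_completeness` and `flag_outliers` are hypothetical, the sample numbers are made up, and the 3-standard-deviation cut-off is used here only as a common convention that should be checked against the WHO DQR guidance.

```python
from statistics import mean, stdev


def reporting_completeness(received: int, expected: int) -> float:
    """Share of expected reports that were actually received."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return received / expected


def flag_outliers(values: list[float], threshold: float = 3.0) -> list[int]:
    """Return indices of values more than `threshold` sample SDs from the mean.

    Needs at least two values, because the sample standard deviation is
    undefined otherwise.
    """
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values identical, nothing to flag
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Made-up monthly counts for one facility; the last month is suspicious.
monthly_counts = [100, 102, 98, 101, 99, 103, 97, 100, 102, 98, 101, 300]

completeness = reporting_completeness(received=45, expected=50)
outlier_months = flag_outliers(monthly_counts)
```

With only twelve points, a single extreme value inflates the standard deviation and can mask itself, so a production check would likely use longer series or a more robust statistic.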

Technical checks

This package:

  • [x] does not violate the Terms of Service of any service it interacts with.
  • [x] uses an OSI-approved license (Apache-2.0).
  • [x] contains a README with install and quick-start instructions.
  • [x] includes documentation and code examples for all main functions.
  • [x] provides a tutorial demonstrating essential features.
  • [x] has a comprehensive test suite.
  • [x] implements continuous integration via GitHub Actions.

Publication Options

  • [ ] Do you wish to automatically submit to the Journal of Open Source Software (JOSS)?

JOSS Checks (only if above is selected)

  • [ ] The package has an obvious research application.
  • [ ] The package is not a minor utility or thin client.
  • [ ] The repository contains a paper.md following JOSS requirements.
  • [ ] The package is archived with a DOI (e.g., Zenodo).

Reviewer interaction preference

  • [x] Yes, I am OK with reviewers opening issues and pull requests in my repository directly.

Confirm:

  • [x] I have read the [author guide](https://www.pyopensci.org/software-peer-review/how-to/author-guide.html).
  • [x] I expect to maintain this package for at least 2 years and will ensure continuity of maintenance.

Posted by HzaCode, Oct 21 '25 01:10