Refactor to use `cht-datasource` for reading contacts by id (both with and without lineage)

Open jkuester opened this issue 10 months ago • 13 comments

Ticket Contents

Throughout much of the cht-core codebase, contact (person/place) documents are read directly from the database via the PouchDB library. Persons and places exist within a hierarchy (places can belong to other places, and people can belong to places). Contact docs in the database reference their lineage by _id, but the complete data of the lineage (i.e. the ancestor contacts) is not de-normalized into each document. However, the application logic often needs this de-normalized contact data. The shared-libs/lineage library supports a number of UI-focused workflows around loading and structuring this contact/report data along with lineage information.
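To make the de-normalization step concrete, here is a minimal sketch of "hydrating" a contact: the stored doc only carries nested parent stubs with an _id, and the full ancestor docs are filled in from a lookup. The type names and the `lineageIds` and `hydrate` helpers are hypothetical and exist only for this illustration; they are not the shared-libs/lineage or cht-datasource implementation.

```typescript
// Minimal shapes for illustration; real contact docs carry many more fields.
interface ParentStub {
  _id: string;
  parent?: ParentStub;
}

interface ContactDoc {
  _id: string;
  type: string;
  name?: string;
  parent?: ParentStub;
}

interface HydratedContact extends Omit<ContactDoc, 'parent'> {
  parent?: HydratedContact | ParentStub;
}

// Walk the stored parent stubs and return the ancestor ids, nearest first.
function lineageIds(doc: { parent?: ParentStub }): string[] {
  const ids: string[] = [];
  for (let p = doc.parent; p; p = p.parent) {
    ids.push(p._id);
  }
  return ids;
}

// Replace each parent stub with the full ancestor doc when one is available.
function hydrate(doc: ContactDoc, docsById: Map<string, ContactDoc>): HydratedContact {
  const build = (stub?: ParentStub): HydratedContact | ParentStub | undefined => {
    if (!stub) {
      return undefined;
    }
    const full = docsById.get(stub._id);
    if (!full) {
      return stub; // ancestor not available: keep the bare stub
    }
    return { ...full, parent: build(stub.parent) };
  };
  return { ...doc, parent: build(doc.parent) };
}
```

The sketch follows the parent chain stored on the contact itself, so an ancestor that is missing from the lookup is left as an _id-only stub rather than dropped.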

More recently, the cht-datasource library has been introduced as a general purpose library for interacting with contact/report data (instead of having to load it straight from Pouch). cht-datasource includes support for loading a contact along with their lineage data.

We need to streamline and centralize all our logic for reading contacts by id so that it goes through cht-datasource. When possible, code that is currently using shared-libs/lineage should be refactored to use cht-datasource directly. Any logic that remains in shared-libs/lineage (presumably because it supports some common UI workflow) should itself be refactored to use cht-datasource instead of calling Pouch directly.

Goals

  • [ ] Update code that is loading contacts by id directly from Pouch to use cht-datasource to load the contact data instead
  • [ ] Refactor code that is loading contacts via shared-libs/lineage to use cht-datasource when possible (additional common logic around structuring the contact data should remain in shared-libs/lineage)
  • [ ] Update shared-libs/lineage to use cht-datasource to load contact data
  • [ ] Implement any additional necessary contact read APIs in cht-datasource

Expected Outcome

The end result is that all contact-by-id reads in cht-core will be done via the cht-datasource library and any unnecessary/unused logic is removed from shared-libs/lineage.

Implementation Details

One important note is that in addition to loading it by id, contact data can also be loaded as the result of various view queries. It is out of the scope of this issue to replace these view queries. This issue is focused on updating the code associated with loading contact data directly by id.

Product Name

Community Health Toolkit: cht-core

Organization Name

Medic

Domain

Healthcare

Tech Skills Needed

Docker, JavaScript, Mocha, Node.js, TypeScript

Organizational Mentor

@jkuester

Complexity

Low

Category

Refactoring

jkuester avatar Mar 13 '25 20:03 jkuester

@jkuester I am interested in contributing to this

@jkuester Hello, Nikhil Raj here. I am a third-year undergrad at IIT BHU. I am interested in working on refactoring shared-libs/lineage and integrating cht-datasource for reading contacts.

I have experience in JS/TS, C++, Rust, React, Node.js, REST API, GraphQL, Postgres, Docker and Vitest.

Would love to connect with you and discuss the project. Is there a Discord or Slack channel?

hustlernik avatar Apr 08 '25 14:04 hustlernik

@jkuester Hi, I am Vaibhav Sahu, and I'm very interested in contributing to this project. I've finished setting up cht-conf. Looking forward to your guidance! Thank you!

Vaibhavsahu2810 avatar Apr 08 '25 23:04 Vaibhavsahu2810

All these issues with the C4GT Coding label are intended for the upcoming Code4GovTech program, so we are currently not looking to assign these issues to anyone. If you are interested in contributing to the CHT, please have a look at our first time contributor documentation! Also, please feel free to reach out on the forum if you have any questions. 👍

jkuester avatar Apr 09 '25 14:04 jkuester

@jkuester, I am participating in the C4GT Coding challenge. Please review my PR.

Aditya-PS-05 avatar Apr 12 '25 16:04 Aditya-PS-05

Hi @jkuester, I'd like to take this up! I've read through the ticket and I'm comfortable working on the refactor from Pouch/lineage to cht-datasource. I would love to get started and would appreciate being assigned. I also have solid experience with tRPC, GraphQL, gRPC, Node.js, Express, and OpenAPI, as well as the DevOps side with Docker and Kubernetes. Thanks!

vineeth-0509 avatar Apr 13 '25 17:04 vineeth-0509

Hi @jkuester

I am Vipul Kumar Kushwaha, a final-year B.Tech (ECE) student at NIT Raipur. I have hands-on MERN stack experience and am familiar with similar layered service architectures. I have contributed to projects with complex structures (e.g., HostelMania, admin dashboards with CRUD + auth + role logic) based on real-world problems.

I came across the issue regarding standardizing contact data access using cht-datasource, and I'd love to contribute. Currently, contact documents are fetched directly using PouchDB or through shared-libs/lineage, which leads to fragmented logic and inconsistency in handling lineage data. My approach would involve identifying all instances of direct PouchDB usage, refactoring them to use cht-datasource, and updating shared-libs/lineage to depend on it internally as well. With my background in full-stack (MERN) development, database integration, and modular design, I'm confident I can help streamline this. I'd be grateful for the opportunity to contribute and collaborate on this improvement.

Kushwaha-vipul avatar May 02 '25 21:05 Kushwaha-vipul

hii

shashi-sah2003 avatar Jun 03 '25 11:06 shashi-sah2003

The cht-datasource library provides two main methods for loading contacts by ID which can be used like this:

  1. Contact.v1.get() - gets basic contact data
  2. Contact.v1.getWithLineage() - gets contact data with full lineage information

Refer

We need to use these two functions wherever necessary in the codebase.
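As a rough sketch of what that refactor could look like, the example below has callers depend on a small reader interface instead of calling Pouch directly. Everything in it is an assumption made for illustration: `ContactReader`, `ContactDoc`, `getContactName`, and `getAncestorNames` are hypothetical names, and the interface only mirrors the idea of `Contact.v1.get()` and `Contact.v1.getWithLineage()`. The real functions' exact call shape (how the data context and the id are passed) and return types should be taken from the cht-datasource documentation.

```typescript
// Hypothetical stand-in types; these are NOT the real cht-datasource typings.
interface ContactDoc {
  _id: string;
  name?: string;
  parent?: { _id: string };
}

interface ContactDocWithLineage {
  _id: string;
  name?: string;
  parent?: ContactDocWithLineage;
}

// Hypothetical abstraction over "read one contact by id".
// Assumption: a missing contact resolves to null rather than throwing.
interface ContactReader {
  get(id: string): Promise<ContactDoc | null>;
  getWithLineage(id: string): Promise<ContactDocWithLineage | null>;
}

// A caller that only needs the contact's own fields uses the plain read.
async function getContactName(reader: ContactReader, id: string): Promise<string | undefined> {
  const contact = await reader.get(id);
  return contact?.name;
}

// A caller that needs ancestor data uses the lineage-aware read.
// Returns ancestor names, nearest ancestor first.
async function getAncestorNames(reader: ContactReader, id: string): Promise<string[]> {
  const contact = await reader.getWithLineage(id);
  const names: string[] = [];
  for (let p = contact?.parent; p; p = p.parent) {
    if (p.name) {
      names.push(p.name);
    }
  }
  return names;
}
```

Keeping the choice between the two reads at the call site means code that does not need lineage never pays for loading it.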

shashi-sah2003 avatar Jun 05 '25 15:06 shashi-sah2003

FYI, I have updated the title of this issue for more clarity. The goal here is to update all of the code in cht-core that is reading contacts by id to use the cht-datasource library (either directly or indirectly).

jkuester avatar Jun 13 '25 20:06 jkuester

Thanks @jkuester for clarifying. I have already started working on the changes for the code that reads contacts by id without lineage.

shashi-sah2003 avatar Jun 14 '25 05:06 shashi-sah2003

Just for the record, I think replacing the shared-libs/lineage fetchHydratedDocs function is going to be out of scope for this issue. That function is hyper-optimized for getting the lineage for batches of docs. I did some side-by-side performance tests against the cht-datasource get-contact-with-lineage functionality. For just one contact, the cht-datasource logic was actually slightly faster, but for large batches (I tested 600 contacts) the cht-datasource was waayyyy worse (exponentially worse). This is a pretty big problem if we want to just convert calls from fetchHydratedDocs over to repeated calls to cht-datasource.

I think ultimately, if we determine the server really does need to support batch-fetching-contacts-with-lineage, we are going to have to enhance cht-datasource to support this. But, that is going to be out of scope for this issue. This also means we will not really be able to refactor the shared-libs/lineage code for the fetchHydratedDocs function to call cht-datasource (because this will just cause the same performance problem...)
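One plausible reason for this kind of gap, assumed here for illustration rather than taken from the measurements above, is that contacts in a batch usually share ancestors. Resolving lineage one contact at a time repeats the same ancestor lookups, while a batch approach can collect the distinct ancestor ids first and fetch them together. The helper names below are hypothetical, and this is not the actual fetchHydratedDocs or cht-datasource implementation.

```typescript
interface ParentRef {
  _id: string;
  parent?: ParentRef;
}

interface Doc {
  _id: string;
  parent?: ParentRef;
}

// Ancestor ids referenced by one doc, nearest first.
function ancestorIds(doc: Doc): string[] {
  const ids: string[] = [];
  for (let p = doc.parent; p; p = p.parent) {
    ids.push(p._id);
  }
  return ids;
}

// Resolving each contact on its own: one ancestor lookup per contact per level,
// even when many contacts share the same ancestors.
function lookupsOneAtATime(docs: Doc[]): number {
  return docs.reduce((total, doc) => total + ancestorIds(doc).length, 0);
}

// Batch approach: collect the distinct ancestor ids across all docs so they
// can be requested together in a single bulk read.
function idsForBatchFetch(docs: Doc[]): string[] {
  const unique = new Set<string>();
  docs.forEach((doc) => ancestorIds(doc).forEach((id) => unique.add(id)));
  return Array.from(unique);
}
```

For 600 people who all sit under the same clinic and district, the one-at-a-time count is 1200 ancestor lookups, while the batch list contains only 2 ids.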

jkuester avatar Jun 20 '25 18:06 jkuester

@jkuester thank you for testing this out. I have noticed that fetchHydratedDocs is used in most places where shared-libs/lineage is imported. So should we make this a sub-issue and proceed with updating the other functions in the lineage library?

shashi-sah2003 avatar Jun 22 '25 13:06 shashi-sah2003

> should we make this as subissue and proceed with other function updation in lineage library?

Already discussed this over on Discord, but just want to record here that fetchHydratedDocs changes are out of scope for this project (cht-datasource does not have any equivalent apis with a similar performance profile).

I have just updated the sub-issues for this ticket to reflect the latest from the design doc, so we should be good to proceed with them according to the details in each sub-ticket. 👍

jkuester avatar Jun 23 '25 20:06 jkuester

@jkuester thanks for adding the subissues. Now the task has become more efficient and understandable.

shashi-sah2003 avatar Jun 23 '25 20:06 shashi-sah2003

Done and done! Thanks @shashi-sah2003! 🚀

jkuester avatar Sep 02 '25 19:09 jkuester