Document Ambuda data model, for developers

Open shreevatsa opened this issue 2 years ago • 8 comments

There is good documentation of the architecture at https://ambuda.readthedocs.io/en/latest/architecture.html but it would be nice to have even more: in particular, some examples of

  • how the data is stored in the db,
  • what format text needs to be in, to be ingested into the db,
  • how the mapping from text to its parse data is stored.

Maybe more architecture diagrams too, e.g. in the style of the C4 model (https://c4model.com/): enough for someone to dive in and have a good idea of how the codebase works and where to make a given change.

(I realize this is stated very vaguely and broadly; it can never be complete. I was planning to do something along these lines at some point, but it would be great if someone else does it… just tracking here.)

shreevatsa, Sep 02 '22 05:09

Hey, that's a great idea, I was pondering the same thing. Here are the database objects and their relationships:

[image: objects_relationships]

Happy to generate PlantUML diagrams but I don't yet know enough about the backend architecture.

We need to think about how to update these diagrams when the underlying model changes. Otherwise, the reader might get super confused. I was thinking of adding these as part of the quickstart guide for devs (with some warnings about staleness). Would love to hear other thoughts.

thapakrish, Sep 05 '22 18:09

@thapakrish This is great, thank you! How did you generate the above diagrams?

shreevatsa, Sep 06 '22 02:09

@shreevatsa I used the DbVisualizer tool. It would be nice to be able to generate such diagrams via a script.
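For illustration, here is a minimal sketch of what such a script could look like, using only Python's standard library. It assumes the database is SQLite and that relationships are declared as foreign keys; the function name `schema_to_dot` and the table names in the demo are made up for this example and are not taken from the Ambuda codebase. It emits Graphviz DOT text, which can then be rendered with `dot -T png`.

```python
import sqlite3


def schema_to_dot(conn: sqlite3.Connection) -> str:
    """Return a Graphviz DOT graph of tables and their foreign-key links."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    # Skip SQLite's internal bookkeeping tables (e.g. sqlite_sequence).
    tables = [name for name in names if not name.startswith("sqlite_")]

    lines = ["digraph schema {", "  node [shape=box];"]
    for table in tables:
        lines.append(f'  "{table}";')
    for table in tables:
        # Each row is (id, seq, referenced_table, from_column, to_column, ...).
        for fk in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            lines.append(f'  "{table}" -> "{fk[2]}" [label="{fk[3]}"];')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Demo on an in-memory database with two hypothetical tables.
    demo = sqlite3.connect(":memory:")
    demo.executescript(
        """
        CREATE TABLE texts (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE text_blocks (
            id INTEGER PRIMARY KEY,
            text_id INTEGER REFERENCES texts(id)
        );
        """
    )
    print(schema_to_dot(demo))
```

This only draws one edge per foreign-key column and ignores column types, so it is a starting point rather than a replacement for a dedicated tool.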

thapakrish, Sep 06 '22 23:09

I looked around and found this thing called tbls, which seems to do something adequate: tbls out -t dot | dot -T png > db.png (with some changes to its default template) gives:

[image: database diagram generated by tbls]

shreevatsa, Sep 07 '22 15:09

This is awesome! Looks like it can be wrapped in a Docker container and made part of the CI/CD process!

thapakrish, Sep 07 '22 23:09

Yes, that may be nice to do… Do you know what the standard way of handling "generated files" like this is? (What would be really cool is if the CI/CD process (GitHub Actions) could generate an image (by running tbls, say) and place it into the repo… but I don't think there's a way to do that?)
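One common pattern for generated files is to commit the generated output and have CI fail when it no longer matches a fresh run. The sketch below shows just the comparison step in Python; the function names `is_up_to_date` and `check` are hypothetical, and it assumes the generator produces deterministic text (for example DOT output rather than a PNG, which may not be byte-for-byte reproducible).

```python
import sys
from pathlib import Path


def is_up_to_date(committed: Path, fresh_output: str) -> bool:
    """True if the committed file exists and matches freshly generated output."""
    return (
        committed.exists()
        and committed.read_text(encoding="utf-8") == fresh_output
    )


def check(committed: Path, fresh_output: str) -> int:
    """Return a process exit code: 0 if current, 1 if regeneration is needed."""
    if is_up_to_date(committed, fresh_output):
        return 0
    print(f"{committed} is stale; regenerate and commit it.", file=sys.stderr)
    return 1
```

A CI step would run the generator, pass its output to `check`, and exit with the returned code, so a schema change that is not accompanied by an updated diagram source shows up as a failed build.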

Meanwhile, I started writing up some ad-hoc documentation of just some tables (along the lines of the initial comment) into a doc here — please take a look and let me know what you think. cc @akprasad

shreevatsa, Sep 13 '22 18:09

@shreevatsa wonderful -- I think we can fold a lot of this into a new doc that explains the end-to-end process of showing a verse and parse data to the user. (This would be analogous to something like Google's "Life of a query"). The data comments could be added to the corresponding models in models.py.

akprasad, Sep 14 '22 00:09

Just sent https://github.com/ambuda-org/ambuda/pull/365 to add a README.md file to (almost) every directory. We'll need multiple kinds of documentation (at different levels), and IMO it's better to err on the side of redundancy, accepting that some documentation may be out of date, than to try to keep all documentation in just one place/format.

shreevatsa, Sep 22 '22 15:09