Reorganize Python Modules

Open ravenac95 opened this issue 11 months ago • 6 comments

What is it?

This is not a fully fleshed out issue but in general this is a set of problems that we need to address at some point.

Much of the development in the oso repo happened this way:

  1. Prototype
  2. Ship to prod
  3. Move to the next thing

While this has been generally great for us in terms of Getting Shit Done™️, it has introduced some unwanted technical debt. It also makes things a bit awful when trying to figure out where to put new code.

Here are some of the current issues with this:

  • The files that make up the oso CLI are scattered across the repo, and there is no obvious place for the CLI to live.
    • Bonus: Should this also just live under pnpm? I originally avoided this because external data scientists using the library would have to install unneeded node things to use the python part of the library.
  • We have oso_dagster, metrics_tools, and opsscripts as high-level modules now. However, many of these share code with each other. This is not great.
  • Traditionally, Python tests are stored in separate directories from the modules they're testing. I tried to do the golang thing of putting them in the same place. I'm curious if others have thoughts on this.

Some initial thoughts:

  • We use a lot of sqlglot-related tools in many places. We should put those in one place.
  • The "metrics" tools might just be extensions of that sqlglot tooling
  • oso_dagster might be fine where it is, but submodules inside it that are not dagster specific should likely be put somewhere more common.
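
One way to see how much the high-level modules share is to scan their imports. The following is only a rough sketch, assuming the three modules named above sit as directories directly under the repo root (the real paths may differ); the function names are made up for illustration and are not part of the oso codebase.

```python
import ast
from pathlib import Path

# Module names taken from this issue; their location on disk is an assumption.
MODULES = ("oso_dagster", "metrics_tools", "opsscripts")


def imported_roots(source: str) -> set[str]:
    """Return the top-level package names imported by a piece of source code."""
    roots = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            roots.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import; relative imports are skipped.
            roots.add(node.module.split(".")[0])
    return roots


def cross_module_imports(repo_root: Path, modules=MODULES) -> dict[str, set[str]]:
    """Map each listed module to the other listed modules it imports from."""
    found = {module: set() for module in modules}
    for module in modules:
        module_dir = repo_root / module
        if not module_dir.is_dir():
            continue  # Module not present at the assumed location.
        for path in module_dir.rglob("*.py"):
            roots = imported_roots(path.read_text(encoding="utf-8"))
            found[module].update(r for r in roots if r in modules and r != module)
    return found


if __name__ == "__main__":
    for module, deps in sorted(cross_module_imports(Path(".")).items()):
        print(f"{module} -> {sorted(deps)}")
```

Anything that shows up as imported by more than one module is a candidate for the common location discussed here. The scan only sees static `import` statements, so dynamic imports would be missed.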

ravenac95 avatar Jan 24 '25 19:01 ravenac95

After a discussion with @IcaroG and @Jabolol this is the proposed folder structure for what we'd like to see:

lib/
  oso_core/ # Much of metrics_tools should go here 
            # except the metrics calculation service
            # Also this should have any other common lib/tools
  pyoso/
  cli/ # the oso_lets_go cli and ops 
       # related cli interfaces should go here
warehouse/
  oso_dagster/ # We should move general things
               # (not dagster specific) to oso_core
  oso_sqlmesh/
  metrics_service/

ravenac95 avatar Feb 26 '25 06:02 ravenac95

I wanted to just get this done, but realized it will break some things in sqlmesh, so for now we will migrate things slowly.

ravenac95 avatar May 21 '25 20:05 ravenac95

@ravenac95 is this still relevant?

ryscheng avatar Jun 13 '25 19:06 ryscheng

@ryscheng some of it is still relevant. It hasn't been completely closed. Let me do some work to enumerate the remaining work.

ravenac95 avatar Jun 24 '25 16:06 ravenac95

One other thing that @ccerv1 just brought up is that the seed data is in a confusing place. We should also reorganize this.

ravenac95 avatar Jul 07 '25 22:07 ravenac95

So remaining things to do here:

  • We should turn all Python packages into their own Python projects (i.e. each should have its own pyproject.toml file)
    • [ ] metrics_service
    • [ ] oso_sqlmesh
    • [ ] oso_dagster
  • We should ensure that the correct references exist between projects
  • Some additional work to maybe move things into oso-core
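
Progress on the checklist above could be tracked with a small script. This is a minimal sketch, assuming the three packages live under warehouse/ as in the proposed folder structure (that may not match the repo today); `missing_pyproject` is a hypothetical helper, not an existing oso tool.

```python
from pathlib import Path

# Package directories from the checklist. The warehouse/ prefix follows the
# proposed layout and is an assumption about where they end up.
PACKAGES = (
    "warehouse/metrics_service",
    "warehouse/oso_sqlmesh",
    "warehouse/oso_dagster",
)


def missing_pyproject(repo_root: Path, packages=PACKAGES) -> list[str]:
    """Return the package directories that do not yet contain a pyproject.toml."""
    return [
        pkg
        for pkg in packages
        if not (repo_root / pkg / "pyproject.toml").is_file()
    ]


if __name__ == "__main__":
    for pkg in missing_pyproject(Path(".")):
        print(f"[ ] {pkg} has no pyproject.toml")
```

It only checks that the file exists; verifying the references between projects would still need a look at each project's declared dependencies.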

ravenac95 avatar Oct 15 '25 22:10 ravenac95