Raymond Cheng comments

Results: 135 comments of Raymond Cheng

Try Dagster embedded-elt for database replication

https://github.com/opensource-observer/oso/pull/1733 Done

Version the mart models

I renamed a bunch today. The rest I want to discuss with the team to figure out if we should delete them, move them to the intermediate stage, or refactor them...

This is mostly done. Related PRs include https://github.com/opensource-observer/oso/pull/1359 https://github.com/opensource-observer/oso/pull/1355 https://github.com/opensource-observer/oso/pull/1344 https://github.com/opensource-observer/oso/pull/1339 https://github.com/opensource-observer/oso/pull/1340 https://github.com/opensource-observer/oso/pull/1337 We still need to refactor the event aggregations, but let's use this issue for that: https://github.com/opensource-observer/oso/issues/1317

Trino for distributed queries

Looking at the docs, this is pretty interesting: you can set up data connectors to run
- Queries over the BigQuery Storage API
- Forwarding queries to a ClickHouse or Snowflake instance...

Trino for distributed queries

Apparently you can run Trino on GCP Dataproc! That surprised me: https://cloud.google.com/dataproc/docs/tutorials/trino-dataproc

Trino for distributed queries

For reference, dbt-trino is useful if we want to replace BQ in our data pipeline https://github.com/starburstdata/dbt-trino

Populate OSS Directory with description fields

Fair enough, probably makes sense to join it in a dbt model after importOssDirectory into the projects intermediate model

Populate OSS Directory with description fields

Starting with enabling an optional `description` field in the project or collection files in oss-directory https://github.com/opensource-observer/oss-directory/pull/274

Populate OSS Directory with description fields

I think the cloudquery plugin needs to be updated as well. I wonder if we can just have the cloudquery plugin use the JSON schema directly, rather than duck type...

Populate OSS Directory with description fields

importOssDirectory from cloudquery grabs description here: https://github.com/opensource-observer/oso/pull/1360/files