Raymond Cheng
https://github.com/opensource-observer/oso/pull/1733 Done
I renamed a bunch today. The rest I want to discuss with the team to figure out whether we should delete them, move them to the intermediate stage, or refactor them...
This is mostly done. Related PRs include https://github.com/opensource-observer/oso/pull/1359 , https://github.com/opensource-observer/oso/pull/1355 , https://github.com/opensource-observer/oso/pull/1344 , https://github.com/opensource-observer/oso/pull/1339 , https://github.com/opensource-observer/oso/pull/1340 , and https://github.com/opensource-observer/oso/pull/1337 . We still need to refactor the event aggregations, but let's use this issue for that: https://github.com/opensource-observer/oso/issues/1317
Looking at the docs, this is pretty interesting. You can set up data connectors to run:
- Queries over the BigQuery Storage API
- Forwarding queries to a ClickHouse or Snowflake instance...
Apparently you can run Trino on GCP Dataproc! That surprised me: https://cloud.google.com/dataproc/docs/tutorials/trino-dataproc
For reference, dbt-trino is useful if we want to replace BQ in our data pipeline https://github.com/starburstdata/dbt-trino
Fair enough. It probably makes sense to join it into the projects intermediate model, in a dbt model that runs after importOssDirectory.
Starting with enabling an optional `description` field in the project or collection files in oss-directory https://github.com/opensource-observer/oss-directory/pull/274
I think the cloudquery plugin needs to be updated as well. I wonder if we can just have the cloudquery plugin use the JSON schema directly, rather than duck type...
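To make the "use the JSON schema directly rather than duck type" idea concrete, here is a minimal Python sketch. It is not the cloudquery plugin's code or the real oss-directory schema: `PROJECT_SCHEMA`, the `name` field, and the `validate` helper are all hypothetical, and the validator only handles a tiny subset of JSON Schema (`type`, `required`, `properties`). A real plugin would more likely hand the schema to a full JSON Schema library.

```python
import json

# Hypothetical schema fragment (NOT the real oss-directory schema):
# `name` is required, `description` is optional.
PROJECT_SCHEMA = json.loads("""
{
  "type": "object",
  "required": ["name"],
  "properties": {
    "name": {"type": "string"},
    "description": {"type": "string"}
  }
}
""")

# Mapping from the JSON Schema type names used here to Python types.
_JSON_TYPES = {"string": str, "object": dict, "array": list, "boolean": bool}


def validate(record, schema):
    """Return a list of error strings; an empty list means the record
    is valid for the small schema subset this sketch understands."""
    expected = _JSON_TYPES.get(schema.get("type"))
    if expected is not None and not isinstance(record, expected):
        return [f"expected {schema['type']}"]

    errors = []
    if isinstance(record, dict):
        # Required keys must be present.
        for key in schema.get("required", []):
            if key not in record:
                errors.append(f"missing required field: {key}")
        # Keys that are present are checked against their sub-schema;
        # optional keys that are absent are simply skipped.
        for key, sub_schema in schema.get("properties", {}).items():
            if key in record:
                errors.extend(
                    f"{key}: {err}" for err in validate(record[key], sub_schema)
                )
    return errors


if __name__ == "__main__":
    print(validate({"name": "oso"}, PROJECT_SCHEMA))
    print(validate({"name": "oso", "description": 42}, PROJECT_SCHEMA))
```

The point of driving the check from the schema is that adding an optional field such as `description` would only require a schema change, not a matching change in hand-written field checks.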
importOssDirectory from cloudquery grabs the description here: https://github.com/opensource-observer/oso/pull/1360/files