David Gasquez

Results 83 issues of David Gasquez

1. Fully static
2. DuckDB WASM + Static Parquet file

Random thoughts around decentralized and permissionless data lakes.

- An easy target is blockchain data.
- Everything should be content-addressed and immutable! Easy to get with chain data. I...
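To make "content-addressed and immutable" concrete, here is a minimal sketch, not something taken from the issue itself. It assumes a plain SHA-256 hex digest as the address and a local directory as the store; the names `content_address`, `put`, and `get` are hypothetical. Real systems such as IPFS use richer identifiers, so treat this only as an illustration of the idea.

```python
import hashlib
from pathlib import Path


def content_address(data: bytes) -> str:
    """Return the SHA-256 hex digest of the bytes (assumed address scheme)."""
    return hashlib.sha256(data).hexdigest()


def put(store: Path, data: bytes) -> str:
    """Store the bytes under their own digest and return that digest.

    An existing object is never overwritten, so stored objects are immutable:
    different content always lands at a different address.
    """
    store.mkdir(parents=True, exist_ok=True)
    address = content_address(data)
    target = store / address
    if not target.exists():
        target.write_bytes(data)
    return address


def get(store: Path, address: str) -> bytes:
    """Read the bytes back and check that they still match their address."""
    data = (store / address).read_bytes()
    if content_address(data) != address:
        raise ValueError("stored bytes do not match their address")
    return data
```

Because the address is derived from the bytes, anyone can verify a file fetched from an untrusted mirror, which is what makes a permissionless setup workable. Chain data fits well because finalized blocks do not change.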

question

- How do you add a new source?
- How do you run a dbt model?
- What is the development workflow?
- How do Dagster and dbt work together?

documentation
enhancement

Just [follow the docs](https://docs.dagster.io/integrations/dbt)!

enhancement

A great example is [Parity live querying](https://dashboards.data.paritytech.io/query/live-query.html).

enhancement

Datadex will produce static files. One way to improve the UX would be to create a static website that showcases the files. Thinking further ahead, perhaps something like Astro running get...

We should publish datasets in multiple places:

- [ ] Parquet on S3/R2
- [x] [Frictionless Packages](https://github.com/davidgasquez/datadex/issues/40)
- [ ] DuckDB file on GitHub Releases
- [x] [HuggingFace](https://github.com/davidgasquez/datadex/issues/41)
- [...

- [ ] Bytewax. [Example](https://bytewax.io/guides/real-time-financial-exchange-order-book-application).
- [ ] [Dozer](https://getdozer.io/) to expose the datasets.
- [ ] [Recap](https://github.com/recap-cloud/recap).
- [ ] [Cube](https://cube.dev/).
- [ ] [Clickhouse](https://medium.com/datadenys/working-with-s3-files-directly-from-clickhouse-7db330af7875). [Another](https://medium.com/datadenys/scaling-clickhouse-using-amazon-s3-as-a-storage-94a9b9f2e6c7).
- [ ] Metabase...

enhancement

- https://github.com/dagster-io/hooli-data-eng-pipelines
- https://github.com/dagster-io/dagster/tree/master/examples/quickstart_gcp
- https://github.com/dagster-io/dagster/tree/master/examples/quickstart_etl
- https://github.com/dagster-io/dagster/tree/master/examples/assets_dbt_python
- https://github.com/zsvoboda/ngods-stocks/tree/main
- https://github.com/stkbailey/dagster-jaffle-shop
- https://github.com/jonathanneo/my-dbt-dagster
- https://github.com/airbytehq/open-data-stack
- https://airbyte.com/blog/building-airbytes-data-stack
- https://github.com/westmarindata/dagster-integration-demo
- https://github.com/b-gar/dagster-cfb
- https://github.com/jonathanneo/data-aware-orchestration
- https://github.com/mitodl/ol-data-platform
- https://github.com/fremantle-industries/tabletop
- https://github.com/catalyst-cooperative/pudl...

documentation

There seems to be no Dagster equivalent for streaming: something that makes it easy¹ to link both worlds.

¹ There is Apache Beam, but it is not especially easy to use.

question