pdr-backend

[Lake][OHLCV] Update OHLCV data_factory to use csv_data_writer

Open • idiom-bytes opened this issue 1 year ago • 3 comments

Motivation

OHLCV is still writing/reading from parquet.

We may instead want to use CSVDataStore + PersistentDataStore to save to and read from. Doing this right now is hard because the OHLCV code is tightly bound to saving, loading, etc.

DoD

  • [ ] OHLCV Data Factory CSV/saving/querying code has been abstracted away and uses CSVDataStore + PersistentDataStore
  • [ ] OHLCV Data Factory and DataStore logic have been separated
  • [ ] OHLCV Data Factory queries from PersistentDataStore
  • [ ] OHLCV Data Factory tests are up to date and every test passes

### Update According to Trent's comment

  • [ ] Move the IO methods into functions in a new module, cvsutil.py
  • [ ] Use those functions in the CSVDataStore class to make it cleaner and thinner
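
A minimal sketch of what this split could look like, assuming the IO methods only need to append to and read from CSV files. The function names `write_rows` and `read_rows`, the method signatures, and the file layout are hypothetical and not taken from pdr-backend. The sketch uses Python's stdlib `csv` module for self-containment, which may differ from the dataframe library the project actually uses.

```python
import csv
import os
from typing import Dict, List

Row = Dict[str, str]


# --- module-level IO functions (these would live in the new util module) ---

def write_rows(path: str, rows: List[Row], fieldnames: List[str]) -> None:
    """Append rows to a CSV file; write the header only if the file is new or empty."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if is_new:
            writer.writeheader()
        writer.writerows(rows)


def read_rows(path: str) -> List[Row]:
    """Read all rows from a CSV file; return an empty list if the file is missing."""
    if not os.path.exists(path):
        return []
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


# --- thin store class: only resolves paths, delegates all IO to the functions ---

class CSVDataStore:
    def __init__(self, base_dir: str):
        self.base_dir = base_dir

    def _path(self, name: str) -> str:
        return os.path.join(self.base_dir, f"{name}.csv")

    def write(self, name: str, rows: List[Row], fieldnames: List[str]) -> None:
        os.makedirs(self.base_dir, exist_ok=True)
        write_rows(self._path(name), rows, fieldnames)

    def read(self, name: str) -> List[Row]:
        return read_rows(self._path(name))
```

With this shape, the class holds no file-handling logic of its own, so the IO functions can be tested and reused independently of the store.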

idiom-bytes • Mar 11 '24 16:03

CSVDataStore has already been integrated in the duckdb-integration PR, although @trentmc has provided feedback on that PR.

Trent: I'm not sure if we're able to currently merge w/ these updates before implementing feedback or not. Please let us know so we can organize this update.

idiom-bytes • May 13 '24 00:05

> Trent: I'm not sure if we're able to currently merge w/ these updates before implementing feedback or not. Please let us know so we can organize this update.

For reference, I had two comments in the PR. I just checked. On the second comment, Mustafa gave a good response and addressed my concerns. So I "resolved" that comment in the PR.

On the first comment, it's a small thing. It's easier to simply follow the suggestion now than to try to track it later.

For convenience, here it is👇


Both csv and duckDB are persistent data stores.

This file is csv_data_store.py, good.

The module for duckDB should be renamed from persistent_data_store.py to duckdb_data_store.py. (Otherwise its scope overlaps with that of csv_data_store.py)

trentmc • May 13 '24 09:05

Thank you for the response @trentmc, I'm starting to look at these peripheral duckdb items to sign off.

idiom-bytes • May 21 '24 18:05

I believe I have completed all the remaining asks related to this ticket. We'll now be focusing on merging all of this into main.

idiom-bytes • Jun 04 '24 14:06