[MISC] Check that `pandas` can be used to connect to multiple tables
Some extra definition:
With `superduper`, we should connect like this:
db = superduper('parent_directory/*.csv')
This means that if we need output tables, they should be saved as `parent_directory/<name-of-output-table>.csv`.
We will need `BytesEncoding.base64` everywhere, and we should somehow save the output table after every computation.
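A minimal sketch of the directory convention described above, written with plain pandas rather than `superduper` itself. The helper names `load_tables` and `save_output_table` are hypothetical and not part of any existing API. The sketch assumes that each CSV file maps to one table named after its file stem, and the sample data is invented for illustration.

```python
import glob
import tempfile
from pathlib import Path

import pandas as pd


def load_tables(pattern):
    """Read every CSV matching `pattern` into a dict keyed by file stem (hypothetical helper)."""
    return {Path(p).stem: pd.read_csv(p) for p in sorted(glob.glob(pattern))}


def save_output_table(parent_directory, name, df):
    """Write an output table to '<parent_directory>/<name>.csv' (hypothetical helper)."""
    path = Path(parent_directory) / f"{name}.csv"
    df.to_csv(path, index=False)
    return path


# Demo in a throwaway directory with one invented input table.
parent = Path(tempfile.mkdtemp())
pd.DataFrame({"id": [1, 2], "brand": ["Nike", "Adidas"]}).to_csv(
    parent / "orders.csv", index=False
)

tables = load_tables(str(parent / "*.csv"))
orders = tables["orders"]

# Persist the result of a computation next to the input tables.
out_path = save_output_table(parent, "nike_orders", orders[orders["brand"] == "Nike"])

# Reloading with the same glob now also picks up the output table.
reloaded = load_tables(str(parent / "*.csv"))
```

Saving each output as a sibling CSV means a later connection with the same glob would see the output as just another table, which may or may not be the desired behaviour.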
We should restrict this so that it does not work in cluster mode.
For example:
db = superduper(['customers.csv', 'orders.csv'])
table = Table('orders')
db.execute(table.filter(table.brand == 'Nike'))
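For comparison, here is a rough pandas equivalent of what the filter above would need to evaluate. This is only a sketch: the in-memory buffers stand in for `customers.csv` and `orders.csv`, their columns are invented for illustration, and nothing here reflects how `superduper` would actually dispatch the query.

```python
import io

import pandas as pd

# Stand-ins for customers.csv and orders.csv; columns and rows are assumptions.
files = {
    "customers": io.StringIO("id,name\n1,Ann\n2,Bob\n"),
    "orders": io.StringIO("id,customer_id,brand\n10,1,Nike\n11,2,Adidas\n12,1,Nike\n"),
}

# One DataFrame per table, keyed by table name.
tables = {name: pd.read_csv(buf) for name, buf in files.items()}

# Equivalent of table.filter(table.brand == 'Nike') as a boolean mask.
orders = tables["orders"]
nike_orders = orders[orders["brand"] == "Nike"]
```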
As part of [TEST-USE] Transfer learning #1967, I am trying this:
from superduperdb import superduper
db = superduper(['sample.xlsx'], metadata_store=f'mongomock://meta')
and I am getting this error:
ValueError: Couldn't auto-identify ['sample.xlsx'], please wrap explicitly using ``superduperdb.components.*``
Any inputs will be appreciated, thank you.