vanna
Use Vanna with CSV files
Should we make a vn.use_df function that loads data into sqlite and connects to it so that you can run Vanna on dataframes that you might have brought in via CSV or some other method?
How would it work? Can you write out a usage example? So it works something like this:
df = pd.read_csv('soccer_players.csv')
vn.use_df(df)
vn.ask('how many soccer players are there')
And behind the scenes, use_df creates a new SQLite db and sets that as the dataset? Would we need to name the dataset like we do for other datasets?
Now that I think about this, to fit our general pattern, this should actually be:
vn.connect_to_dataframes
So example usage would look like:
df1 = pd.read_csv('soccer_players.csv')
df2 = pd.read_csv('baseball_players.csv')
vn.set_dataset('my-sports-dataset')
vn.connect_to_dataframes(df1, df2)
vn.auto_train() # Whatever we decide to call this later.
vn.ask('how many soccer players are also baseball players?')
That way the pattern looks similar to the rest, and we're just handling the loading into SQLite inside connect_to_dataframes.
vn.connect_to_dataframes(df1, df2)
What does this do in the background? It creates a SQLite db, puts those two dfs in there, and then connects Vanna to that SQLite db?
That's what I'm thinking
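For anyone following along, here is a rough sketch of the "create a SQLite db, put the dfs in there" step using only pandas and the standard-library sqlite3 module. It is not Vanna's implementation: the helper name load_dataframes_into_sqlite, the dict of table names, and the tiny stand-in frames are all assumptions made for illustration, and the thread has not settled how tables would be named or how Vanna would attach to the connection.

```python
import sqlite3

import pandas as pd


def load_dataframes_into_sqlite(frames):
    """Hypothetical helper: copy each DataFrame into its own table
    of an in-memory SQLite database and return the connection."""
    conn = sqlite3.connect(":memory:")  # in-memory, so no file is written to disk
    for table_name, df in frames.items():
        # index=False keeps the pandas index out of the SQL table
        df.to_sql(table_name, conn, index=False, if_exists="replace")
    return conn


# Small stand-in frames in place of the CSV files mentioned in the thread
soccer = pd.DataFrame({"name": ["Ana", "Ben", "Cy"]})
baseball = pd.DataFrame({"name": ["Ben", "Dee"]})

conn = load_dataframes_into_sqlite(
    {"soccer_players": soccer, "baseball_players": baseball}
)

# The kind of SQL a question like "how many soccer players are also
# baseball players?" might translate to (assumed, not generated by Vanna)
overlap = pd.read_sql_query(
    "SELECT COUNT(*) AS n "
    "FROM soccer_players s JOIN baseball_players b ON s.name = b.name",
    conn,
)
```

Passing the frames as a name-to-DataFrame mapping is one way to sidestep the table-naming question, since positional arguments like connect_to_dataframes(df1, df2) carry no table names on their own.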
Is this implemented yet?
Wouldn't creating a SQLite db be overkill?