geomancer
Optimize BQ uploads by reusing tables (cache)
We currently upload the dataframe to BQ (as a new table) on every call, which is inefficient for larger datasets. There should be a better way to:
- Check if the dataframe in question already exists in the BigQuery dataset
- If it does, reuse that table; otherwise, do the upload.
@ljvmiranda921 https://pandas-docs.github.io/pandas-docs-travis/reference/api/pandas.util.hash_pandas_object.html
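
A rough sketch of how `hash_pandas_object` could drive the cache check described above: hash the dataframe's contents into a deterministic table name, then upload only if no table with that name exists yet. The names `dataframe_table_id`, `get_or_upload`, the `cache_` prefix, and the `table_exists` / `upload` callables are all hypothetical, not part of geomancer. In practice the callables would presumably wrap the BigQuery client (for example `Client.get_table` and `Client.load_table_from_dataframe`), but that wiring is left out here.

```python
import hashlib

import pandas as pd
from pandas.util import hash_pandas_object


def dataframe_table_id(df: pd.DataFrame, prefix: str = "cache_") -> str:
    """Derive a deterministic table name from the dataframe's contents.

    Hypothetical helper: the prefix and the 32-character truncation are
    arbitrary choices for this sketch.
    """
    # hash_pandas_object returns one uint64 hash per row (index included).
    row_hashes = hash_pandas_object(df, index=True).values
    digest = hashlib.sha256(row_hashes.tobytes())
    # Row hashes do not cover column names, so mix them in separately.
    digest.update("|".join(map(str, df.columns)).encode("utf-8"))
    return prefix + digest.hexdigest()[:32]


def get_or_upload(df, table_exists, upload, prefix="cache_"):
    """Return the table name for df, uploading only on a cache miss.

    table_exists(table_id) -> bool and upload(df, table_id) are
    caller-supplied callables (assumed here, e.g. thin wrappers
    around a BigQuery client).
    """
    table_id = dataframe_table_id(df, prefix)
    if not table_exists(table_id):
        upload(df, table_id)
    return table_id
```

Two dataframes with identical index, columns, and values map to the same table name, so the second call skips the upload. One caveat: the hash is order-sensitive, so the same rows in a different order would be treated as a new table.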