
Optimize BQ uploads by reusing tables (cache)

Open: ljvmiranda921 opened this issue 5 years ago • 1 comment

We currently upload the dataframe to BigQuery (as a table) on every call, which is inefficient for larger datasets. There should be a better way to:

  • Check whether the dataframe in question already exists as a table in the BigQuery dataset
  • If it does, reuse that table; otherwise, do the upload.

ljvmiranda921, Mar 03 '19 07:03

@ljvmiranda921 https://pandas-docs.github.io/pandas-docs-travis/reference/api/pandas.util.hash_pandas_object.html

marksteve, Mar 20 '19 03:03
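
One way the two steps in the issue could fit together with the hashing function linked above is sketched below. This is only an illustration, not geomancer's actual implementation: `table_id_for`, `get_or_upload`, the `cache_` prefix, and the two callables are hypothetical names. The idea is to derive a deterministic table name from the dataframe's contents with `pandas.util.hash_pandas_object`, then upload only when no table with that name exists. With `google-cloud-bigquery`, `table_exists` could presumably be backed by `Client.get_table` (which raises `NotFound` for a missing table) and `upload` by `Client.load_table_from_dataframe`; both are injected here so the sketch runs without credentials.

```python
import hashlib
from typing import Callable

import pandas as pd
from pandas.util import hash_pandas_object


def table_id_for(df: pd.DataFrame, prefix: str = "cache_") -> str:
    """Derive a deterministic table name from the dataframe's contents.

    Hypothetical helper: the prefix and the 32-character length are
    arbitrary choices for this sketch.
    """
    # One uint64 hash per row; index=True folds the index into each hash.
    row_hashes = hash_pandas_object(df, index=True).values
    digest = hashlib.sha256(row_hashes.tobytes())
    # Mix in column names and dtypes explicitly so that a renamed or
    # re-typed column also produces a different table name.
    schema = "|".join(f"{name}:{dtype}" for name, dtype in df.dtypes.items())
    digest.update(schema.encode("utf-8"))
    # Hex digits are valid characters in a BigQuery table name.
    return prefix + digest.hexdigest()[:32]


def get_or_upload(
    df: pd.DataFrame,
    table_exists: Callable[[str], bool],
    upload: Callable[[str, pd.DataFrame], None],
) -> str:
    """Return the table name for df, uploading only on a cache miss."""
    table_id = table_id_for(df)
    if not table_exists(table_id):
        upload(table_id, df)
    return table_id
```

Calling `get_or_upload` twice with dataframes that have identical contents should trigger a single upload, since both calls resolve to the same table name. Note that the hash depends on row order, so a reordered dataframe would be treated as new data.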