
Save Derived Column values to PropertyState

Open perryr16 opened this issue 2 years ago • 9 comments

Background

Derived Columns are loosely calculated as: derived_column_value = derived_column.evaluate(property_state)

Issue

Derived Column values are dynamically calculated on each inventory list page load and have a few known performance issues:

  • They are calculated one column at a time, so page load time grows with the number of visible derived columns
  • Dynamically calculated values prevent a user from being able to sort/filter by a derived column

Saving derived column data to the database would increase performance and allow for more advanced filtering and sorting.

A potential solution:

  • save the derived column values to a property state inside a new attribute dictionary called derived_data. This would be similar to 'extra_data' on the property_state
  • after a new derived column is saved, run a method to add derived_data to every property_state attached to a property_view
  • after a property is updated, or new data is uploaded run a method to update or refresh derived data
  • add a dropdown action on the inventory list for "refresh derived data"
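A minimal sketch of the proposal above, not SEED's actual models. `PropertyState` and `DerivedColumn` here are hypothetical plain-Python stand-ins, `formula` is an assumed callable playing the role of `derived_column.evaluate`, and `refresh_derived_data` is an illustrative name for the "refresh derived data" method.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class PropertyState:
    # Hypothetical stand-in for a property state; only the two dicts matter here.
    extra_data: Dict[str, Any] = field(default_factory=dict)
    derived_data: Dict[str, Any] = field(default_factory=dict)  # proposed new attribute


@dataclass
class DerivedColumn:
    # Hypothetical stand-in: `formula` is an assumed callable, not a SEED field.
    name: str
    formula: Callable[[PropertyState], Optional[float]]

    def evaluate(self, state: PropertyState) -> Optional[float]:
        try:
            return self.formula(state)
        except (KeyError, TypeError, ZeroDivisionError):
            return None  # missing or unusable source data yields no value


def refresh_derived_data(states: List[PropertyState], columns: List[DerivedColumn]) -> None:
    """Recompute every derived column once and store the results on each state."""
    for state in states:
        state.derived_data = {c.name: c.evaluate(state) for c in columns}


# Example: one derived column over two states, one of which lacks a source value.
eui = DerivedColumn("eui", lambda s: s.extra_data["energy"] / s.extra_data["area"])
states = [
    PropertyState(extra_data={"energy": 1000.0, "area": 50.0}),
    PropertyState(extra_data={"energy": 300.0}),  # no "area", so no derived value
]
refresh_derived_data(states, [eui])
```

With values stored this way, the inventory list would read `derived_data` instead of calling `evaluate` per column on every page load.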

perryr16 avatar Aug 04 '22 20:08 perryr16

This issue has been automatically marked as stale because it has not had recent activity within 60 days. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Oct 04 '22 02:10 github-actions[bot]

This will be addressed through #3430

isalanglois avatar Oct 18 '22 17:10 isalanglois

https://github.com/SEED-platform/seed/issues/3430 is closed, but the issue remains. Reopening this one.

haneslinger avatar Dec 07 '22 22:12 haneslinger

A potential solution:

  • save the derived column values to a property state inside a new attribute dictionary called derived_data. This would be similar to 'extra_data' on the property_state
  • after a new derived column is saved, run a method to add derived_data to every property_state attached to a property_view
  • after a property is updated, or new data is uploaded run a method to update or refresh derived data
  • add a dropdown action on the inventory list for "refresh derived data"

I really like this idea. The fact it mirrors extra data so closely really does us a favor. I'm going to start documenting where logic needs to be added. To be updated.

Add column to derived_data:

  • On derived column creation.
  • On property state creation.

Update column in derived_data:

  • When source_columns updates.

Delete column in derived_data:

  • When derived_column is deleted.

Getting properties derived_data:

Sorting on derived_data:
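A rough sketch of the add/update/delete hooks listed above, plus sorting on the stored values. Everything here is an assumption for illustration: states are plain dicts with `extra_data` and `derived_data` keys, and the function names are hypothetical, not existing SEED code.

```python
from typing import Any, Callable, Dict, List

# Hypothetical shape: {"extra_data": {...}, "derived_data": {...}}
State = Dict[str, Dict[str, Any]]


def add_column(states: List[State], name: str, fn: Callable[[Dict[str, Any]], Any]) -> None:
    """Derived column created: compute and store its value on every state."""
    for s in states:
        s["derived_data"][name] = fn(s["extra_data"])


def update_column(states: List[State], name: str, fn: Callable[[Dict[str, Any]], Any]) -> None:
    """Source columns changed: recompute and overwrite the stored value."""
    add_column(states, name, fn)


def delete_column(states: List[State], name: str) -> None:
    """Derived column deleted: drop its key from every state."""
    for s in states:
        s["derived_data"].pop(name, None)


def sort_by_derived(states: List[State], name: str, reverse: bool = False) -> List[State]:
    """Sort on a stored value; states without a value always go last."""
    have = [s for s in states if s["derived_data"].get(name) is not None]
    missing = [s for s in states if s["derived_data"].get(name) is None]
    return sorted(have, key=lambda s: s["derived_data"][name], reverse=reverse) + missing


states = [
    {"extra_data": {"a": 3}, "derived_data": {}},
    {"extra_data": {"a": 1}, "derived_data": {}},
    {"extra_data": {"a": 2}, "derived_data": {}},
]
add_column(states, "double_a", lambda d: d["a"] * 2)
ordered = [s["derived_data"]["double_a"] for s in sort_by_derived(states, "double_a")]
delete_column(states, "double_a")
```

Because sorting reads a stored value rather than evaluating on the fly, the same lookup could back filtering as well.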

haneslinger avatar Dec 19 '22 19:12 haneslinger

Some notes on use case:

  • new data imported daily
  • dozens of derived columns
  • Derived columns that reference other derived columns will NOT be allowed
  • Investigate when to perform the recalculation (at import time / data change time, or manually in the background)
  • If recalculation is not performed in real time, how do we communicate that to the user in the UI? (what visual indicator is there to signal that the derived columns are old values?)
  • There could be a 'recalculation' endpoint that could be hit programmatically outside of SEED after imports
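One possible way to answer the "old values" question is to compare timestamps, sketched here under assumptions. `CachedState`, `data_updated_at`, and `derived_refreshed_at` are hypothetical names; nothing in this thread says SEED tracks these fields.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class CachedState:
    # Hypothetical fields, for illustration only.
    data_updated_at: datetime                        # last import or edit of source data
    derived_refreshed_at: Optional[datetime] = None  # last derived-data recalculation


def derived_data_is_stale(state: CachedState) -> bool:
    """True when derived values were never computed or predate the latest data change."""
    if state.derived_refreshed_at is None:
        return True
    return state.derived_refreshed_at < state.data_updated_at


imported = datetime(2024, 3, 21, 8, 0, tzinfo=timezone.utc)
refreshed = datetime(2024, 3, 21, 9, 0, tzinfo=timezone.utc)

fresh = CachedState(data_updated_at=imported, derived_refreshed_at=refreshed)
stale = CachedState(data_updated_at=refreshed, derived_refreshed_at=imported)
never = CachedState(data_updated_at=imported)
```

A flag like this could drive a visual indicator in the UI and tell a recalculation endpoint which states still need work.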

@haneslinger will do some benchmarking on a large db to see how 'expensive' caching the fields and recalculating them will be.

kflemin avatar Mar 21 '24 20:03 kflemin

Right-o, I've done some benchmarking:

  • clean org
  • 3 derived columns
  • uploading portfolio-manager-sample.csv (500+ properties)

Step              develop    cached derived columns
upload file       3s         103s
map               8s         ~2min
finish mapping    53s        ~2min
load inventory    ~2s        ~2s

haneslinger avatar Mar 25 '24 18:03 haneslinger

Let's await the import performance updates and revisit this issue!

kflemin avatar May 31 '24 17:05 kflemin