Save Derived Column values to PropertyState

Background

Derived Columns are loosely calculated as: derived_column_value = derived_column.evaluate(property_state)
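
As a rough illustration of the relationship above, here is a minimal sketch in plain Python. The `DerivedColumn` class, its `source_columns` and `formula` fields, and the column names `site_energy_kbtu` and `gross_floor_area` are hypothetical stand-ins, not SEED's actual model; the property state is represented as a plain mapping.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional, Tuple


@dataclass
class DerivedColumn:
    """Hypothetical stand-in for a derived column definition."""
    name: str
    source_columns: Tuple[str, ...]   # columns read from the property state
    formula: Callable[..., float]     # combines the source values

    def evaluate(self, property_state: Mapping[str, object]) -> Optional[float]:
        # Look up each source value; give up if any of them is missing.
        values = [property_state.get(col) for col in self.source_columns]
        if any(v is None for v in values):
            return None
        return self.formula(*values)


# Example: an energy-use-intensity style column (names are assumptions).
eui = DerivedColumn(
    name="eui",
    source_columns=("site_energy_kbtu", "gross_floor_area"),
    formula=lambda energy, area: energy / area,
)
property_state = {"site_energy_kbtu": 50000.0, "gross_floor_area": 1000.0}
derived_column_value = eui.evaluate(property_state)
```

Returning `None` for a missing source value is one possible design choice; it lets the list page render an empty cell instead of failing.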

Issue

Derived Column values are dynamically calculated on each inventory list page load, which causes a few known issues:

They are calculated one column at a time, so page load time increases with the number of visible derived columns.
Dynamically calculated values prevent a user from sorting or filtering by a derived column.

Saving derived column data to the database would increase performance and allow for more advanced filtering and sorting.

A potential solution:

Save the derived column values to a property state inside a new attribute dictionary called derived_data. This would be similar to 'extra_data' on the property_state.
After a new derived column is saved, run a method to add derived_data to every property_state attached to a property_view.
After a property is updated or new data is uploaded, run a method to update or refresh derived data.
Add a dropdown action on the inventory list for "refresh derived data".
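
The proposal above could be sketched roughly as follows. This is plain Python with dicts standing in for property states, not Django model code; `refresh_derived_data`, `safe_ratio`, and the column names are hypothetical. It also shows why stored values make sorting straightforward: the sort key is a simple dictionary lookup.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical: a derived column is a name mapped to a function of a state.
DerivedFn = Callable[[dict], Optional[float]]


def safe_ratio(numerator: str, denominator: str) -> DerivedFn:
    """Build a derived-column function that divides two source columns."""
    def fn(state: dict) -> Optional[float]:
        n, d = state.get(numerator), state.get(denominator)
        if n is None or not d:  # missing value or zero denominator
            return None
        return n / d
    return fn


def refresh_derived_data(states: List[dict], derived_columns: Dict[str, DerivedFn]) -> None:
    """Recompute and store every derived column on every state."""
    for state in states:
        state["derived_data"] = {
            name: fn(state) for name, fn in derived_columns.items()
        }


states = [
    {"id": 1, "site_energy_kbtu": 90000.0, "gross_floor_area": 1000.0, "extra_data": {}},
    {"id": 2, "site_energy_kbtu": 20000.0, "gross_floor_area": 500.0, "extra_data": {}},
    {"id": 3, "site_energy_kbtu": None, "gross_floor_area": 800.0, "extra_data": {}},
]
derived_columns = {"eui": safe_ratio("site_energy_kbtu", "gross_floor_area")}
refresh_derived_data(states, derived_columns)


def sort_key(state: dict):
    # Sort by the stored value, placing states without a value last.
    value = state["derived_data"]["eui"]
    return (value is None, value if value is not None else 0.0)


sorted_ids = [s["id"] for s in sorted(states, key=sort_key)]
```

In a database-backed version, the same stored dictionary could be sorted and filtered by the database rather than in Python, which is the benefit the proposal is after.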

Aug 04 '22 20:08 perryr16

This issue has been automatically marked as stale because it has not had recent activity within 60 days. It will be closed if no further activity occurs. Thank you for your contributions.

Oct 04 '22 02:10 github-actions[bot]

This will be addressed through #3430

Oct 18 '22 17:10 isalanglois

https://github.com/SEED-platform/seed/issues/3430 is closed but the issue remains. Reopening this one.

Dec 07 '22 22:12 haneslinger

A potential solution:

save the derived column values to a property state inside a new attribute dictionary called derived_data. This would be similar to 'extra_data' on the property_state

after a new derived column is saved, run a method to add derived_data to every property_state attached to a property_view

after a property is updated, or new data is uploaded run a method to update or refresh derived data

add a dropdown action on the inventory list for "refresh derived data"

I really like this idea. The fact it mirrors extra data so closely really does us a favor. I'm going to start documenting where logic needs to be added. To be updated.

Add column to derived_data:

On derived column creation.
On property state creation.

Update column in derived_data:

when source_columns updates

Delete column in derived_data:

when derived_column is deleted.

Getting properties derived_data:

Sorting on derived_data:
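
One way to picture the add / update / delete points listed above is a small in-memory sketch. Everything here is hypothetical (the `DerivedDataStore` class, the hook names, the column names), and it reads "source_columns updates" as a change to a value in one of a derived column's source columns, which is an interpretation rather than something stated in the thread.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple


class DerivedDataStore:
    """Hypothetical in-memory model of where derived_data gets touched."""

    def __init__(self) -> None:
        self.states: List[dict] = []
        # name -> (source column names, formula)
        self.columns: Dict[str, Tuple[Tuple[str, ...], Callable[..., float]]] = {}

    def _compute(self, name: str, state: dict) -> Optional[float]:
        sources, formula = self.columns[name]
        values = [state.get(s) for s in sources]
        if any(v is None for v in values):
            return None
        return formula(*values)

    # Add column to derived_data: on derived column creation.
    def on_derived_column_created(self, name: str, sources: Sequence[str],
                                  formula: Callable[..., float]) -> None:
        self.columns[name] = (tuple(sources), formula)
        for state in self.states:
            state["derived_data"][name] = self._compute(name, state)

    # Add column to derived_data: on property state creation.
    def on_property_state_created(self, state: dict) -> None:
        state["derived_data"] = {n: self._compute(n, state) for n in self.columns}
        self.states.append(state)

    # Update column in derived_data: when a source column value changes.
    def on_source_value_updated(self, state: dict, column: str, value) -> None:
        state[column] = value
        for name, (sources, _formula) in self.columns.items():
            if column in sources:  # only recompute columns that depend on it
                state["derived_data"][name] = self._compute(name, state)

    # Delete column in derived_data: when the derived column is deleted.
    def on_derived_column_deleted(self, name: str) -> None:
        del self.columns[name]
        for state in self.states:
            state["derived_data"].pop(name, None)


store = DerivedDataStore()
state = {"site_energy_kbtu": 50000.0, "gross_floor_area": 1000.0}
store.on_property_state_created(state)
store.on_derived_column_created(
    "eui", ["site_energy_kbtu", "gross_floor_area"], lambda e, a: e / a
)
after_create = dict(state["derived_data"])
store.on_source_value_updated(state, "site_energy_kbtu", 60000.0)
after_update = dict(state["derived_data"])
store.on_derived_column_deleted("eui")
after_delete = dict(state["derived_data"])
```

Recomputing only the derived columns whose sources include the changed column keeps the update path cheaper than a full refresh.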

Dec 19 '22 19:12 haneslinger

Some notes on use case:

new data imported daily
dozens of derived columns
Derived columns that reference other derived columns will NOT be allowed
Investigate when to perform the recalculation (at import time / data change time, or manually in the background)
If recalculation is not performed in real time, how do we communicate that to the user in the UI? (what visual indicator is there to signal that the derived columns are old values?)
There could be a 'recalculation' endpoint that could be hit programmatically outside of SEED after imports
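
If recalculation is deferred, one possible approach to the staleness question above is a per-state flag that imports set and a recalculation routine clears. This is only a sketch: the `derived_data_stale` key, `mark_stale`, and `recalculate_stale` are hypothetical names, and the function body is what a 'recalculation' endpoint might call, not an existing SEED API.

```python
from typing import Callable, Dict, List, Optional

DerivedFn = Callable[[dict], Optional[float]]


def mark_stale(state: dict) -> None:
    """Called after an import or edit changes the state's source values."""
    state["derived_data_stale"] = True


def recalculate_stale(states: List[dict], derived_columns: Dict[str, DerivedFn]) -> int:
    """Recompute derived_data only for stale states; return how many were refreshed."""
    refreshed = 0
    for state in states:
        # A state with no flag at all is treated as stale.
        if not state.get("derived_data_stale", True):
            continue
        state["derived_data"] = {
            name: fn(state) for name, fn in derived_columns.items()
        }
        state["derived_data_stale"] = False
        refreshed += 1
    return refreshed


derived_columns = {"total": lambda s: s["a"] + s["b"]}
states = [
    {"id": 1, "a": 2.0, "b": 4.0, "derived_data": {"total": 6.0}, "derived_data_stale": False},
    {"id": 2, "a": 1.0, "b": 1.0, "derived_data": {"total": 2.0}, "derived_data_stale": False},
]

# Simulate a daily import changing state 2, then a programmatic recalculation.
states[1]["a"] = 10.0
mark_stale(states[1])
stale_ids_before = [s["id"] for s in states if s["derived_data_stale"]]
refreshed_count = recalculate_stale(states, derived_columns)
```

The same flag could drive a visual indicator in the inventory list, for example by marking rows whose derived values predate the latest import.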

@haneslinger will do some benchmarking on a large db to see how 'expensive' caching the fields and recalculating them will be.

Mar 21 '24 20:03 kflemin

Right-o, I've done some benchmarking.

clean org
3 derived columns
uploading portfolio-manager-sample.csv (500+ properties)

Step            develop   cached derived columns
upload file     3s        103s
map             8s        ~2min
finish mapping  53s       ~2min
load inventory  ~2s       ~2s

Mar 25 '24 18:03 haneslinger

let's await the import performance updates and revisit this issue!

May 31 '24 17:05 kflemin
