django-auditlog
Need to be able to access serialized data from changed object
We need to be able to access a serialized dictionary of an object's field values following a change. I'd like to add a serialized_data JSON field to the LogEntry model. That field will be populated by the receivers, which will use django.core.serializers to serialize the new state of the object after a change.
Having a historical record of object states will help us recover from several types of potential data loss events and aid us in debugging system artifacts. It is also critical for determining calculated values in post hoc analysis and reporting. For example, if we need to determine who had x insurance and lived in y state at z time, we can use the audit log to determine what the person had listed as their state of residence at time z without needing a separate one to many historical model for state of residence.
There are a lot of helpful public-facing APIs that could be added to the project using this field as a foundation (e.g., get_field_value_at_time(field_name, timestamp)), but the first step is to create the serialized_data field itself. Also note that I am aware there are other packages out there, such as django-reversion, that do this. However, in our experience they don't fit our needs as a first-class audit log.
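To make the get_field_value_at_time idea concrete, here is a minimal pure-Python sketch, not the PR's implementation. It assumes each log entry can be reduced to a (timestamp, serialized_data) pair for a single object, sorted by timestamp; the `entries` list, the dict layout, and the sample values are all hypothetical stand-ins for LogEntry rows.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical stand-in for one object's LogEntry rows, oldest first.
# Each dict plays the role of the proposed serialized_data field.
entries = [
    (datetime(2021, 1, 1), {"name": "Pat", "state": "CO", "insurance": "x"}),
    (datetime(2021, 6, 1), {"name": "Pat", "state": "NM", "insurance": "x"}),
    (datetime(2022, 3, 1), {"name": "Pat", "state": "NM", "insurance": "y"}),
]


def get_field_value_at_time(entries, field_name, timestamp):
    """Return the field's value as of `timestamp`, or None if nothing was logged yet."""
    timestamps = [ts for ts, _ in entries]
    # Index of the first entry strictly after `timestamp`.
    idx = bisect_right(timestamps, timestamp)
    if idx == 0:
        return None  # No entry at or before the requested time.
    # The most recent entry at or before `timestamp` holds the state we want.
    return entries[idx - 1][1].get(field_name)


state_in_march_2021 = get_field_value_at_time(entries, "state", datetime(2021, 3, 15))
```

In a real project the lookup would presumably be a queryset filter on the entry timestamp rather than an in-memory bisect, but the logic is the same: take the latest recorded state at or before time z and read the field from it.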
I am adding this functionality to our fork but would love it if you would consider adding it to the project. Please let me know if there is anything I can do to make that more likely. Look for my PR later today!
cc @hramezani
And to pile on, this is not an attempt to "restore" the data for a given point in time. It is only to capture the complete state instead of only a text representation of a given object as currently represented by the obj_repr field.
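As a rough illustration of that gap, the sketch below uses a plain dataclass rather than a Django model. The `Person` class and its fields are hypothetical; the point is only that a string representation can stay identical while the underlying state changes.

```python
from dataclasses import dataclass, asdict


@dataclass
class Person:
    # Hypothetical fields, chosen to match the insurance/state example.
    name: str
    state: str
    insurance: str

    def __str__(self):
        # Like many model __str__ methods, this shows only one field.
        return self.name


before = Person("Pat", "CO", "x")
after = Person("Pat", "NM", "x")

# The text representation cannot tell the two states apart...
same_repr = str(before) == str(after)
# ...while a full field dictionary (the role serialized_data would play) can.
state_changed = asdict(before) != asdict(after)
```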
@sum-rock Thanks for creating this feature request.
I think I understand the overall idea of the feature: adding a serialized_data JSON field to the LogEntry model and saving the object in JSON format into that field after each change.
You then want to use the new field to look up the value of a given field.
For example, if we need to determine who had x insurance and lived in y state at z time, we can use the audit log to determine what the person had listed as their state of residence at time z without needing a separate one to many historical model for state of residence.
Could you please provide a small example here? I mean a model, the data that would go into the new serialized_data field, and an explanation of how you would do the query.
I am not sure about the feature. I would like to hear the opinions of @Linkid and @alieh-rymasheuski here.
Note: We can have the feature disabled by default and let users enable it in settings.
@hramezani I figured the best way to show you my thoughts was in the PR. Take a look at what I've submitted and let me know what you all think. I'm happy to make changes if it has a better chance of getting merged!
@hramezani I am curious about the hesitancy.
The thought here is that most likely the __str__ function of a given model does not fully represent the object's state. Or am I mistaken that you generally expect the entire model state to be presented in the string representation?
The idea here is that this is a "wall at our back" to ensure that "everything" is audited, not just how something is represented.
@rposborne
@hramezani I am curious about the hesitancy.
I am thinking about maintenance. We also have to consider how useful the new feature is and how many people would really use it, because in the end you want to spend time on a feature that is used by most users, not just a small part of them.
BTW, if we make the feature disabled by default, I think we are good.
The thought here is that most likely the __str__ function of a given model does not fully represent the object's state. Or am I mistaken that you generally expect the entire model state to be presented in the string representation?
I wasn't part of this project from the beginning and can't say what the reason was for the string repr of the object.
From an initial look at the PR, having a field for storing a JSON representation of the objects seems like a good idea. I still have to think about the get_field_value_at_timestamp use case that is included in the PR.
Trust me, I understand the burden of ongoing maintenance and its impact in the OSS world, and I thank you all deeply for the work that you do on this and all the other projects you may work on. ❤️
Though, I do think that the saving of the serialized value can be considered low effort as long as Django itself is managing the serialization.
We are also happy to keep the surface area of the API smaller by removing things such as get_field_value_at_timestamp, if it removes some maintenance burden on you all.
Closed in https://github.com/jazzband/django-auditlog/commit/777bd537e7d0a41a0f5baa562f2a3cbe20080bae