iRODSDataObject(...).modify_time is arbitrarily chosen
The various DataObject model attributes are chosen based on the first row in the query result-set, which is ordered by replica number. This affects the data object's "overall" timestamp info among other attributes:
import irods.helpers
session = irods.helpers.make_session()
data_obj = session.data_objects.get(path_to_data_object)  # path_to_data_object: logical path of an existing data object
print(repr(data_obj.modify_time))  # -> not necessarily the most recent replica's timestamp
max(r.modify_time for r in data_obj.replicas)  # -> the most recent replica's timestamp
The printed datetime object will not necessarily reflect the most recent replica's modification timestamp.
For that we'd have to put in a hook to sort the result set before transferring the attributes to the main object. (I think that is the most efficient and most backward-compatible option, since the PRC mostly relays cached attributes anyway and relies on the user to run fresh queries to re-poll the object for changes.)
This will also affect the access_time attribute for iRODS 5 when that is ultimately added.
So... we have talked about the canonical answer for a data object that actually is a per-replica bit of information... we've agreed that we should use the "latest, good replica" to report as the object information.
So, status=1 and the latest modify time...
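A minimal sketch of the "latest, good replica" rule, for discussion only. The `Replica` dataclass is a hypothetical stand-in rather than the PRC's own replica class, and it assumes the status is comparable to the string "1".

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional

GOOD_STATUS = "1"  # assumption: a good replica reports status "1"


@dataclass
class Replica:
    """Hypothetical stand-in for a replica record (not the PRC class)."""
    number: int
    status: str
    modify_time: datetime


def latest_good_replica(replicas: Iterable[Replica]) -> Optional[Replica]:
    """Return the good replica with the newest modify_time, or None if no replica is good."""
    good = [r for r in replicas if str(r.status) == GOOD_STATUS]
    return max(good, key=lambda r: r.modify_time, default=None)
```

With this rule, a stale replica never wins even if its timestamp is the newest one.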
That should work
However, access time doesn't necessarily correlate with the status flag. So perhaps access_time, along with some other replica attributes, should not appear in the data object itself.
hmm, default access_time should probably also be 'latest, good replica'.... is there a downside to that?
I need to check into it. If access time is independent of modify time, then the higher-level abstraction (the data object) can't always accurately reflect the latest of both. One replica may have been more recently read, another one more recently written or opened for write, and both could still have good status, if my understanding is accurate. In which case these timestamps could (or should) be properties.
Or maybe I'm wrong? Is opening a replica for write enough to change the modify timestamp?
I think they are indeed independent in the way you say. Opening for write does not mean it wrote anything... I think we (should) key on data_modified somewhere...
I believe the mtime is only updated when data is modified. The atime is only considered for updating if the replica is opened for reading.
I believe there's an argument for not doing anything here. As long as the replicas are available, developers can get the information they want. It's possible they may not want the mtime/atime for the latest good replica.
A better approach would be to document the current behavior and consider whether providing one or two free functions is good enough.
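As a rough illustration of the free-function idea, the sketch below computes each timestamp independently from the replica list. The function names are hypothetical, not existing PRC API, and `access_time` is assumed to be available on each replica. The sample replicas are made-up stand-ins built with `SimpleNamespace`.

```python
from datetime import datetime
from types import SimpleNamespace


def latest_modify_time(replicas):
    """Newest modify_time across the given replicas (hypothetical helper)."""
    return max(r.modify_time for r in replicas)


def latest_access_time(replicas):
    """Newest access_time across the given replicas (hypothetical helper)."""
    return max(r.access_time for r in replicas)


# Stand-in replicas: replica 0 was read most recently, replica 1 was written most recently.
replicas = [
    SimpleNamespace(number=0, modify_time=datetime(2024, 1, 1), access_time=datetime(2024, 3, 1)),
    SimpleNamespace(number=1, modify_time=datetime(2024, 2, 1), access_time=datetime(2024, 1, 15)),
]

newest_write = latest_modify_time(replicas)
newest_read = latest_access_time(replicas)
```

Here the two answers come from different replicas, which is why a single "first row" cannot represent both.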
I think I'm fine with that. The only problem - and it's admittedly small but could be perplexing for beginners and annoying for practiced library users - is that the existence of these replica attributes on the main iRODSDataObject could be misleading. You have an access_time now advertised to you in the top level object, and like the modify_time it's a datetime object. But then you find out it's not necessarily accurate...
It's been an issue for a long time, but we have less of an excuse now that we're adding access_time as an analog of the existing modify_time attribute.
One way to deal with that is to deprecate those members on iRODSDataObject. That's just an idea.
Alternatively, we do as you said and make modify_time and access_time properties with additional logic which handles sorting, etc on demand.
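A sketch of what the property-based alternative could look like, under stated assumptions: `DataObjectView` is a hypothetical wrapper, not iRODSDataObject itself, and it assumes each replica exposes `status`, `modify_time`, and `access_time`, with status "1" meaning good.

```python
from datetime import datetime
from types import SimpleNamespace


class DataObjectView:
    """Hypothetical wrapper that derives timestamps from replicas on demand."""

    def __init__(self, replicas):
        self.replicas = list(replicas)

    def _good_replicas(self):
        # Assumption: status "1" marks a good replica.
        return [r for r in self.replicas if str(r.status) == "1"]

    @property
    def modify_time(self):
        """Newest modify_time among good replicas, or None if there are none."""
        return max((r.modify_time for r in self._good_replicas()), default=None)

    @property
    def access_time(self):
        """Newest access_time among good replicas, or None if there are none."""
        return max((r.access_time for r in self._good_replicas()), default=None)
```

Callers keep reading `obj.modify_time` as before, so the attribute access stays backward compatible while the value is recomputed from whatever replica data is cached.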
Play with the different schemes and weigh the pros/cons to determine which one aligns with the design of the PRC.
Yes, could be that a mixture of approaches would be good. Some of the replica attributes are ok because they agree over all rows. Some like resc_name can of course be deprecated
Or rather resc_id can be deprecated, at the iRODSDataObject level.
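For reference, one common way to deprecate an attribute in Python is to turn it into a property that warns on access. This is only a sketch: `DataObjectSketch` is a hypothetical class, and the warning text is an assumption.

```python
import warnings


class DataObjectSketch:
    """Hypothetical stand-in showing a deprecated per-replica attribute."""

    def __init__(self, resc_id):
        self._resc_id = resc_id

    @property
    def resc_id(self):
        # Warn the caller, then keep returning the old value for compatibility.
        warnings.warn(
            "resc_id on the data object is deprecated; read it from a replica instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self._resc_id
```

Existing code continues to work, and users who enable deprecation warnings get pointed toward the per-replica value.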