featureforge
stats manager should allow storing intermediate results
In a very long experiment, I would like to be able to submit results incrementally. This is useful if the experiment fails later, or if I want to run queries to see how it is going.
I'll be checking how easy that may be.
Do you have some prototypish hack for it?
I do. I will post it here soon.
Here is the code. It adds an optional parameter finish, and only changes the status to solved if finish=True. Drawback: the entire results must be provided each time.
def store_results(self, booking_ticket, results, finish=True):
    """
    The only way of storing experiment results is by having the "booking ticket" (i.e.,
    the result of a successful booking).
    Returns True if the storage succeeded, and False if not.
    Be aware that if you attempt to store results after the booking time expired,
    it's totally possible that the same experiment was booked for someone else.
    finish -- if True, mark as solved and invalidate the ticket.
    """
    # Only match experiments that are still booked under this ticket.
    query = {u'_id': booking_ticket,
             self.experiment_status: self.STATUS_BOOKED}
    update = {
        '$set': {self.results_key: mongo_dict_key_sanitizer(results)},
    }
    if finish:
        # Changing the status is what invalidates the ticket.
        update['$set'][self.experiment_status] = self.STATUS_SOLVED
    experiment = self.data.find_and_modify(query, update)
    if experiment is None:
        logger.warning(
            "Experiment with booking_ticket %s wasn't stored, because not found on "
            "stats database as waiting-results." % booking_ticket)
        return False
    else:
        logger.info("Stored experiment results for ticket %s" % booking_ticket)
        return True
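For illustration, here is a rough sketch of how a caller might use finish=False to checkpoint a long run. InMemoryStats, book and run_with_checkpoints are hypothetical names made up for this example. The in-memory class only imitates the booked/solved behaviour of store_results so the snippet runs without MongoDB; it is not the real stats manager.

```python
class InMemoryStats:
    """Hypothetical stand-in for the stats manager, for illustration only."""

    def __init__(self):
        self.docs = {}

    def book(self, ticket):
        # Assumption: a booked experiment is identified by its ticket.
        self.docs[ticket] = {"status": "booked", "results": None}

    def store_results(self, booking_ticket, results, finish=True):
        doc = self.docs.get(booking_ticket)
        if doc is None or doc["status"] != "booked":
            return False
        # The whole results field is replaced on every call.
        doc["results"] = dict(results)
        if finish:
            doc["status"] = "solved"
        return True


def run_with_checkpoints(stats, ticket, steps):
    """Run each (name, func) step and submit the accumulated results after it."""
    results = {}
    for i, (name, func) in enumerate(steps):
        results[name] = func()
        is_last = i == len(steps) - 1
        # Only the final submission marks the experiment as solved.
        if not stats.store_results(ticket, results, finish=is_last):
            raise RuntimeError("ticket %s is no longer booked" % ticket)
    return results


stats = InMemoryStats()
stats.book("t1")
final = run_with_checkpoints(
    stats, "t1", [("accuracy", lambda: 0.9), ("f1", lambda: 0.8)])
```

Because every call replaces the whole results field, the caller has to keep the accumulated dictionary around and resend it each time.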
And here is a method, update_results, that allows updating only some sub-fields of the results field.
def update_results(self, booking_ticket, flat_results, finish=True):
    """
    Same as store_results, but do not replace the entire results field.
    Only update the given sub-fields of the results.
    flat_results -- results fields to update. Use dot notation for embedded
                    fields.
    finish -- if True, mark as solved and invalidate the ticket.
    """
    query = {u'_id': booking_ticket,
             self.experiment_status: self.STATUS_BOOKED}
    update = {'$set': {}}
    # Prefix each key so that only that sub-field of the results is touched.
    for k, v in flat_results.items():
        update['$set'][self.results_key + '.' + k] = v
    if finish:
        update['$set'][self.experiment_status] = self.STATUS_SOLVED
    experiment = self.data.find_and_modify(query, update)
    if experiment is None:
        logger.warning(
            "Experiment with booking_ticket %s wasn't stored, because not found on "
            "stats database as waiting-results." % booking_ticket)
        return False
    else:
        logger.info("Stored experiment results for ticket %s" % booking_ticket)
        return True
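Since update_results expects dot-notation keys, a small helper that flattens a nested partial result may be handy on the caller side. This is only a sketch: flatten_results is a hypothetical name, it assumes string keys that contain no dots themselves, and it does not sanitize keys the way store_results does.

```python
def flatten_results(nested, prefix=""):
    """Turn {'a': {'b': 1}} into {'a.b': 1} for use as flat_results."""
    flat = {}
    for key, value in nested.items():
        path = prefix + "." + key if prefix else key
        if isinstance(value, dict) and value:
            # Recurse into non-empty dicts so siblings are left untouched.
            flat.update(flatten_results(value, path))
        else:
            # Leaves (and empty dicts) are set as-is.
            flat[path] = value
    return flat


partial = {"train": {"loss": 0.3, "acc": 0.9}, "epochs": 5}
flat = flatten_results(partial)
```

The result could then be passed as flat_results with finish=False, so that fields stored under other keys by earlier calls are preserved.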