fermentrack
Log file trimming
As discussed on HBT, something like this could be used to trim CSV log files using pandas.
import pytz, pandas

utc_tz = pytz.timezone("UTC")

def trim_csv(pathtocsv, trim_start, trim_end):
    # removes every row logged strictly between trim_start and trim_end
    # warning: this overwrites the log file in place
    logfile = pandas.read_csv(pathtocsv)
    trim_start_tz = trim_start.astimezone(utc_tz)
    trim_end_tz = trim_end.astimezone(utc_tz)
    # log_time is parsed as UTC, using the format the log is written in
    log_times = pandas.to_datetime(logfile.log_time, format='%Y/%m/%d %H:%M:%SZ', utc=True)
    logfile = logfile[~((log_times > trim_start_tz) & (log_times < trim_end_tz))]
    logfile.to_csv(pathtocsv, index=False)
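For trying the idea out without touching a real log, here is a minimal sketch of the same filter that works on a DataFrame and returns a trimmed copy. The name `trim_frame` and the sample rows are made up for illustration, and it assumes `log_time` uses the same format as above.

```python
import datetime
import io

import pandas


def trim_frame(logfile, trim_start, trim_end):
    """Return a copy of logfile without rows strictly between trim_start and trim_end."""
    # Assumption: log_time is UTC text in the same format trim_csv parses.
    log_times = pandas.to_datetime(
        logfile["log_time"], format="%Y/%m/%d %H:%M:%SZ", utc=True
    )
    inside = (log_times > trim_start) & (log_times < trim_end)
    return logfile[~inside].copy()


# Made-up sample rows, only to show the call.
sample = pandas.read_csv(io.StringIO(
    "log_time,beer_temp\n"
    "2018/07/16 10:00:00Z,68.0\n"
    "2018/07/16 11:00:00Z,68.2\n"
    "2018/07/16 12:00:00Z,68.1\n"
    "2018/07/16 13:00:00Z,68.3\n"
))

utc = datetime.timezone.utc
trimmed = trim_frame(
    sample,
    datetime.datetime(2018, 7, 16, 10, 30, tzinfo=utc),
    datetime.datetime(2018, 7, 16, 12, 30, tzinfo=utc),
)
print(trimmed)
```

The original frame is left alone, so the result can be checked before anything is written back to disk.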
That's awesome! Would be helpful for trimming off the beginning/end of logs in case things need to stabilize (or you forgot to shut off logging when unhooking everything).
Any thoughts on automatic identification of outliers? Not sure if it's a realistic thought, or a pipe dream...
It’s possible. There are a handful of ways that could be done, I think. In this application, though, I would think outliers are something to highlight rather than remove. They could indicate a stuck relay, a taped temp sensor falling off, a chamber door coming open, a wiring/sensor connectivity issue, or other things where manual intervention is required. Real-time alerts while this is happening would IMO be the most useful thing you could do in relation to outliers.
If it is something that is happening so regularly that trimming out sections of the log is inconvenient - i.e. very “peaky” temp readings - the solution would be to change the appropriate filter delay.
Any thoughts?
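To make "highlight rather than remove" concrete, a rough sketch: add a boolean column instead of dropping rows. The `beer_temp` and `beer_set` column names are the ones used in the snippet further down this thread; `flag_outliers`, the `outlier` column, the threshold, and the sample readings are hypothetical.

```python
import io

import pandas


def flag_outliers(logfile, threshold):
    """Return a copy of logfile with a boolean 'outlier' column added."""
    flagged = logfile.copy()
    deviation = (flagged["beer_temp"] - flagged["beer_set"]).abs()
    # True where the beer is at least `threshold` degrees from its setpoint.
    # Rows with a missing temperature or setpoint are left unflagged.
    flagged["outlier"] = deviation >= abs(threshold)
    return flagged


# Made-up readings, only to show the call.
sample = pandas.read_csv(io.StringIO(
    "beer_temp,beer_set\n"
    "68.1,68.0\n"
    "68.0,68.0\n"
    "74.5,68.0\n"
    "68.2,68.0\n"
))
flagged = flag_outliers(sample, threshold=3)
print(flagged[flagged["outlier"]])
```

Nothing is deleted, so the flagged rows are still there to inspect or to drive an alert.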
I think it depends on how you classify an outlier —
In my head, an outlier is an erroneous data point — where a sensor has a hiccup and reports a single 90 deg reading in the middle of 70 deg readings, or where a fermenter gets moved and for a minute or two the gravity readings bounce all over the place. In those cases the data to me is meaningless as it’s not a real indication of what’s happening in the fermentation.
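One simple way to spot that kind of single-reading hiccup is to compare each reading with the median of its neighbours. This is only a sketch: `find_spikes`, the window size, and the `max_jump` value are assumptions picked for the example, not tuned numbers.

```python
import pandas


def find_spikes(readings, window=5, max_jump=5.0):
    """Return a boolean Series marking readings far from their local median."""
    # Centered rolling median; min_periods=1 so the ends of the series still get a value.
    local_median = readings.rolling(window, center=True, min_periods=1).median()
    return (readings - local_median).abs() > max_jump


# One 90 deg reading in the middle of ~70 deg readings.
temps = pandas.Series([70.1, 70.0, 69.9, 90.0, 70.2, 70.1, 70.0])
spikes = find_spikes(temps)
print(temps[spikes])
```

A median is used rather than a mean so that the spike itself barely shifts the value it is compared against.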
I think what you’re describing is something else - and far more useful from a daily use perspective. To the extent that we can detect/notify the user about thermal runaway (or an open fermenter door, low ice in a “son of fermentation chamber”, etc) I can see that being incredibly useful.
That said, I can see runaway detection being a difficult feature to implement.
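As a starting point for discussion only, a crude heuristic for the runaway case is to look for a sustained gap between beer temperature and setpoint, rather than a single bad reading. The function name, threshold, and run length below are hypothetical, and real detection would likely need more than this.

```python
import pandas


def sustained_deviation(beer_temp, beer_set, threshold, min_readings):
    """Mark readings that end a run of at least min_readings consecutive
    readings more than `threshold` away from the setpoint."""
    off_target = (beer_temp - beer_set).abs() > abs(threshold)
    # The rolling sum equals min_readings only when every reading in the window is off target.
    run_length = off_target.astype(int).rolling(min_readings).sum()
    return run_length >= min_readings


# Made-up readings: one isolated blip, then the temperature drifts away and stays away.
beer_temp = pandas.Series([68.0, 72.5, 68.1, 72.0, 72.4, 72.9, 73.5])
beer_set = pandas.Series([68.0] * 7)
alerts = sustained_deviation(beer_temp, beer_set, threshold=3, min_readings=3)
print(alerts)
```

The isolated blip at the second reading does not trigger anything; only the run at the end does.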
OK. Without getting into modelling, I would propose something simple like the following (and happy to talk about temperature and gravity modelling if you want to go there 🙂)
import pandas

def remove_erroneous_temps(pathtocsv, threshold):
    # warning: this irreversibly alters your temperature log file
    logfile = pandas.read_csv(pathtocsv)
    # keep only rows where beer_temp is within threshold of beer_set
    # (rows where either value is missing are dropped as well)
    deviation = (logfile["beer_temp"] - logfile["beer_set"]).abs()
    logfile = logfile[deviation < abs(threshold)]
    logfile.to_csv(pathtocsv, index=False)
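Given the warning about irreversibility, one option is to copy the log aside before filtering it. A minimal sketch, assuming a `.bak` file next to the log is acceptable; `filter_with_backup` and the demo data are made up.

```python
import os
import shutil
import tempfile

import pandas


def filter_with_backup(pathtocsv, threshold):
    """Copy the log to '<pathtocsv>.bak', then filter the original in place."""
    backup_path = pathtocsv + ".bak"
    shutil.copy2(pathtocsv, backup_path)
    logfile = pandas.read_csv(pathtocsv)
    deviation = (logfile["beer_temp"] - logfile["beer_set"]).abs()
    logfile[deviation < abs(threshold)].to_csv(pathtocsv, index=False)
    return backup_path


# Demo on a throwaway file with made-up readings.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "log.csv")
    with open(path, "w") as f:
        f.write("beer_temp,beer_set\n68.1,68.0\n90.0,68.0\n68.2,68.0\n")
    backup = filter_with_backup(path, threshold=5)
    kept_rows = len(pandas.read_csv(path))
    backup_rows = len(pandas.read_csv(backup))
print(kept_rows, backup_rows)
```

If the threshold turns out to be too aggressive, the untouched copy can simply be moved back.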