run_summary
In the same way that mth5 has a channel_summary method that returns a dataframe with info about each channel, it would be nice to have a lighter-weight version that returns only one row per run.
This can be achieved by running a groupby on the channel_summary: grouper = df.groupby(["station", "run"])
A method that already does this is in aurora/aurora/tf_kernel/dataset.py on the issue31 branch, which will soon be the dev branch. The method is called channel_summary_to_dataset_definition.
If run_summary is not appreciably faster than channel_summary, then it would probably be best to make run_summary depend explicitly on channel_summary as in my example.
grouper = df.groupby(["station", "run"])
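To make the groupby idea concrete, here is a minimal sketch of a run summary derived from a channel summary. The toy dataframe, its column names (component, start, end, sample_rate), and the function name run_summary_from_channel_summary are assumptions for illustration only; they are not taken from mth5 or from channel_summary_to_dataset_definition in aurora.

```python
import pandas as pd

# Toy stand-in for the dataframe returned by channel_summary.
# Column names here are assumed for illustration.
channel_summary = pd.DataFrame(
    {
        "station": ["mt01", "mt01", "mt01", "mt01", "mt02", "mt02"],
        "run": ["001", "001", "002", "002", "001", "001"],
        "component": ["ex", "hx", "ex", "hx", "ex", "hy"],
        "start": pd.to_datetime(
            [
                "2020-01-01T00:00:00",
                "2020-01-01T00:00:00",
                "2020-01-01T02:00:00",
                "2020-01-01T02:00:00",
                "2020-01-01T00:00:00",
                "2020-01-01T00:00:00",
            ]
        ),
        "end": pd.to_datetime(
            [
                "2020-01-01T01:00:00",
                "2020-01-01T01:00:00",
                "2020-01-01T03:00:00",
                "2020-01-01T03:00:00",
                "2020-01-01T01:00:00",
                "2020-01-01T01:00:00",
            ]
        ),
        "sample_rate": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    }
)


def run_summary_from_channel_summary(df):
    """Collapse a per-channel dataframe to one row per (station, run)."""
    grouper = df.groupby(["station", "run"])
    summary = grouper.agg(
        start=("start", "min"),  # earliest channel start in the run
        end=("end", "max"),  # latest channel end in the run
        sample_rate=("sample_rate", "first"),  # assumes one rate per run
        channels=("component", list),  # components present in the run
    )
    return summary.reset_index()


run_summary = run_summary_from_channel_summary(channel_summary)
print(run_summary)
```

Because this version depends explicitly on channel_summary, it is only as fast as channel_summary itself; the groupby step adds very little on top.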
@kujaku11 Let's put this one on the back burner until after we have merged our branches into dev.
@kkappler I think once you have the format for the Dataset Definition, we can pretty easily create it from the channel_summary using pandas groupby.
This is implemented on the features branch, and this issue can be closed once the initial features branch merges to master.