obsplus
obsplus copied to clipboard
Ambiguous station_count column in events_to_df
- obsplus version: master
- Python version: 3.7.4
- Operating System: Ubuntu 18.04
Description
One of the columns in the output of events_to_df
is simply called station_count
, which is a deviation from the naming convention for the OriginQuality
object in obspy, which has an associated_station_count
(number of stations at which the event was observed) and an used_station_count
(number of stations from which data was used for origin computation). This is mildly confusing because from the label it is ambiguous as to which of these items the value refers to, or if it refers to something else entirely.
Proposed Change
Looking at the source code, it appears that station_count
is intended to be the equivalent of used_station_count
. I recommend changing the column name to reflect this. This might break things, but it should be relatively easy to update.
Sounds like a good idea to me. I wouldn't worry about breaking changes to much as this is a relatively obscure column. Just make sure to add something in the changelog.
Sounds good. I guess another thing, is this going to force everyone's EventBanks to re-index? That's where it might be a slightly bigger annoyance.
On Sat, Jul 25, 2020 at 9:52 PM Derrick [email protected] wrote:
Sounds like a good idea to me. I wouldn't worry about breaking changes to much as this is a relatively obscure column. Just make sure to add something in the changelog.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/niosh-mining/obsplus/issues/195#issuecomment-663935635, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADKRQJS74MYTHQH4TFY4B63R5OY7XANCNFSM4PHZMTJQ .
Ya, good point. It looks like 'station_countis used in the event bank index. At one point I had thought about restricting the index to only include columns necessary to perform the queries allowed by
get_events` but other programs (like our processing software) currently need some of the other columns.
Maybe we just leave this issue open until we have to make a more significant breaking change to EventBank
then tackle it then?
Yeah, I think that leaving it open would be fine. It's pretty easy to just work around it for what I'm doing right now.
As far as restricting the columns, how much of a pain would it be to
specify which columns from events_to_df
to include in the index? I know
that I personally use the extra columns a lot to look at location quality
information without having to go and convert the entire bank to a df, but I
can also see where not everyone would want or need that.
On Sat, Jul 25, 2020 at 10:16 PM Derrick [email protected] wrote:
Ya, good point. It looks like 'station_countis used in the event bank index. At one point I had thought about restricting the index to only include columns necessary to perform the queries allowed byget_events` but other programs (like our processing software) currently need some of the other columns.
Maybe we just leave this issue open until we have to make a more significant breaking change to EventBank then tackle it then?
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/niosh-mining/obsplus/issues/195#issuecomment-663937027, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADKRQJRHPYSTPXWZER6Y6JTR5O32BANCNFSM4PHZMTJQ .