Minnesota Lakefinder

Open hrbrmstr opened this issue 7 years ago • 12 comments

http://www.dnr.state.mn.us/lakefind/index.html

There have been a few SO questions (btw: I don't think that search result is comprehensive but it's indicative) that need to get to the underlying, heavily nested JSON result.

Might be worth a pkg attempt. I'm not smart enough in the underlying data to know what to do on my own with it (I'd be making too many assumptions and not making the right connections/labels).

hrbrmstr avatar Apr 18 '17 13:04 hrbrmstr

Sounds great! I could contribute some domain knowledge on this (albeit a little light on the fisheries related issues). I wonder if a major outcome of this effort could be a detailed description of the development process so that people could write packages for the many similar database interfaces for other areas (I maintain a list with some at: https://jsta.github.io/limnology_models_data/). I am thinking less of "use this package" and more of "here's how we found the api endpoint + parameters" and "here's how you know that selenium is required".

jsta avatar Apr 18 '17 14:04 jsta

IMO that would be a superb resource for folks (that's a nice list of other databases, too).

hrbrmstr avatar Apr 18 '17 16:04 hrbrmstr

While I won't be there, I was planning on blocking off the 25th and 26th so that I can follow along remotely! Be very interested in what you come up with here and happy to contribute.

And nice list, @jsta! One thing that is becoming apparent (at least to me) is that a harmonized lakes database (at least for US, but also Canada) would be great. There are many folks working in similar directions (EPA, USGS, you and Patricia and others...). Lot of really cool things could happen if a National (North American) lakes database would come to pass. But I digress...

jhollist avatar Apr 20 '17 12:04 jhollist

@jhollist We'd love to have you there remotely. I'm guessing you're already on our Slack, and hopefully the team that you join can also have you connected by voice/video for at least part of it. You might ping Nick Tierney/Miles McBain to see how they pulled it off last year as part of Bob's team.

karthik avatar Apr 21 '17 01:04 karthik

Thanks @karthik! I will ping them and work with @hrbrmstr, @jsta, or others (interested in a lot of the issues e.g. #5 ) on best way to get looped in. One of these years I'll throw my hat in the ring to hopefully attend in person!

jhollist avatar Apr 21 '17 12:04 jhollist

One of these years I'll throw my hat in the ring to hopefully attend in person!

You should and we'd be delighted to have you in person!

karthik avatar Apr 23 '17 22:04 karthik

@jhollist Let me know if there's anything I can do to help you work remotely. As rOpenSci's community manager my unconf role will be 100% facilitation.

Nick Tierney said his main barrier was just Australian time zone. Group had meetings as needed via https://appear.in.

stefaniebutland avatar Apr 28 '17 19:04 stefaniebutland

@stefaniebutland Thanks! My plan at this point is to follow along via Slack (although I need to track down my 2FA codes, b/c my authenticator isn't working ...) and GitHub. Thanks for the link to appear.in. That will be useful. If I have any other issues, I will let you know.

jhollist avatar Apr 28 '17 20:04 jhollist

I will also not be there, but I am most curious about this particular issue. Not because of the JSON, but because the dataset is interesting and I'm trying to learn new things. Whether or not this one goes forward, I'd like to try to participate in it as well. @stefaniebutland Is there a runconf17 Slack channel I need to join? I'm in General and Random thanks to @sckott

bhive01 avatar May 23 '17 16:05 bhive01

@hrbrmstr and @jsta If this gets any traction on Thursday, do hit me up on Slack or Twitter. I'll be following along 11-4:30 EDT and can hop on appear.in if a chat makes sense. Like @bhive01 I am interested in helping and especially so with anything lake related!

jhollist avatar May 23 '17 23:05 jhollist

Will do. I'm super interested to see how Thu will go :-)

hrbrmstr avatar May 24 '17 01:05 hrbrmstr

I took a look at the structure of the query results. You were not kidding about the nestedness. It makes sense to me to return results for a single lake as a single list of data frames. For example, a query like lakefinder_get(lake = "56011602") would return a list object with the following structure:

|__characteristics
    |__name
    |__id
    |__max_depth
    |__...
|__surveys
    |__id
    |__date
    |__quartile
    |__cpue
    |__species
    |__length
    |__...
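
As a rough sketch of that shape (in Python, standard library only), something like the following could split one nested record into the two tables. The payload here is made up for illustration: the `result` wrapper, the per-species nesting, and the helper name `split_lake_result` are all assumptions, not the actual LakeFinder response layout.

```python
# Hypothetical nested payload; the real LakeFinder JSON layout may differ.
SAMPLE = {
    "result": {
        "name": "Example Lake",
        "id": "56011602",
        "max_depth": 21.0,
        "surveys": [
            {
                "id": "s1",
                "date": "2010-07-12",
                "species": {
                    "WAE": {"cpue": 4.5, "quartile": 2},
                    "NOP": {"cpue": 1.2, "quartile": 3},
                },
            },
            {
                "id": "s2",
                "date": "2015-07-20",
                "species": {"WAE": {"cpue": 3.1, "quartile": 1}},
            },
        ],
    }
}


def split_lake_result(payload):
    """Split one nested lake record into two flat tables (lists of dict rows)."""
    lake = payload["result"]
    # Lake-level fields: everything except the nested survey list.
    characteristics = [{k: v for k, v in lake.items() if k != "surveys"}]
    # One row per survey/species combination.
    surveys = []
    for survey in lake.get("surveys", []):
        for species, values in survey.get("species", {}).items():
            row = {"id": survey["id"], "date": survey["date"], "species": species}
            row.update(values)
            surveys.append(row)
    return {"characteristics": characteristics, "surveys": surveys}


tables = split_lake_result(SAMPLE)
```

Keeping the lake-level fields and the survey rows in separate tables means each table stays rectangular, which is the main reason for returning a list of data frames rather than one wide one.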

It is not clear to me without further digging which columns in the survey object represent derived quantities versus unique data. For example, it seems that maximum_length and minimum_length are derived from fishCount. Is quartileCount also derived from fishCount? It seems like quartileWeight is unique (not derived) as there is no fishWeight column.
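
One way to test the "derived" hunch would be to recompute the length range from fishCount and compare it with the stored values. A minimal sketch in Python, assuming (and this is only an assumption) that fishCount maps a length bin to the number of fish in that bin; the rows and the helper names `derived_length_range` and `is_derived` are made up for illustration:

```python
def derived_length_range(fish_count):
    """Return (min, max) length over bins with a positive count, or None."""
    # JSON object keys arrive as strings, so convert them to integers.
    lengths = [int(length) for length, n in fish_count.items() if n > 0]
    return (min(lengths), max(lengths)) if lengths else None


def is_derived(row):
    """True if the stored min/max lengths match what fishCount implies."""
    stored = (row["minimum_length"], row["maximum_length"])
    return derived_length_range(row["fishCount"]) == stored


# Hypothetical survey rows: the first is consistent, the second is not.
rows = [
    {"species": "WAE", "fishCount": {"12": 3, "15": 1, "20": 0},
     "minimum_length": 12, "maximum_length": 15},
    {"species": "NOP", "fishCount": {"18": 2, "24": 1},
     "minimum_length": 17, "maximum_length": 24},
]

mismatches = [row["species"] for row in rows if not is_derived(row)]
```

If a check like this comes back with no mismatches across many real surveys, the min/max columns are probably safe to treat as derived and could be dropped or recomputed.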

jsta avatar May 26 '17 17:05 jsta