Using phenodata as a library

Open MarkusZehner opened this issue 4 years ago • 14 comments

Hi,

I am working on a database for a crop monitoring project. I really appreciate your work in this and the dwdweather2 repository!

What would be the best way to circumvent the console for this package? Is there an easy way to pipe the string to the options variable in phenodata.command.run()?

Thanks! markus

MarkusZehner avatar Jun 15 '20 22:06 MarkusZehner

Dear Markus,

I really appreciate your work on this.

Thanks for appreciating our work on this program.

[...] and the dwdweather2 repository!

Regarding weather information from DWD/CDC, we would also like to point out the fine python_dwd by @gutzbenj which sparked our interest just recently.

I am working on a database for a crop monitoring project.

I see what you are doing over at rcm_archive within loadwd.py. Good luck and let us know about any help you might need.

What would be the best way to circumvent the console for this package? Is there an easy way to pipe the string to the options variable in phenodata.command.run()?

You are actually asking how to use this module as a library? What about using these lines from phenodata.command.run() and trying to ramp it up from there?

cdc_client = DwdCdcClient(ftp=FTPSession())
humanizer = DwdPhenoDataHumanizer(language=options['language'], long_station=options['long-station'], show_ids=options['show-ids'])
client = DwdPhenoData(cdc=cdc_client, humanizer=humanizer, dataset=options.get('dataset'))

data = client.get_observations(options, humanize=options['humanize'])

With kind regards, Andreas.

amotl avatar Jun 16 '20 01:06 amotl

What about using these lines from phenodata.command.run() and trying to ramp it up from there?

Now I see that this might not be so easy. As a quick solution, here is a hack that fakes the parameters into sys.argv instead of shelling out to the phenodata program:

proc_string = [
    'phenodata',
    'list-stations',
    '--source=dwd',
    '--dataset=immediate',
    '--all', '--format=csv'
]

import sys
import phenodata.command

sys.argv = proc_string
phenodata.command.run()

The same would also work for

proc_string = ['phenodata', 'observations', '--source=dwd',
               '--dataset=' + str(dataset),
               '--partition=' + str(partition),
               '--filename=' + str(crops),
               '--station-id=' + str(stations),
               '--year=' + str(years),
               '--format=csv']

However, you would still have to parse STDOUT again, which is kind of sad.
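If you go the sys.argv route anyway, the STDOUT parsing could look roughly like the sketch below. It assumes the entry point writes its CSV through sys.stdout at call time, which has not been verified for phenodata.command.run(). Both run_and_parse_csv and fake_run are hypothetical names made up for illustration; fake_run only stands in for the real entry point so that the snippet runs on its own.

```python
import contextlib
import csv
import io
import sys


def run_and_parse_csv(run, argv):
    """Call a CLI entry point in-process and parse the CSV it prints to STDOUT."""
    saved_argv = sys.argv
    buffer = io.StringIO()
    try:
        # Fake the command line, then capture everything printed by run().
        sys.argv = list(argv)
        with contextlib.redirect_stdout(buffer):
            run()
    finally:
        # Always restore the original arguments, even if run() raises.
        sys.argv = saved_argv
    buffer.seek(0)
    return list(csv.DictReader(buffer))


def fake_run():
    # Hypothetical stand-in for phenodata.command.run(); prints two CSV rows.
    print("station_id,name")
    print("1,Alpha")
    print("2,Beta")


rows = run_and_parse_csv(fake_run, ["phenodata", "list-stations", "--format=csv"])
```

Passing phenodata.command.run instead of fake_run would be the obvious next step, but that part is an untested assumption.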

amotl avatar Jun 16 '20 02:06 amotl

At [1], you can now find two basic examples of how to use the module as a library to obtain Pandas DataFrames for further downstream processing. That way, you will not have to convert the JSON or CSV output back, which would have been silly.

Currently, all options still have to be supplied, even if most of them are actually None. So, there's definitely room for improvement all over the place.

Please let me know if this will help you along.

[1] https://github.com/hiveeyes/phenodata/tree/master/examples

amotl avatar Jun 16 '20 02:06 amotl

Thanks for the examples! Yes, this definitely helps. As you might have seen, the silly back-conversion to a dict already took place, but using pandas directly might be the easiest solution.

Also is there a reason why the cache is encrypted?

MarkusZehner avatar Jun 16 '20 07:06 MarkusZehner

Thanks again, running the client directly is much more efficient!

MarkusZehner avatar Jun 16 '20 10:06 MarkusZehner

Yes this definitely helps. Thanks for the examples!

You are welcome.

Also is there a reason why the cache is encrypted?

Are you sure about this detail? Maybe dogpile.cache just serializes the data using pickle under the hood?
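To illustrate the pickle guess: pickled data is a binary format that looks scrambled in a text editor, but it is not encrypted. The sketch below uses only the standard library; the record is made-up example data and is not taken from the actual cache file, and whether dogpile.cache is configured to use pickle here remains an assumption.

```python
import pickle

# Made-up example record, standing in for whatever the cache stores.
record = {"station_id": 12345, "name": "Example"}

# Serialize to bytes; this is what would end up on disk.
blob = pickle.dumps(record)

# The payload is binary, yet plain strings remain readable inside it.
contains_plain_text = b"Example" in blob

# Anyone can load it back without a key, so it is not encryption.
restored = pickle.loads(blob)
```

If the cache file shows readable fragments like station names between binary bytes, pickle serialization is the more likely explanation.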

I've stumbled upon the next problem: fcntl, used by dogpile.cache, is not compatible with Windows. Is there an easy fix for that, e.g. using sqlite3 as in dwdweather?

What about this guy? Have you been able to resolve it? Edit: Now I see https://github.com/MarkusZehner/rcm_archive/issues/1 by @Aranil. So, that would still be an issue?

amotl avatar Jun 17 '20 00:06 amotl

@Aranil was testing rcm_archive on Windows; before that, I was not aware of fcntl. It is not a huge problem: the final system is intended to run on a Linux server (which is also why I deleted that comment).

I tried to use the sqlite3 solution from panodata/dwdweather2, but I'm not familiar with dogpile.cache, so I got lost quickly after list_plus and list_plus_real.
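For reference, a file-lock-free cache on top of sqlite3 can be fairly small. The following is a minimal sketch using only the standard library, so it does not depend on fcntl. It is neither the dwdweather2 implementation nor a dogpile.cache backend; the SqliteCache class and its get/set methods are hypothetical names chosen for illustration.

```python
import pickle
import sqlite3


class SqliteCache:
    """Tiny key-value cache stored in SQLite; values are pickled."""

    def __init__(self, path=":memory:"):
        # ":memory:" keeps the demo self-contained; pass a file path to persist.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, value BLOB)"
        )

    def get(self, key, default=None):
        row = self.conn.execute(
            "SELECT value FROM cache WHERE key = ?", (key,)
        ).fetchone()
        return default if row is None else pickle.loads(row[0])

    def set(self, key, value):
        # The connection context manager commits on success, rolls back on error.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO cache (key, value) VALUES (?, ?)",
                (key, pickle.dumps(value)),
            )


cache = SqliteCache()
cache.set("stations", ["Alpha", "Beta"])
stations = cache.get("stations")
```

Wiring something like this into phenodata would still require replacing the places where dogpile.cache is used, which this sketch does not attempt.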

MarkusZehner avatar Jun 17 '20 09:06 MarkusZehner

Dear Markus,

as I am just revisiting this issue, I wanted to take the chance to tell you about Wetterdienst. You might want to prefer it over dwdweather2 these days.

With kind regards, Andreas.

cc @gutzbenj

amotl avatar Oct 27 '20 21:10 amotl

Dear Markus,

we recently worked on bringing phenodata and Grafana together, see [1]. The code at [2] might help you when trying to use this as a library.

With kind regards, Andreas.

[1] https://github.com/panodata/grafana-pandas-datasource/tree/2d624da/examples/phenodata-mellifera
[2] https://github.com/panodata/grafana-pandas-datasource/blob/2d624da/examples/phenodata-mellifera/demo.py#L62-L101

amotl avatar Jan 06 '21 16:01 amotl

Dear Andreas,

thanks for keeping me in the loop! However, I'm currently in the last months of writing my thesis and no longer working on this project.

Kind regards, Markus.

MarkusZehner avatar Jan 06 '21 17:01 MarkusZehner

Currently I'm in the last months of writing my thesis.

We wish you a happy new year and good success with your thesis.

As we are modernizing phenodata these days, we will be happy about a star from all people who value our work here. </fishing> ;].

amotl avatar Jan 06 '21 17:01 amotl

Thank you very much!

MarkusZehner avatar Jan 06 '21 17:01 MarkusZehner

Dear Markus,

we hope you are doing well, that you've finished your thesis properly, and that you are now travelling the world.

We just improved the documentation and added a dedicated section about how to use phenodata as a library ^1, sparked by your inquiry. Let me know if you find any details for improvement.

I'm in the last months of writing my thesis, and no longer working on this project.

We will also be happy to hear back about what you used phenodata for, if you are allowed to talk about it now. If you have any resources available you can share, it would be nice to add them to the documentation as references.

With kind regards, Andreas.

amotl avatar Apr 11 '23 10:04 amotl

Dear @MarkusZehner,

we continued our endeavor of unlocking DWD CDC open access data, and converged the phenology observation data into corresponding SQLite databases, making it very convenient for querying and filtering. More information at ^1^3 ff. We hope you like it.

With kind regards, Andreas.

/cc @Aranil, @MarkusRAdam, @lawacco

amotl avatar Apr 30 '23 13:04 amotl