luftdatenpumpe

[OpenAQ] Ingest data from the OpenAQ API

Open amotl opened this issue 4 years ago • 3 comments

Introduction

OpenAQ's mission is to fight air inequality by opening up air quality data and connecting a diverse global, grassroots community of individuals and organizations.

Thoughts

We might think about integrating the Python wrapper for the OpenAQ API in order to get maximum worldwide coverage without significant effort. This could even make #12 obsolete.

See also "Using the OpenAQ API to acquire open air quality information from Python".

amotl avatar Jan 08 '20 01:01 amotl

News

With commits 15b117ab and 84041a5b, and starting from version 0.20.1, Luftdatenpumpe is now able to ingest data from the OpenAQ API. Right now, only "recent" data is acquired using the OpenAQ measurements API. For the date_from parameter, we use the start of the last full hour (xx:00), i.e. hourdate(utcnow() - 1h), so each invocation covers the range from that point until now. However, we are not sure about this strategy yet.
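The date_from computation described above can be sketched in Python as follows. This is a minimal illustration, not the code from the commits: the body of hourdate() is an assumption based on its name and the description "last full hour at xx:00", and recent_date_from() is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def hourdate(moment: datetime) -> datetime:
    """Truncate a timestamp to the start of its hour (xx:00:00).

    Assumed behaviour; the real helper in Luftdatenpumpe may differ.
    """
    return moment.replace(minute=0, second=0, microsecond=0)


def recent_date_from(now: Optional[datetime] = None) -> datetime:
    """Return hourdate(utcnow() - 1h), the lower bound for "recent" data."""
    if now is None:
        now = datetime.now(timezone.utc)
    return hourdate(now - timedelta(hours=1))


# The result would then be handed to the measurements call as date_from,
# together with the country filter.
example = recent_date_from(datetime(2020, 1, 8, 6, 42, 17, tzinfo=timezone.utc))
print(example.isoformat())
```

Subtracting the hour before truncating means an invocation at 06:42 asks for everything since 05:00, so the window is always between one and two hours long.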

Examples

It is recommended to apply a country filter in order to reduce the amount of data per invocation.

# Acquire data from EEA Germany
luftdatenpumpe readings --network=openaq --progress --reverse-geocode --country=DE

# Acquire data from EEA Belgium
luftdatenpumpe readings --network=openaq --progress --reverse-geocode --country=BE

# Acquire data from GIOS network in Poland
luftdatenpumpe readings --network=openaq --progress --reverse-geocode --country=PL

# Acquire data from AirNow network in the U.S.
luftdatenpumpe readings --network=openaq --progress --reverse-geocode --country=US
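To follow the per-country recommendation when acquiring several countries in a row, the invocations above could be wrapped in a small Python driver. This is only a sketch: build_readings_command() and acquire() are hypothetical helper names, the command-line flags are exactly those shown in the examples, and the luftdatenpumpe executable is assumed to be on the PATH when dry_run is turned off.

```python
import shlex
import subprocess
from typing import List

# Country codes taken from the examples above.
COUNTRIES = ["DE", "BE", "PL", "US"]


def build_readings_command(country: str) -> List[str]:
    """Build the argument list for one country-filtered invocation."""
    return [
        "luftdatenpumpe",
        "readings",
        "--network=openaq",
        "--progress",
        "--reverse-geocode",
        f"--country={country}",
    ]


def acquire(countries: List[str], dry_run: bool = True) -> List[List[str]]:
    """Run one invocation per country; with dry_run, only print the commands."""
    commands = [build_readings_command(country) for country in countries]
    for command in commands:
        if dry_run:
            print(shlex.join(command))
        else:
            # check=True raises CalledProcessError if an invocation fails.
            subprocess.run(command, check=True)
    return commands


acquire(COUNTRIES)
```

Calling acquire(COUNTRIES, dry_run=False) would execute the four invocations sequentially, stopping at the first failure.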

Backlog

  • Implement --timespan option in order to support querying historical data.
  • Properly resolve the station_id using the OpenAQ Locations API. Currently, the identifier looks like
    • [US] Yuba City or
    • [IN] Zoo Park, Hyderabad - TSPCB.
  • What about inactive stations?

cc @wetterfrosch

amotl avatar Jan 08 '20 06:01 amotl

It looks like not all stations report at the same time and interval.

While BE seems to report at T01:00:00 or T02:00:00, DE always reports at T00:00:00. On the other hand, NL reports each hour.

So, we will have to tune the date_from parameter when invoking api.measurements().

In general, the response of the OpenAQ sources API (sources) reports the resolution of each source. While all undesignated items (resolution: null) might yield a daily resolution, others offer data at 1 hr, 15 min or 10 min (AU) resolution. So, we will have to use that information to compute the date_from parameter correctly in order to safely retrieve the latest data of the respective country.
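One possible way to derive date_from from a source's resolution is sketched below. Everything here is an assumption for illustration: parse_resolution() and date_from_for() are hypothetical helpers, the accepted strings are limited to the forms mentioned in this comment ("1 hr", "15 min", "10 min"), a null resolution is treated as daily, and "go back one full interval from the last interval boundary" is just one candidate strategy, not a settled decision.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

# Maps the unit spellings seen in the sources response to timedelta keywords.
_UNITS = {"min": "minutes", "hr": "hours", "day": "days"}
_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_resolution(resolution: Optional[str]) -> timedelta:
    """Convert a resolution string such as "15 min" into a timedelta."""
    if not resolution:
        # Assumption: undesignated sources (resolution: null) report daily.
        return timedelta(days=1)
    match = re.fullmatch(r"\s*(\d+)\s*(min|hr|day)s?\s*", resolution)
    if match is None:
        raise ValueError(f"Unsupported resolution: {resolution!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def date_from_for(resolution: Optional[str], now: Optional[datetime] = None) -> datetime:
    """Floor `now` (timezone-aware) to the resolution grid, then step back one interval."""
    if now is None:
        now = datetime.now(timezone.utc)
    step = parse_resolution(resolution)
    # Integer division of timedeltas counts whole intervals since the epoch.
    floored = _EPOCH + ((now - _EPOCH) // step) * step
    return floored - step


now = datetime(2020, 1, 8, 8, 37, tzinfo=timezone.utc)
for resolution in ("1 hr", "15 min", "10 min", None):
    print(resolution, date_from_for(resolution, now).isoformat())
```

Stepping back one full interval trades some duplicate readings for a lower risk of missing the most recent one; sources that publish late, as BE appears to do, might need a larger margin.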

amotl avatar Jan 08 '20 08:01 amotl

Other than resolving the details enumerated within this discussion, we may want to look at OpenAQ API Version 2 as well.

  • https://github.com/dhhagan/py-openaq/issues/41

amotl avatar Dec 13 '22 01:12 amotl