ismn
No 'variable' in station '3.09' and min_depth/max_depth don't work
I am trying to extract the variable 'soil_moisture' for station 3.09 at depths from 0.01 to 0.04 m. By default I would write command 1 like this:
min_depth,max_depth=0.01, 0.04
ids = ismn_data.get_dataset_ids(variable='soil_moisture',
min_depth=min_depth,
max_depth=max_depth,
filter_meta_dict={'station': '3.09',
'lc_2000': [10,11,12,20,60,130],
'lc_2005': [10,11,12,20,60,130],
'lc_2010': [10,11,12,20,60,130],})
But this yields no elements in ids. So I printed the metadata for 3.09 with ismn_data.read(1098, return_meta=True) and found that soil moisture is indeed in the metadata, but there is no value for variable:
ismn_data.read(1098, return_meta=True)
Out[117]:
( soil_moisture soil_moisture_flag soil_moisture_orig_flag
date_time
2017-01-01 00:00:00 0.192 G M
... ... ...
2019-02-22 09:00:00 0.155 G M
[12745 rows x 3 columns],
variable key
clay_fraction val 5.2
depth_from 0.0
depth_to 0.05
climate_KG val Dfb
climate_insitu val unknown
elevation val 104.0
instrument val Decagon-5TE-B
depth_from 0.0
depth_to 0.05
latitude val 55.8609
lc_2000 val 10
lc_2005 val 10
lc_2010 val 10
lc_insitu val None
longitude val 9.2945
network val HOBE
organic_carbon val 0.5
depth_from 0.0
depth_to 0.3
sand_fraction val 85.1
depth_from 0.0
depth_to 0.05
saturation val 0.41
depth_from 0.0
depth_to 0.3
silt_fraction val 5.7
depth_from 0.0
depth_to 0.05
station val 3.09
timerange_from val 2017-01-01 00:00:00
timerange_to val 2019-02-22 09:00:00
variable val soil_moisture
depth_from 0.0
depth_to 0.05
Name: data, dtype: object)
As you can see, right above clay_fraction there is no value for the key variable. So I have to use command 2:
ids = ismn_data.get_dataset_ids(variable=None,min_depth=min_depth,
max_depth=max_depth,
filter_meta_dict={'station': '3.09',
'variable': 'soil_moisture',
'lc_2000': [10,11,12,20,60,130],
'lc_2005': [10,11,12,20,60,130],
'lc_2010': [10,11,12,20,60,130],})
but I still get nothing in ids. I found that this is because I set min_depth and max_depth. If I remove min_depth and max_depth from command 2, I get ids as [1098, 1104]. But I do want to extract values between 0.01 and 0.04. So is there anything wrong with the data for 3.09? And what is the difference between command 1 and command 2?
Is there anybody who can help me?
Hi, sorry for the late reply. The problem is that at station 3.09, soil moisture sensors are operating between 0 and 5 cm, while your query looks for sensors between 1 and 4 cm (which do not exist).
You can see this by selecting the station and listing all sensor names (the numbers in the names are the depths in meters). Please note that your dataset might look different; I am using an older snapshot of ISMN here.
>> ismn_data['HOBE']['3.09']
Out[22]: Sensors at '3.09': ['Decagon-5TE-A_soil_moisture_0.000000_0.050000', 'Decagon-5TE-B_soil_moisture_0.000000_0.050000', 'Decagon-5TE-A_soil_moisture_0.200000_0.250000', 'Decagon-5TE-B_soil_moisture_0.200000_0.250000', 'Decagon-5TE_soil_moisture_0.500000_0.550000', 'Decagon-5TE-A_soil_temperature_0.000000_0.050000', 'Decagon-5TE-B_soil_temperature_0.000000_0.050000', 'Decagon-5TE-A_soil_temperature_0.200000_0.250000', 'Decagon-5TE-B_soil_temperature_0.200000_0.250000', 'Decagon-5TE_soil_temperature_0.500000_0.550000']
My suggestion is to be less restrictive and allow sensors from e.g. 0 to 5 cm instead of 1 to 4 cm.
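To illustrate why the query comes back empty, here is a minimal sketch of the depth check. It assumes the filter keeps a sensor only when its whole depth range lies inside [min_depth, max_depth]; the helper name sensor_within is made up for this example and is not part of the ismn package.

```python
def sensor_within(sensor_from, sensor_to, min_depth, max_depth):
    """True if the sensor's depth range lies fully inside [min_depth, max_depth]."""
    return min_depth <= sensor_from and sensor_to <= max_depth


# Station 3.09: the shallowest soil moisture sensors measure from 0.00 to 0.05 m.
print(sensor_within(0.0, 0.05, 0.01, 0.04))  # False: the sensor is wider than the query
print(sensor_within(0.0, 0.05, 0.0, 0.05))   # True: the query now covers the sensor
```

Under that assumption, widening the query to 0 to 0.05 m is what makes the 0-5 cm sensors selectable.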
Thanks for your reply @wpreimes! But I want to loop over all European stations, so I cannot print all sensors and then select them one by one. Besides, I am comparing against my model simulations of soil moisture at each layer ([0, 0.01, 0.04, 0.1, 0.2, 0.4, 0.6, 0.8, 1] meters), so I am looking for a way to match the depths of the ISMN stations to my model layers on the same lat/lon grid. For example, my model simulation for the grid cell containing 3.09 from 0.01 to 0.04 m is matched to the observations from 0 to 5 cm at station 3.09. In general, model simulations for the grid cell containing station X from depth Y_1 to Y_2 are matched to observations from depth Z_1 to Z_2 at station X, where [Z_1, Z_2] contains [Y_1, Y_2] or [Y_1, Y_2] contains [Z_1, Z_2], as long as the observations 'match' the model layers.
The other problem is that the depth configurations differ across the European ISMN stations. For example, other stations might have depths like 0 to 8 cm, so I cannot just write a loop that runs the code for 3.09 on other stations. Is there any way to solve these problems? Thanks!
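The containment rule described above could be sketched roughly like this. It is plain Python and does not touch the ismn package; the function name ranges_match and the tuple layout (from, to) are assumptions made for the example.

```python
def ranges_match(model, obs):
    """model, obs: (from, to) depth ranges in m. True if either contains the other."""
    (y1, y2), (z1, z2) = model, obs
    obs_contains_model = z1 <= y1 and y2 <= z2
    model_contains_obs = y1 <= z1 and z2 <= y2
    return obs_contains_model or model_contains_obs


# Model layers built from the layer boundaries quoted above.
model_bounds = [0, 0.01, 0.04, 0.1, 0.2, 0.4, 0.6, 0.8, 1]
model_layers = list(zip(model_bounds[:-1], model_bounds[1:]))

sensor = (0.0, 0.05)  # depth range of the shallow sensors at station 3.09
matched = [layer for layer in model_layers if ranges_match(layer, sensor)]
print(matched)  # [(0, 0.01), (0.01, 0.04)]
```

With this rule the 0-5 cm sensor is assigned to the first two model layers but not to the 0.04-0.1 m layer, which it only partly overlaps.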
Printing the names was only meant as an example to explain the problem for that specific station. Matching the different layers between model and in situ data is not straightforward, as you noticed. Some tradeoffs will be necessary, especially for sensors that cover a wide range of depths.
Here are some suggestions:
- Use only the starting depth of a sensor to assign it to the layer it starts measuring in (get_dataset_ids has a keyword argument for that called check_only_sensor_depth_from). You might want to manually exclude sensors that cover a wide range of depths afterwards (when looping over the extracted sensors, check e.g. the difference between the depth_to and depth_from metadata attributes).
- Use sensors multiple times. If a sensor measures between 0 and 5 cm, I think it is fair to use it in the comparison for the first 3 layers.
- Design your own solution to match the model and in situ layers, e.g. to only compare the "best matching" layer (the layer with the largest overlap). In that case you could extract all ids for soil_moisture without depth restrictions, loop over them to read the data, and use the available metadata / depth information to apply your own code to assign them to your model layers (e.g. for each model layer check whether it is in the range of the ISMN sensor, and if it is, use it for that layer). This function might also help: https://github.com/TUW-GEO/ismn/blob/master/src/ismn/meta.py#L144
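As a rough sketch of the "largest overlap" idea from the last suggestion (plain Python, not something the ismn package provides; the names overlap and best_layer are made up here):

```python
def overlap(a, b):
    """Length in m of the intersection of two (from, to) depth ranges."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def best_layer(sensor, layers):
    """Index of the layer with the largest overlap, or None if nothing overlaps."""
    overlaps = [overlap(sensor, layer) for layer in layers]
    best = max(range(len(layers)), key=lambda i: overlaps[i])
    return best if overlaps[best] > 0 else None


# Model layers built from the layer boundaries given earlier in this thread.
bounds = [0, 0.01, 0.04, 0.1, 0.2, 0.4, 0.6, 0.8, 1]
layers = list(zip(bounds[:-1], bounds[1:]))

print(best_layer((0.0, 0.05), layers))   # 1 -> layer 0.01-0.04 m
print(best_layer((0.2, 0.25), layers))   # 4 -> layer 0.2-0.4 m
```

A sensor whose depth_from equals depth_to has zero overlap length, so it returns None here and would need a separate rule (e.g. checking which layer contains that depth).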
Also, I'm not sure if there are even any sensors that measure SM e.g. between 1 and 4 cm depth in ISMN at all. Just to strengthen my point about making some compromises in your approach. @daberer might know that.
Thanks! @wpreimes, let me try your suggestions first
Hi, I think for the majority of soil moisture sensors at ISMN the sensor orientation is horizontal (depth_from = depth_to). I checked there are 271 soil moisture sensors within 1 - 4cm bracket if the margin-values (1 and 4cm) are included, mostly from the networks HiWATER_EHWSN and SMN-SDR. Often networks have a similar composition for all locations (same sensors in the same depths), but overall the depths are quite diverse as you noticed.
Hi @wpreimes thanks for your comment, but there exists another problem. If I tried with:
ids = ismn_data.get_dataset_ids(variable='soil_moisture',
filter_meta_dict={'station': '3.09',
'lc_2000': [10,11,12,20,60,130],
'lc_2005': [10,11,12,20,60,130],
'lc_2010': [10,11,12,20,60,130],})
I get nothing. That's because in the metadata of 3.09 the value of the key variable is None. So I have to use:
ids = ismn_data.get_dataset_ids(variable=None,
filter_meta_dict={'station': '3.09',
'variable': 'soil_moisture',
'lc_2000': [10,11,12,20,60,130],
'lc_2005': [10,11,12,20,60,130],
'lc_2010': [10,11,12,20,60,130],})
I have to put 'variable': 'soil_moisture' in the filter_meta_dict. Is that normal? I can use the first command for other stations, just not for 3.09. Does that mean there is a bug in the metadata of 3.09? And is it OK to use the second one for other stations? For other details please refer to the description of the issue at the top of this page. Thanks!
Hi, I just downloaded the ISMN data for HOBE and tried the 2 function calls you posted, and I got the same IDs for both of them. About the metadata, I don't understand what you mean by "metadata of 3.09, the value of key variable is None." You posted the metadata table in your initial comment, and there you can see that "variable" is "soil_moisture" for the selected sensor (see the last 4 lines; the first line only holds the labels for the data frame):
variable key
clay_fraction val 5.2
depth_from 0.0
depth_to 0.05
climate_KG val Dfb
climate_insitu val unknown
elevation val 104.0
instrument val Decagon-5TE-B
depth_from 0.0
depth_to 0.05
latitude val 55.8609
lc_2000 val 10
lc_2005 val 10
lc_2010 val 10
lc_insitu val None
longitude val 9.2945
network val HOBE
organic_carbon val 0.5
depth_from 0.0
depth_to 0.3
sand_fraction val 85.1
depth_from 0.0
depth_to 0.05
saturation val 0.41
depth_from 0.0
depth_to 0.3
silt_fraction val 5.7
depth_from 0.0
depth_to 0.05
station val 3.09
timerange_from val 2017-01-01 00:00:00
timerange_to val 2019-02-22 09:00:00
variable val soil_moisture
depth_from 0.0
depth_to 0.05
Name: data, dtype: object)
and you can access it e.g. via
>> ismn_data.read_metadata(1098)['variable']
key
val soil_moisture
depth_from 0.0
depth_to 0.05
Name: data, dtype: object
Maybe you want to re-generate the Python metadata if you feel that something is wrong there (removing or renaming the folder python_metadata in the ISMN data path should lead to re-collecting the metadata the next time you initialize the reader). Make sure you have the latest version of this package installed. In case the data is erroneous, you can try to download the HOBE data separately again and replace the files in your collection with the new ones (make sure to re-collect the metadata when you change your local data collection).
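Renaming the folder could be done by hand or with a few lines like the following. This is only a sketch: the function name and the backup folder name python_metadata_backup are made up, and the demo runs on a temporary directory rather than a real ISMN download.

```python
import tempfile
from pathlib import Path


def backup_python_metadata(data_path):
    """Rename <data_path>/python_metadata so the metadata is re-collected.

    Returns the new path, or None if there was no metadata folder.
    """
    meta_dir = Path(data_path) / "python_metadata"
    if not meta_dir.is_dir():
        return None
    backup = meta_dir.with_name("python_metadata_backup")  # made-up name
    if backup.exists():
        raise FileExistsError(f"{backup} already exists, remove it first")
    meta_dir.rename(backup)
    return backup


# Demo on a throwaway directory standing in for the ISMN data path.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "python_metadata").mkdir()
    moved_name = backup_python_metadata(tmp).name
    old_exists = (Path(tmp) / "python_metadata").exists()
    second_call = backup_python_metadata(tmp)  # nothing left to rename

print(moved_name, old_exists, second_call)
```

Keeping a renamed copy instead of deleting the folder makes it possible to go back if the re-collected metadata turns out to be no better.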
Hi @wpreimes, thanks for your reply. ismn.__version__ gives me '1.1.0'. Is that the latest one?
And I found that I can use variable='soil_moisture' outside the filter_meta_dict. By the way, if I try ismn_data['HOBE']['3.09'], there is 'Decagon-5TE-B_soil_moisture_0.200000_0.250000', which means soil moisture from 0.2 to 0.25 m. But ismn_data.read_metadata(1098)['variable'] alone doesn't show this...
v1.2.0 would be the latest. You can try pip install -U ismn to upgrade.
The commands ISMN_Interface.read_metadata() (and ISMN_Interface.read_ts) read data for a certain ID. The ID refers to a specific sensor (as indicated by the contents of the metadata). At a station such as HOBE 3.09 there can be multiple sensors. In your case, 1098 is the ID of the soil moisture sensor at this station at 0-5 cm depth, and your command reads the metadata for that sensor. The sensor at 0.2-0.25 m depth is a different one and therefore has a different ID.
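To see the depths of several sensors at once, you could loop over the IDs and read the 'variable' entry of each, roughly as below. The sketch only relies on read_metadata(idx)['variable'] exposing depth_from and depth_to as shown earlier in this thread. FakeInterface is a stand-in so the example runs without ISMN data; its second ID and depth are invented for illustration.

```python
def sensor_depths(ismn_data, ids):
    """Map each sensor ID to its (depth_from, depth_to) in m."""
    depths = {}
    for idx in ids:
        var_meta = ismn_data.read_metadata(idx)['variable']
        depths[idx] = (float(var_meta['depth_from']), float(var_meta['depth_to']))
    return depths


class FakeInterface:
    """Stand-in for ISMN_Interface, NOT the real class."""

    # 1098 -> 0-0.05 m is taken from this thread; ID 2 is invented.
    _depths = {1098: (0.0, 0.05), 2: (0.2, 0.25)}

    def read_metadata(self, idx):
        d_from, d_to = self._depths[idx]
        return {'variable': {'val': 'soil_moisture',
                             'depth_from': d_from,
                             'depth_to': d_to}}


print(sensor_depths(FakeInterface(), [1098, 2]))
# {1098: (0.0, 0.05), 2: (0.2, 0.25)}
```

With a real reader you would pass your ismn_data object and the list returned by get_dataset_ids instead of the stand-in.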
Thanks for your reply @wpreimes! But with v1.2.0 I get an error when reading the data in Data_separate_files_header_20170101_20211231_9078_Zd6I_20220911 (this is a download from ISMN covering latitudes from 36N to 58N and longitudes from 11.75W to 29.5E):
ismn_data = ISMN_Interface(data_path)
Files Processed: 100%|██████████| 321/321 [00:00<00:00, 4521.32it/s]
Processing metadata for all ismn stations into folder /Users/xushan/research/TUD/ISMN_westEurope/Data_separate_files_header_20170101_20211231_9078_Zd6I_20220911.
This may take a few minutes, but is only done once...
Hint: Use `parallel=True` to speed up metadata generation for large datasets
Metadata generation finished after 0 Seconds.
Metadata and Log stored in /Users/xushan/research/TUD/ISMN_westEurope/Data_separate_files_header_20170101_20211231_9078_Zd6I_20220911/python_metadata
Traceback (most recent call last):
File "<ipython-input-23-84af3e3a7ed0>", line 1, in <module>
ismn_data = ISMN_Interface(data_path)
File "/Users/xushan/opt/anaconda3/lib/python3.7/site-packages/ismn/interface.py", line 135, in __init__
self.activate_network(network=network, meta_path=meta_path, temp_root=temp_root)
File "/Users/xushan/opt/anaconda3/lib/python3.7/site-packages/ismn/interface.py", line 166, in activate_network
self.__file_collection.to_metadata_csv(meta_csv_file)
File "/Users/xushan/opt/anaconda3/lib/python3.7/site-packages/ismn/filecollection.py", line 403, in to_metadata_csv
dfs = pd.concat(dfs, axis=0, sort=True)
File "/Users/xushan/opt/anaconda3/lib/python3.7/site-packages/pandas/util/_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File "/Users/xushan/opt/anaconda3/lib/python3.7/site-packages/pandas/core/reshape/concat.py", line 304, in concat
sort=sort,
File "/Users/xushan/opt/anaconda3/lib/python3.7/site-packages/pandas/core/reshape/concat.py", line 351, in __init__
raise ValueError("No objects to concatenate")
ValueError: No objects to concatenate
Can you please help me with this? Thanks!