modelskill icon indicating copy to clipboard operation
modelskill copied to clipboard

Problems with match() (flooding/drying)

Open ducvi opened this issue 10 months ago • 10 comments

Hi,

I'm experiencing a bug with the match() function. Sometimes, I get an error in the function_get_global_start_end() defined in matching.py. My understanding is that this function checks the begin and end times of the model results passed to it and defines those as the "period" to match the observations.

When trying to match, I sometimes get an AssertionError in the following line:

assert all([len(x) > 0 for x in idxs])

The model results are of type DfsuModelResult and the observations are of type PointObservation. Both have well-defined start and end times, but they do not overlap perfectly.

Is there something I'm missing here?

Best regards,

Victor

ducvi avatar Feb 26 '25 13:02 ducvi

In order to make a fair comparison between different model results they should be compared on overlapping periods, and obviousle overlap with the observed period. This is at least how I think the matching in ModelSkill works at the moment.

It seems (based on the error) that at least on the model results doesn't overlap with the observation period.

You can check the start and end times for a list of model results: Image

ecomodeller avatar Feb 26 '25 14:02 ecomodeller

temporal_coverage can also be useful for this assessment https://dhi.github.io/modelskill/user-guide/plotting.html#temporal-coverage

jsmariegaard avatar Feb 26 '25 14:02 jsmariegaard

Thank you for the quick responses. You can see that indeed, my obs and my mr overlap :

Image

This is why I'm quite confuse with the error.

ducvi avatar Feb 26 '25 14:02 ducvi

Is the point inside the domain? (it could be that the error message could be improved in this case)

jsmariegaard avatar Feb 26 '25 14:02 jsmariegaard

It is, althought at the very edge.

Image

This bit of code I'm using forces the observation to be inside the domain if it is close enough to it. I'm using find_nearest_element to find the closest element in the model and I use these coordinates as the location of the observation.

element_id, distance = ds.geometry.find_nearest_elements(x=x2, y=y2, return_distances=True)
            if distance < 1000:
                x2, y2, ze = ds.geometry.element_coordinates[element_id]
                da = ds.sel(x=x2, y=y2)
                obs.append(ms.PointObservation(data, x=x2, y=y2, item='Niveau')

ducvi avatar Feb 26 '25 14:02 ducvi

Thanks. Could you share the code you use for the comparison as well as the full stack trace of the error?

jsmariegaard avatar Feb 26 '25 14:02 jsmariegaard

I hope this helps.

Call stack :

Traceback (most recent call last):
  File "c:\Users\ducvi02\Documents\MIKE\Dossiers\2022-11-22 Baie-Trinité\Python\MODEL_SKILL.py", line 265, in <module>
    con = ms.match(obs, mr)
          ^^^^^^^^^^^^^^^^^
  File "C:\Users\ducvi02\AppData\Local\Programs\Python\venv\Lib\site-packages\modelskill\matching.py", line 279, in match
    clist = [
            ^
  File "C:\Users\ducvi02\AppData\Local\Programs\Python\venv\Lib\site-packages\modelskill\matching.py", line 280, in <listcomp>
    _single_obs_compare(
  File "C:\Users\ducvi02\AppData\Local\Programs\Python\venv\Lib\site-packages\modelskill\matching.py", line 311, in _single_obs_compare       
    matched_data = match_space_time(
                   ^^^^^^^^^^^^^^^^^
  File "C:\Users\ducvi02\AppData\Local\Programs\Python\venv\Lib\site-packages\modelskill\matching.py", line 360, in match_space_time
    period = _get_global_start_end(idxs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\ducvi02\AppData\Local\Programs\Python\venv\Lib\site-packages\modelskill\matching.py", line 323, in _get_global_start_end     
    assert all([len(x) > 0 for x in idxs])
AssertionError

Code :

# Some imports
import mikeio as mio
import modelskill as ms
import pyproj
import os
import tqdm.notebook

# Define a coordinate transformer and load data

transformer = pyproj.Transformer.from_crs(crs_from=EPSG2, crs_to=EPSG1, always_xy=False, authority='epsg')
item = 'Surface elevation'
mr = ms.model_result(dossier/path_results, item=item, name="Modèle")
ds = mio.read(dossier/path_results, items=item)
obs = list()

# For all potential observations

for filename in tqdm.notebook.tqdm(os.listdir(dossier/"Validation"/path_results.parent.stem)):
    filename = pt.Path(filename)

    # Open observation file

    if filename.suffix == ".dfs0":
        print(filename)
        data = mio.read(dossier/"Validation"/path_results.parent.stem/filename)

        # Extract latitude and longitude of observation data inside a secondary json file

        with open(dossier/"Validation"/path_results.parent.stem/(filename.stem+"_metadata.json"), encoding='utf-8') as f:
            metadata = json.load(f)
            x2, y2 = transformer.transform(metadata['latitude'], metadata['longitude'])

            # Find nearest element of the model.
            element_id, distance = ds.geometry.find_nearest_elements(x=x2, y=y2, return_distances=True)

            max_distance = 1000
            
            if distance < max_distance:
                # If said element is close enough, fixes the observation at the nearest element
                x2, y2, ze = ds.geometry.element_coordinates[element_id]
                da = ds.sel(x=x2, y=y2)
                da.to_dfs(dossier/"Validation\Comparaison"/path_results.parent.stem/(metadata['officialName']+'.dfs0'), dtype=np.float64)
                obs.append(ms.PointObservation(data, x=x2, y=y2, item='Niveau', name=filename.stem))
            else:
                # Do not add the observation to the list.
                print('L\'observation {} est à plus de {} m d\'un élément'.format(filename.stem,max_distance))

con = ms.match(obs, mr)

ducvi avatar Feb 26 '25 15:02 ducvi

This approach seems very sensible for coastal locations. However, I can not manage to reproduce this error. Could you please create a minimal reproducible example?

ecomodeller avatar Feb 27 '25 08:02 ecomodeller

Thanks for your help,

I dug around and solved the case,

I can reproduce the Assertion Error with this minimal code. Like I said at the beginning, this sometimes fails but not always.

# Module import
import mikeio as mio
import modelskill as ms

# File path

model_path = "C:\ .... "
obs_path = "C:\ .... "

# Load model results and observation file

mr = ms.model_result(model_path, item="Surface elevation", name="Model")
data = mio.read(obs_path)

# Set coordinates of observation
# Lets say
x = 88020 
y = 602859

obs = ms.PointObservation(data,x=x,y=y,item='Surface Elevation',name='Observation')

con = ms.match(obs, mr)

So the thing is that I set my observation in an element that experience flooding and drying. The element happens to be dry for the whole simulation. Hence the error indicate that the mr len < 0 in that element. There is nothing to extract.

Maybe the match function could be improve to check for flooding and drying or something like that.

Thanks again !

ducvi avatar Feb 27 '25 16:02 ducvi

Thanks for reporting back @ducvi ! Yes it would be a good idea if modelskill provided a better error message in this case.

jsmariegaard avatar Feb 28 '25 07:02 jsmariegaard