hydromt icon indicating copy to clipboard operation
hydromt copied to clipboard

Grid inherits from SpatialModelComponent

Open deltamarnix opened this issue 1 year ago • 3 comments

Issue addressed

Fixes #847

Explanation

Region component is now a new base class that inherits from ModelComponent. Any component that should also have its own region can inherit from SpatialModelComponent. The _region_data property can be overridden to return the region that fits the data on that component. Like the region over a grid, or a mesh.

The region argument is removed from the command line.

Users can set up their region in a component via the arguments in create. parse_region can be found as a workflow.

global:
  components:
    grid:
      type: GridComponent
steps:
  - grid.create:
      region:
        bbox: [1,2,3,4]

The Model.region is determined by the only SpatialModelComponent in the model. If there are more, it should be specified in the config via region_component.

global:
  components:
    grid:
      type: GridComponent
    subgrid:
      type: GridComponent
  region_component: grid

Checklist

  • [ ] Updated tests or added new tests
  • [x] Branch is up to date with v1
  • [x] Tests & pre-commit hooks pass
  • [x] Updated documentation if needed
  • [ ] Updated changelog.rst if needed
  • [x] For predefined catalogs: update the catalog version in the file itself, the references in data/predefined_catalogs.yml, and the changelog in data/changelog.rst

Additional Notes (optional)

Add any additional notes or information that may be helpful.

deltamarnix avatar Apr 16 '24 11:04 deltamarnix

@hboisgon and @savente93 Here is my draft PR. The following things are not yet in:

  • [x] Assigning another region to reference to via the constructor of a component. This might go to a next PR?
  • [x] Model.crs seems to have changed in v1, and I'm not sure if we should keep this new target_crs.
  • [x] MeshComponent is not yet a SpatialModelComponent, since I want to wait for #792 .
  • [x] There was this scenario where an old model doesn't necessarily have a region.geojson file. And we should be able to infer it from the data. Not sure if we need that in SpatialModelComponent, or if that is going to be implemented per component. Could anyone give an example of what that would look like?

deltamarnix avatar Apr 16 '24 11:04 deltamarnix

I also checked the sonar cloud logs, and they are either about the TODO comments or about the deeply nested structure that was already there. I think this is ok for now, it's better to keep that the way it is and refactor that once we know everything is working well, so it shouldn't not block from merging imo

savente93 avatar Apr 16 '24 14:04 savente93

I have done another overhaul and came to a better design with @hboisgon . We went for a leaner SpatialModelComponent that only does the check for a reference component, and it can write the region if it is not referencing to another region. The other functions for bounds, and crs are just interface functions that need to be filled in by MeshComponent and GridComponent.

Not yet implemented: components that are only just pointing to another region. I believe these don't need any base class. For example a ForcingsComponent would just receive a str for region in the __init__ and grab the region from that component onwards.

deltamarnix avatar Apr 18 '24 15:04 deltamarnix

@hboisgon I made changes to the workflow functions for regions and grids. Workflow functions for grid only accept the basic data types, not dict. The parse_region function is split into multiple functions and components are now in charge of what kind it is before calling the correct function. I could even split the basin functions a bit further.

To me it makes sense to have more separate functions as most of the parameters are optional parameters that are only used for specific cases.

I would still love to get rid of _parse_region_value, but I first want to know what you think of the new setup.

I did my best to keep all functionality as implemented in main, but please do take a close look to all the different parse_region functions that I made and the way grid and mesh components are implementing create.

deltamarnix avatar May 06 '24 15:05 deltamarnix

@deltamarnix thanks for the last changes! I am now busy reviewing the PR so please do not change anything anymore before I am done! Thanks :)

hboisgon avatar May 16 '24 03:05 hboisgon

Hi @deltamarnix @savente93 and @Tjalling-dejong

Many many thanks to you all for all the work you did on restructuring the model structure of hydromt! I think it looks really good now!

Why I now tag you all: while reviewing the changes of @deltamarnix for missing docstrings or bugfixing etc I also decided to harmonize and have a second look at all model components and the restructure. So here is a couple of things I would like to draw your attention to and a list of remaining todos.

Changes

For all:

  • Model and workflows API docs is up-to date. Because of the new hydromt_step functionality, I split components functions into a CLI category (ie available from command line and hydromt workflow file) and the rest.
  • The migration guide for model should now be up-to-date and I hope will help plugins get started. We can split it for different users after alpha. I did not change much here, just made sure it was complete :)
  • added test_equal methods to all components but mesh
  • I decided to document Model.components property. It's just nice as a python user to glance at your available model components.

For @deltamarnix :

  • Region_component of grid and mesh: I think we/I were too harsh when I said that grid region will always come from grid and same for mesh. So I re-added for GridComponent and MeshComponent that region_component can point to another component. What I did instead is that for the create method, it raises an error if self._region_component is not None. And in the add_data_from_ methods, I now added some functions to get the grid like or mesh like data either from the component itself or from the reference. I hope you are okay with these changes? This way we do allow a forcing=GridComponent(model, region_component=grid) for which forcing.create_from_region won't work but forcing.add_data_from_rasterdataset will.
  • region_filename: I changed that each spatial component has its own name for the region filename to avoid conflicts.
  • GeomsComponent: implemented _region_data
  • DatasetsComponent and SpatialDatasetsComponent: because of the add_raster_data_from methods, we needed to have a region component for Datasets. Instead of making Datasets a SpatialComponent, I decided to split them. Hope you agree. See the line in the comments for Sam for extra details.
  • grid/mesh workflows: I really like that you split the grid/mesh creation in several functions and decided to make some of them public. I added an extra argument region_crs to the create method for the user to be able to specify the region crs in case of a bbox or if it cant be inferred from file.
  • parse_region: I really like the split :) I changed a little though so that the returned region will still be in its original crs. This mimics the data catalog behavior, where we do not reproject data but users can pass the crs of the data in case it cant be inferred from file. I think it is more intuitive that way if the behavior looks a bit like the data catalog.

For @savente93 :

  • DatasetsComponent: Because of the new SpatialComponent, I decide to split the dict of xarray object into DatasetsComponent if you do not need a region and SpatialDatasetsComponent if you do need a region. This is because the add_raster_data_from_ methods do need a region so these have been moved to SpatialDatasetsComponent. The one thing I do not like so much is that again the set/read/write methods are duplicated but I don't think I can easily have SpatialDatasetsComponent inherit from both DatasetsComponent and SpatialComponent no? Else I also added some tests and one is now skipped waiting for artifact data catalog.
  • VectorComponent: as vector was already different than the rest for default filename definition, I decided to go back to the behavior in main and remove the self._filename and self._geometry_filename properties. So filenames and the defaults are just defined in the vector.read and vector.write functions.

TODO

It would be good if you could have a look at the changes and hopefully you agree with these and else we can still discuss here before closing this PR. @deltamarnix could you still check the failing SonarCloud analysis and see if you can improve?

Before alpha release, there are still two main todos on the model side I think:

  • generate the docs to see if the API and migration guide print nicely for the plugins to get started with v1
  • when artifact data is enabled: update the mock tests for model and components --> while implementing my own test for SpatialDatasetsComponent, I realize that the mock test (at least for grid) would fail with actual data. To ensure we have the same behavior as before, I would really like to get back to the model tests from main. For some tests we can use fixture data and this is fine as the get_data method do handle python objects but the create and add_data_from functions should not be mocked to ensure they are working the way we expect them to (eg variable selection, renaming, reprojection etc). Do you also see this?

Once these are in, I think we are finally done with Model!! Again great job everyone and I am really happy with the progress and how the restructure looks like!

hboisgon avatar May 17 '24 06:05 hboisgon

I did not review the whole PR again, but I'm happy with the changes as described. As far as I'm concerned we're ready to merge, we can address the enabling of tests in a separate PR (imo) as this one is already quite massive.

savente93 avatar May 17 '24 10:05 savente93

whoooo! well done everyone! great work <3

savente93 avatar May 22 '24 10:05 savente93