nowcasting_dataset
Prepare batches of data for training machine learning solar electricity nowcasting models
## Detailed Description
For example, `pbzip2` reduces our NWP batches to 20% of their original size. Hopefully we can achieve similar reductions using "proper" NetCDF compression algorithms. Smaller batches should...
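As a rough way to reason about the size reduction mentioned above, the sketch below measures the compressed size as a fraction of the original for two standard-library codecs. `bz2` uses the same bzip2 format as `pbzip2`, and `zlib` is the deflate algorithm behind NetCDF-4's built-in compression. The `fake_batch` bytes and the `compressed_fraction` helper are hypothetical stand-ins, not real NWP batches or project code, so the ratios it prints say nothing about real data.

```python
import bz2
import zlib
from typing import Callable


def compressed_fraction(raw: bytes, compress: Callable[[bytes], bytes]) -> float:
    """Return the compressed size as a fraction of the original size."""
    return len(compress(raw)) / len(raw)


# Hypothetical stand-in for a batch: a short repeating pattern of byte values.
# Real NWP batches will compress differently.
fake_batch = bytes(i % 16 for i in range(100_000))

bz2_fraction = compressed_fraction(fake_batch, bz2.compress)
zlib_fraction = compressed_fraction(fake_batch, zlib.compress)

print(f"bz2:  {bz2_fraction:.1%} of original size")
print(f"zlib: {zlib_fraction:.1%} of original size")
```

Running the same measurement on a real batch file (read in as bytes) would give a like-for-like comparison between the two codecs.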
## Detailed Description
The new OpticalFlowDataSource writes predicted satellite images to disk. It could also write the optical flow _field_ to disk (i.e. the estimated x and y displacement).
##...
(not urgent.. this can wait until 2022 :slightly_smiling_face: ) I fear that using random numbers in our tests could lead to intermittent (and hence hard-to-debug) unittest failures. Let's not worry...
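One common way to avoid intermittent failures from random test data is to draw the data from a generator with a fixed seed. The following is a minimal sketch using only the standard library. The `make_fake_values` helper and the default seed are hypothetical and do not come from the project's test suite.

```python
import random
from typing import List


def make_fake_values(n: int, seed: int = 42) -> List[float]:
    """Return n pseudo-random floats in [0, 1) that are identical on every call."""
    # A local generator keeps the test independent of global random state.
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


first = make_fake_values(5)
second = make_fake_values(5)

# The same seed always reproduces the same "random" data.
assert first == second
```

With this pattern, a failing test can be re-run with exactly the same inputs, which makes the failure much easier to debug.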
(Let's not worry about this now... just making a note to discuss in early 2022!) As we all know, in order for "fake" data to be useful for testing, the...
The paper described in this video https://youtu.be/FbRcbM4T-50 suggests that pre-training on a mix of diverse datasets is helpful. For us, we want our models to learn video prediction. So we...
The notebook created in PR #506 works. But it needs some manual intervention (e.g. to handle the slightly different data layout in batches for different modalities). This issue is about...
The Pydantic models in the [`data_sources/`](https://github.com/openclimatefix/nowcasting_dataset/blob/main/nowcasting_dataset/data_sources/)`<name>/<name>_model.py` files (where `<name>` is one of {datetime, metadata, gsp, nwp, pv, satellite, sun, topographic}) describe the contents of the batches (and, hence, the contents...
At the moment, the workflow is:

1. Specify the number of batches for each split in the config YAML.
2. Run `prepare_ml_data.py`, which first creates CSV files specifying the locations...
For the Zarr DataSources, it may be faster to load the data into memory _after_ joining (lazily loaded) examples. i.e. call `.load()` towards the end of `get_batch()` instead of at...
## Detailed Description
For example, the logs for #437 are ambiguous about which Datasource is producing the warning.
## Possible Implementation
I think we need some code somewhere which wraps...
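One possible way to make the origin of a warning unambiguous is to prefix every log message with the name of the data source that emitted it. This is only a sketch and not necessarily what the issue goes on to propose. It uses the standard library's `logging.LoggerAdapter`. The class `DataSourceLogAdapter`, the helper `get_data_source_logger`, and the logger name `nowcasting_dataset_sketch` are all hypothetical.

```python
import logging


class DataSourceLogAdapter(logging.LoggerAdapter):
    """Prefix every message with the name of the originating data source."""

    def process(self, msg, kwargs):
        # self.extra is the dict passed to the adapter's constructor.
        return f"[{self.extra['data_source']}] {msg}", kwargs


def get_data_source_logger(data_source_name: str) -> logging.LoggerAdapter:
    """Return a logger whose messages are tagged with data_source_name."""
    base_logger = logging.getLogger("nowcasting_dataset_sketch")  # hypothetical name
    return DataSourceLogAdapter(base_logger, {"data_source": data_source_name})


logging.basicConfig(level=logging.WARNING, format="%(levelname)s %(message)s")

logger = get_data_source_logger("satellite")
logger.warning("example warning")
```

Each data source would hold its own adapter, so a warning raised inside shared code still carries the name of the source that triggered it.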