nowcasting_dataset

Prepare batches of data for training machine learning models for solar electricity nowcasting

Results 101 nowcasting_dataset issues

## Detailed Description

For example, `pbzip2` reduces our NWP batches to 20% of their original size. Hopefully we can achieve similar reductions using "proper" NetCDF compression algorithms. Smaller batches should...

enhancement
data

## Detailed Description

The new OpticalFlowDataSource writes predicted satellite images to disk. It could also write the optical flow _field_ to disk (i.e. the estimated x and y displacement). ##...

enhancement
data

(Not urgent; this can wait until 2022 :slightly_smiling_face:) I fear that using random numbers in our tests could lead to intermittent (and hence hard-to-debug) unit test failures. Let's not worry...

discussion
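
One common way to keep randomised test data reproducible is to draw it from a locally seeded generator, so every test run sees the same values. The snippet below is only a minimal sketch of that idea using the standard library; the function name `make_fake_values` and the default seed are hypothetical and do not come from the repository. With NumPy, `numpy.random.default_rng(seed)` plays the same role.

```python
import random
from typing import List


def make_fake_values(n: int, seed: int = 42) -> List[float]:
    """Return n pseudo-random floats in [0, 1) from a locally seeded generator.

    Hypothetical helper for illustration only.
    """
    # A local generator leaves the global `random` state untouched, so other
    # tests cannot change the values produced here.
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


# The same seed always yields the same "random" test data.
assert make_fake_values(5) == make_fake_values(5)
```

Fixing the seed trades coverage of unusual inputs for repeatability, which is usually the right trade for unit tests that must not fail intermittently.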

(Let's not worry about this now... just making a note to discuss in early 2022!) As we all know, in order for "fake" data to be useful for testing, the...

discussion

The paper described in this video https://youtu.be/FbRcbM4T-50 suggests that pre-training on a mix of diverse datasets is helpful. For us, we want our models to learn video prediction. So we...

enhancement
data

The notebook created in PR #506 works. But it needs some manual intervention (e.g. to handle the slightly different data layout in batches for different modalities). This issue is about...

enhancement

The Pydantic models in the [`data_sources/`](https://github.com/openclimatefix/nowcasting_dataset/blob/main/nowcasting_dataset/data_sources/)`<data_source_name>/<data_source_name>_model.py` files (where `<data_source_name>` is one of {datetime, metadata, gsp, nwp, pv, satellite, sun, topographic}) describe the contents of the batches (and, hence, the contents...

discussion
documentation
enhancement
good first issue

At the moment, the workflow is:

1. Specify the number of batches for each split in the config YAML.
2. Run `prepare_ml_data.py`, which first creates CSV files specifying the locations...

enhancement

For the Zarr DataSources, it may be faster to load the data into memory _after_ joining (lazily loaded) examples, i.e. call `.load()` towards the end of `get_batch()` instead of at...

enhancement

## Detailed Description

For example, the logs for #437 are ambiguous about which Datasource is producing the warning.

## Possible Implementation

I think we need some code somewhere which wraps...

enhancement
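
The implementation proposed in the issue is truncated above, so the following is not that proposal. It is just one possible way to make log messages say which data source they came from, sketched with the standard library's `logging.LoggerAdapter`. The class name `DataSourceLoggerAdapter`, the helper `get_data_source_logger`, and the logger name are all hypothetical.

```python
import logging


class DataSourceLoggerAdapter(logging.LoggerAdapter):
    """Hypothetical adapter that prefixes each message with a data source name."""

    def process(self, msg, kwargs):
        # `self.extra` is the dict passed to the adapter's constructor.
        return f"[{self.extra['data_source']}] {msg}", kwargs


def get_data_source_logger(data_source_name: str) -> logging.LoggerAdapter:
    """Return a logger whose messages are tagged with `data_source_name`."""
    base_logger = logging.getLogger("nowcasting_dataset_example")  # assumed name
    return DataSourceLoggerAdapter(base_logger, {"data_source": data_source_name})


logging.basicConfig(level=logging.WARNING, format="%(levelname)s %(message)s")
logger = get_data_source_logger("nwp")
logger.warning("Something looks wrong in this batch")
```

Each data source would hold its own adapter, so a warning raised deep inside shared code still carries the name of the source that triggered it.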