Jack Kelly issues

Results: 285 issues of Jack Kelly

Start collecting intra-day Sheffield Solar PV Live Regional (GSP-level)

## Detailed Description See https://github.com/openclimatefix/predict_pv_yield/issues/90 for more context

enhancement

data

Add GSP shape to dataset

When the model is predicting PV yield for an entire Grid Supply Point region, it may be useful for the model to get a binary mask showing the geographical shape...

enhancement

data

Experiment with better compression for on-disk batches

## Detailed Description For example, `pbzip2` reduces our NWP batches to 20% of their original size. Hopefully we can achieve similar reductions using "proper" NetCDF compression algorithms. Smaller batches should...

enhancement

data
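As a rough illustration of the comparison the compression issue above proposes, the sketch below measures how much two standard-library codecs shrink the same byte buffer. The synthetic `values` array is a hypothetical stand-in for a batch, not real NWP data, and `zlib` is used only because NetCDF-4 files can use zlib (deflate) compression; the actual ratios on real batches may differ a lot.

```python
import bz2
import struct
import zlib


def compression_ratio(raw: bytes, compressed: bytes) -> float:
    """Return the compressed size as a fraction of the raw size."""
    return len(compressed) / len(raw)


# Hypothetical stand-in for a batch: repetitive float32 values.
values = [float(i % 50) for i in range(100_000)]
raw = struct.pack(f"{len(values)}f", *values)

# bz2 is the algorithm behind pbzip2; zlib is the deflate codec.
bz2_ratio = compression_ratio(raw, bz2.compress(raw))
zlib_ratio = compression_ratio(raw, zlib.compress(raw, level=6))

print(f"bz2:  {bz2_ratio:.1%} of original size")
print(f"zlib: {zlib_ratio:.1%} of original size")
```

Running the same measurement on a real batch file would show whether built-in NetCDF compression can get close to the 20% figure quoted for `pbzip2`.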

Write optical flow field to disk

## Detailed Description The new OpticalFlowDataSource writes predicted satellite images to disk. It could also write the optical flow _field_ to disk (i.e. the estimated x and y displacement). ##...

enhancement

data

Discussion: Should we be using random numbers in our tests?

(Not urgent... this can wait until 2022 :slightly_smiling_face:) I fear that using random numbers in our tests could lead to intermittent (and hence hard-to-debug) unit test failures. Let's not worry...

discussion
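One common way to keep random test data while avoiding intermittent failures is to draw it from a generator with a fixed seed. The sketch below shows the idea; `make_fake_pv_yield` and its value range are hypothetical and are not taken from the project's test suite, and this is not necessarily the approach the discussion settled on.

```python
from __future__ import annotations

import random


def make_fake_pv_yield(n: int, seed: int) -> list[float]:
    """Return n pseudo-random values in [0, 1] from a private, seeded generator."""
    # A dedicated Random instance avoids touching the global random state,
    # so other tests cannot change the numbers this function produces.
    rng = random.Random(seed)
    return [rng.uniform(0.0, 1.0) for _ in range(n)]


def test_fake_data_is_reproducible() -> None:
    # The same seed always yields the same sequence, so a failure can be replayed.
    assert make_fake_pv_yield(5, seed=42) == make_fake_pv_yield(5, seed=42)


test_fake_data_is_reproducible()
```

With this pattern a failing test can be re-run with the same seed and will see exactly the same inputs.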

Discussion: For testing, should we use "fake" data or a small amount of real data?

(Let's not worry about this now... just making a note to discuss in early 2022!) As we all know, in order for "fake" data to be useful for testing, the...

discussion

Pre-train on diverse mix of video datasets

The paper described in this video https://youtu.be/FbRcbM4T-50 suggests that pre-training on a mix of diverse datasets is helpful. For us, we want our models to learn video prediction. So we...

enhancement

data

Convert `notebooks/compute_stats_from_batches.ipynb` into a fully-automated script

The notebook created in PR #506 works. But it needs some manual intervention (e.g. to handle the slightly different data layout in batches for different modalities). This issue is about...

enhancement

Document the contents & shapes of the NetCDF files that `nowcasting_dataset` outputs?

The Pydantic models in the [`data_sources/`](https://github.com/openclimatefix/nowcasting_dataset/blob/main/nowcasting_dataset/data_sources/)`<name>/<name>_model.py` files (where `<name>` is one of {datetime, metadata, gsp, nwp, pv, satellite, sun, topographic}) describe the contents of the batches (and, hence, the contents...

discussion

documentation

enhancement

good first issue

Enable nowcasting dataloader to append to the CSV files to create more batches

At the moment, the workflow is:

1. Specify the number of batches for each split in the config YAML.
2. Run `prepare_ml_data.py`, which first creates CSV files specifying the locations...

enhancement
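The appending behaviour requested in the issue above could look roughly like the following sketch, which adds rows to an existing CSV file and writes the header only when the file is new or empty. The helper `append_rows`, the file name, and the column names are all hypothetical; the real layout of the CSV files written by `prepare_ml_data.py` is not shown in the excerpt.

```python
import csv
import tempfile
from pathlib import Path
from typing import Dict, List


def append_rows(csv_path: Path, rows: List[Dict[str, object]], fieldnames: List[str]) -> None:
    """Append rows to csv_path, writing the header only for a new or empty file."""
    write_header = not csv_path.exists() or csv_path.stat().st_size == 0
    with csv_path.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if write_header:
            writer.writeheader()
        writer.writerows(rows)


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "locations.csv"  # hypothetical file name
    fields = ["t0_datetime", "x_center", "y_center"]  # hypothetical columns
    # First run creates the file; the second run extends it.
    append_rows(path, [{"t0_datetime": "2021-06-01T12:00", "x_center": 1, "y_center": 2}], fields)
    append_rows(path, [{"t0_datetime": "2021-06-01T12:30", "x_center": 3, "y_center": 4}], fields)
    with path.open(newline="") as f:
        rows_read = list(csv.DictReader(f))

print(len(rows_read))  # 2
```

Opening the file in append mode leaves the existing rows untouched, so previously created batches would not need to be regenerated.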