
Method to drop 'padded' pv systems

Open peterdudfield opened this issue 3 years ago • 2 comments

Detailed Description

It would be useful to have a method to drop 'padded' PV systems. These are padded out with zeros so that the dataset can be saved in an efficient way. However, for plotting it's useful to drop the 'padded' systems.

It would also be good to do this for 'gsp'.

Possible Implementation

In PV (from nowcasting_dataset.data_sources.pv.pv_data_source import PV), have a function that removes any zero values. This would apply to 'data', 'pv_systems', 'x_coords' and 'y_coords'.
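A rough sketch of what such a helper could look like, using plain NumPy arrays rather than the actual PV class. The function name `drop_padded_systems`, the `(time, system)` layout of 'data', and the padding value of 0 are all assumptions for illustration, not the library's API. To avoid throwing away real systems, it only treats a system as padded when its ID, both coordinates, and every reading equal the padding value:

```python
import numpy as np


def drop_padded_systems(data, pv_systems, x_coords, y_coords, pad_value=0):
    """Hypothetical helper: remove systems that look like padding.

    Assumes `data` has shape (time, system) and the other three arrays
    have shape (system,). None of this is confirmed by the library.
    """
    data = np.asarray(data)
    pv_systems = np.asarray(pv_systems)
    x_coords = np.asarray(x_coords)
    y_coords = np.asarray(y_coords)

    # A system is padding only if *everything* about it equals pad_value.
    is_padded = (
        (pv_systems == pad_value)
        & (x_coords == pad_value)
        & (y_coords == pad_value)
        & np.all(data == pad_value, axis=0)
    )
    keep = ~is_padded
    return data[:, keep], pv_systems[keep], x_coords[keep], y_coords[keep]


# Example: system 22 reports zero power but has real coordinates, so it stays.
data = np.array([[1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
kept_data, kept_ids, kept_x, kept_y = drop_padded_systems(
    data, [11, 22, 0], [100.0, 200.0, 0.0], [5.0, 6.0, 0.0]
)
print(kept_ids.tolist())  # [11, 22]
```

The same function could in principle be reused for 'gsp' if its arrays follow the same layout, which is also an assumption here.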

peterdudfield avatar Nov 12 '21 09:11 peterdudfield

Interesting!

I'm probably remembering wrong but I thought the code padded PV data with NaNs, not zeros?

Is it possible that the zeros are "legitimate"? (e.g. zero power generation at night?)

JackKelly avatar Nov 12 '21 10:11 JackKelly

Yeah, we've got to be careful that the zeros are not true values. This could be done by looking at x_coords, as 0 is not realistic there. Another option is to change them back to NaNs, and then do some filling before passing to ML models.
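Something like the following is what I have in mind for the second option. It is only a sketch on plain NumPy arrays: the names `padded_to_nan` and `fill_nans`, the `(time, system)` layout of the data, and the choice of 0 as the fill value are assumptions, not existing code.

```python
import numpy as np


def padded_to_nan(data, x_coords):
    """Hypothetical: mark padded systems as NaN, using x_coords == 0 as the flag.

    Assumes `data` has shape (time, system) and `x_coords` has shape (system,).
    """
    data = np.asarray(data, dtype=float).copy()
    padded = np.asarray(x_coords) == 0
    data[:, padded] = np.nan
    return data


def fill_nans(data, fill_value=0.0):
    """Hypothetical: replace NaNs just before handing the data to an ML model."""
    data = np.asarray(data, dtype=float)
    return np.where(np.isnan(data), fill_value, data)


# System 2 has real coordinates and legitimately zero power, so it is untouched.
data = np.array([[1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
with_nans = padded_to_nan(data, x_coords=[100.0, 200.0, 0.0])
model_input = fill_nans(with_nans)
```

With NaNs in place, plotting code can skip the padded systems, and the true night-time zeros stay distinguishable from padding.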

peterdudfield avatar Nov 12 '21 10:11 peterdudfield