Size of subset files too large (`salem.roi`)
Hello there! I have been trying to use salem.roi() to subset the netCDF data using a shapefile. However, I have noticed that the file size remains the same despite selecting a smaller area, and it is also taking a significantly longer time to run. Can anyone offer any assistance or advice on this matter? Your help would be greatly appreciated.
Check the code below:
for shapefile in shapefiles:
    # open the shapefile
    county = salem.read_shapefile(shapefile)
    # subset the netcdf data for the county
    county_data = data.salem.roi(shape=county)
    file_name = os.path.basename(shapefile)
    # save county netcdf file
    county_data.to_netcdf(save_path + "/" + file_name[:-4] + "_river_discharge.nc")
    # close the netcdf file
    county_data.close()
Hello, if you want to subset the data, you should use subset, not ROI. ROI only masks the data: the grid keeps its original extent. See:
https://salem.readthedocs.io/en/stable/xarray_acc.html#subsetting-data https://salem.readthedocs.io/en/stable/xarray_acc.html#regions-of-interest
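To illustrate why masking alone leaves the file size unchanged, here is a minimal NumPy sketch. It does not use salem; the 6x8 grid and the rectangular region are made-up stand-ins for the netCDF variable and the county shape, and it only mimics the general idea of masking versus cropping.

```python
import numpy as np

# Hypothetical 6x8 grid standing in for one time step of the netCDF variable.
field = np.arange(48, dtype=float).reshape(6, 8)

# Hypothetical region of interest: rows 2-3, columns 3-5.
inside = np.zeros(field.shape, dtype=bool)
inside[2:4, 3:6] = True

# Masking: the array keeps its full shape, values outside become NaN.
masked = np.where(inside, field, np.nan)

# Subsetting: crop the array to the bounding box of the region.
rows = np.where(inside.any(axis=1))[0]
cols = np.where(inside.any(axis=0))[0]
cropped = field[rows.min():rows.max() + 1, cols.min():cols.max() + 1]

print(masked.shape, masked.nbytes)    # (6, 8) 384
print(cropped.shape, cropped.nbytes)  # (2, 3) 48
```

The masked array still stores every grid cell, so an uncompressed file written from it would be as large as the original, while the cropped array only holds the cells inside the bounding box.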
@fmaussion Thank you. I will try the subset function.
@fmaussion I tried it and it appears to be working, but I am getting data that falls outside the polygon of the shapefile. Is there a way to remove this data that exceeds the boundary of the shapefile? I want to select the points on land, instead of those on the sea. Any help would be appreciated.
See this figure:
@javedali99 yes: subset first to make the domain smaller, then mask with region of interest (ROI). This is explained here: https://salem.readthedocs.io/en/stable/xarray_acc.html#subsetting-data
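A minimal sketch of that two-step approach, following the pattern in the linked documentation. The helper name `subset_then_mask` and the default `margin=1` are illustrative choices, not part of salem; `data` is assumed to be an xarray object opened with salem imported (so the `.salem` accessor is available), and `shape` a GeoDataFrame such as the one returned by `salem.read_shapefile`.

```python
def subset_then_mask(data, shape, margin=1):
    """Crop `data` to the extent of `shape`, then mask points outside it.

    Hypothetical helper: `margin` (in grid points) is an illustrative
    default that keeps a small border around the shape before masking.
    """
    # Step 1: subset, which shrinks the domain to the shape's extent.
    cropped = data.salem.subset(shape=shape, margin=margin)
    # Step 2: ROI, which sets grid points outside the polygon to NaN.
    return cropped.salem.roi(shape=shape)


# Usage inside the loop from the question (requires `import salem`):
#     county = salem.read_shapefile(shapefile)
#     county_data = subset_then_mask(data, county)
#     county_data.to_netcdf(...)
```

The order matters: subsetting first makes the grid small, so the masking step and the write to disk only touch the cells around the county.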