pyuvdata
Select on read for MWA correlator fits
We've got a project that is hitting a bottleneck due to the lack of select on read for MWA correlator fits files. We should discuss the feasibility of implementing it (and if there is even a potential speedup given the nature of the file format).
I think there are potential options, but it depends a bit on which axes you want to select on. Some axes, like time, frequency and polarization, are easier than e.g. baseline.
It’s baseline that we need…
Somewhat tangential, but there may be some low-hanging fruit in terms of file read speed-ups:
coarse_chan_data[time_ind, :, :] = (
    hdu.data[:, 0::2] + 1j * hdu.data[:, 1::2]
)
This looks a lot like some old code that used to be in the UVH5 reader, where you can get a reasonable speedup by using ndarray.view to reinterpret the interleaved real/imaginary values as a complex dtype instead of building the complex array arithmetically (could be as much as a factor of 2, though given that you loop over time indices I doubt it'll be quite that good, but it's a quick one-line edit).
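For reference, a minimal sketch of the ndarray.view idea. It assumes the HDU data is a floating-point array with real and imaginary parts interleaved along the last axis, as the snippet above implies; I haven't checked this against the actual MWA reader, and the helper name `interleaved_to_complex` is made up for illustration.

```python
import numpy as np


def interleaved_to_complex(data):
    """Reinterpret a (..., 2*N) float array of (re, im) pairs as (..., N) complex.

    Hypothetical helper: assumes a floating-point dtype with real and
    imaginary parts interleaved along the last axis.
    """
    if data.dtype.kind != "f":
        raise TypeError("expected a floating-point array")
    # view() needs the last axis to be contiguous in memory; this is a
    # no-op (no copy) if the array is already C-contiguous.
    data = np.ascontiguousarray(data)
    # Build the matching complex dtype, keeping the byte order of the input
    # (FITS data is often big-endian): float32 -> complex64, float64 -> complex128.
    complex_dtype = np.dtype(f"{data.dtype.byteorder}c{2 * data.dtype.itemsize}")
    return data.view(complex_dtype)


# Stand-in for hdu.data: 3 rows of 4 interleaved (re, im) pairs.
fake_hdu_data = np.arange(24, dtype=np.float32).reshape(3, 8)
via_view = interleaved_to_complex(fake_hdu_data)
via_arithmetic = fake_hdu_data[:, 0::2] + 1j * fake_hdu_data[:, 1::2]
```

The view version avoids the two strided reads and the temporary arrays created by the addition, which is where any speedup would come from. If the data turns out to be an integer dtype, this trick does not apply directly.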
Do you want to select a large or small number of baselines?
Small, in the sense of less than half of the baselines. Maybe a quarter tops?