pyuvdata icon indicating copy to clipboard operation
pyuvdata copied to clipboard

Add support for CASA-based MS calibration tables

Open kartographer opened this issue 1 year ago • 14 comments

Adds the MSCal class, and other related features to UVCal.

Description

~NOTE: This PR is an early draft and still WIP -- internal consistency within pyuvdata has been checked, but CASA testing has been much more limited. Comments/feedback are very much welcome!~

At this point, the PR has advanced out of draft stage, and is ready for "full" review (though I think I've already gotten a good number of helpful comments thusfar). So far the code has been tested against SMA and LWA data, and I think is ready for a sort of "beta" release. There's a couple of outstanding items to take care of, which are on the checklist below (and will be ticked off upon completion accordingly). Comments/feedback still very much welcome.


The MSCal class -- a subclass of UVCal -- has been added, which enables read/write support for MS calibration tables with sky-based calibration schemes (gains, bandpass, delays). To support full loopback of the information with the MS tables, handling of phase center information (and the associated parameters Nphase, phase_center_catalog, and phase_center_id_array) has been added, along with the ability to create "flex-Jones" objects, which mimics the UVData flex-pol functionality (support for which required the addition of the optional parameter flex_jones_array). Additional optional parameters also include scan_number_array and antenna_diameters -- all of the above mimicking their UVData counterparts.

One additional parameter has been added -- ref_antenna_array -- which allows for the reference antenna to vary across the time-axis of a UVCal object. The MS format supports such variation, and given that there's no reason one has to keep the same reference antenna across a gains solution, I decided to add support for it to provide better preservation of information when reading/writing MS gains table.

SMA-based MS calibration files have also been added for testing, which will also eventually got folded in the relevant tutorial. Some additional clean-up has also been done in the data folder of the repository to reduce the overall footprint for packaging (tarball was ~65 MB, now appears to be ~45 MB from tests on my laptop).

Separately, a bug affecting UVBase.__eq__ has been fixed, where allowed_failures was being ignored if a parameter was required (i.e., non-optional). An additional bug which affected how select, __add__, and fast_concat dealt with identifying desired or overlapping frequency channels, which only affected UVCal objects where flex_spw=True that had more than one spectral window.

Motivation and Context

Adds support for reading/writing UVCal objects into Measurement Set format, so that they may be used within/read from various CASA routines.

Closes #143 Closes #1016 Closes #1202 Closes #1407

Types of changes

  • [x] Bug fix (non-breaking change which fixes an issue)
  • [x] New feature (non-breaking change which adds functionality)

Checklist:

  • [x] I have read the contribution guide.
  • [x] My code follows the code style of this project.

Bug fix checklist:

  • [x] My fix includes a new test that breaks as a result of the bug (if possible).
  • [x] All new and existing tests pass.
  • [x] I have updated the CHANGELOG.

New feature checklist:

  • [x] I have added or updated the docstrings associated with my feature using the numpy docstring format.
  • [x] I have updated the tutorial to highlight my new feature (if appropriate).
  • [x] I have added tests to cover my new feature.
  • [x] All new and existing tests pass.
  • [x] I have updated the CHANGELOG.

kartographer avatar Feb 05 '24 20:02 kartographer

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 99.93%. Comparing base (8355244) to head (332efce).

Additional details and impacted files
@@             Coverage Diff             @@
##           prep_v3.0    #1391    +/-   ##
===========================================
  Coverage      99.92%   99.93%            
===========================================
  Files             39       41     +2     
  Lines          21488    22390   +902     
===========================================
+ Hits           21472    22375   +903     
+ Misses            16       15     -1     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Feb 10 '24 16:02 codecov[bot]

I'm testing this branch with some calibration solutions I'm working with. I have a CASA-generated *.bcal file and a pyuvdata-generated *.calfits file.

  1. Reading the *.bcal file with cal_orig.read_ms_file(path/to/file.bcal) worked, and cal_orig.check() passes.
  2. I read the *.calfits file with cal.read_calfits(path/to/file.calfits), and cal.check() passes.
  3. I tried writing the cal object with cal.write_ms_cal(path) and got this error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/devel/rbyrne/envs/pyuvdata/lib/python3.8/site-packages/pyuvdata/uvcal/uvcal.py", line 6440, in write_ms_cal
    ms_cal_obj.write_ms_cal(filename, **kwargs)
  File "/opt/devel/rbyrne/envs/pyuvdata/lib/python3.8/site-packages/pyuvdata/uvcal/ms_cal.py", line 410, in write_ms_cal
    time_array = (Time(time_array, format="jd", scale="utc").mjd * 86400.0)[
  File "/opt/devel/rbyrne/envs/pyuvdata/lib/python3.8/site-packages/astropy/time/core.py", line 1580, in __init__
    self._init_from_vals(val, val2, format, scale, copy,
  File "/opt/devel/rbyrne/envs/pyuvdata/lib/python3.8/site-packages/astropy/time/core.py", line 433, in _init_from_vals
    self._time = self._get_time_fmt(val, val2, format, scale,
  File "/opt/devel/rbyrne/envs/pyuvdata/lib/python3.8/site-packages/astropy/time/core.py", line 499, in _get_time_fmt
    raise ValueError(
ValueError: Input values did not match the format class jd:
TypeError: for jd class, input should be doubles, string, or Decimal, and second values are only allowed for doubles.
  1. I set cal.time_array = np.array([np.mean(cal_orig.time_array)]) and cal.lst_array = np.array([np.mean(cal_orig.lst_array)]) (.calfits file only has one time step) and tried rerunning cal.write_ms_cal(path) and got this error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/devel/rbyrne/envs/pyuvdata/lib/python3.8/site-packages/pyuvdata/uvcal/uvcal.py", line 6440, in write_ms_cal
    ms_cal_obj.write_ms_cal(filename, **kwargs)
  File "/opt/devel/rbyrne/envs/pyuvdata/lib/python3.8/site-packages/pyuvdata/uvcal/ms_cal.py", line 446, in write_ms_cal
    self.antenna_names.index(self.ref_antenna_name)
AttributeError: 'numpy.ndarray' object has no attribute 'index'
  1. I set cal.antenna_names = list(cal.antenna_names) and confirmed that cal.check() succeeds. I then reran cal.write_ms_cal(path) and got this error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/devel/rbyrne/envs/pyuvdata/lib/python3.8/site-packages/pyuvdata/uvcal/uvcal.py", line 6440, in write_ms_cal
    ms_cal_obj.write_ms_cal(filename, **kwargs)
  File "/opt/devel/rbyrne/envs/pyuvdata/lib/python3.8/site-packages/pyuvdata/uvcal/ms_cal.py", line 532, in write_ms_cal
    subarr = np.transpose(subarr, [2, 0, 1, 3])
  File "<__array_function__ internals>", line 200, in transpose
  File "/opt/devel/rbyrne/envs/pyuvdata/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 668, in transpose
    return _wrapfunc(a, 'transpose', axes)
  File "/opt/devel/rbyrne/envs/pyuvdata/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 57, in _wrapfunc
    return bound(*args, **kwds)
ValueError: axes don't match array

Let me know if you want the files I used for this test.

rlbyrne avatar Mar 03 '24 13:03 rlbyrne

Thanks @rlbyrne -- yes, if you could send along the files, it'd be helpful for trying to chase down the issues you were running into.

kartographer avatar Mar 03 '24 21:03 kartographer

Thanks @rlbyrne -- yes, if you could send along the files, it'd be helpful for trying to chase down the issues you were running into.

Just sent them via Slack

rlbyrne avatar Mar 03 '24 21:03 rlbyrne

@rlbyrne -- looks like that final issue are running into is because I'm assuming future array shapes without explicitly checking. I'll fix that in the code, but I think if you run the UVCal.use_future_array_shapes method it should hopefully work.

kartographer avatar Mar 03 '24 22:03 kartographer

Recording this here for the sake of documentation -- I've at least had one successful full test of exporting SMA-based solutions into CASA and applying them via applycal. Both phases and amplitudes look as expected and are in agreement with what our internal pipeline has generated. There seems to be a slightly unexpected "extra" conjugation required for the gains (that effectively cancels out that normally seen reading in an MS file, and that I can reasonable understand without too much tortured logic), although it may be helpful to verify that the same behavior is seen in delay solns as well. Given this successful test, this PR is looking like it's ready to advance out of draft status...

Screenshot 2024-03-19 at 06 57 42 Screenshot 2024-03-19 at 06 57 22

kartographer avatar Mar 19 '24 14:03 kartographer

@rlbyrne -- looks like that final issue are running into is because I'm assuming future array shapes without explicitly checking. I'll fix that in the code, but I think if you run the UVCal.use_future_array_shapes method it should hopefully work.

@kartographer I'm just getting back to this after travel... I used write_ms_cal after running use_future_array_shapes . The function did produce a *.bcal file, but it looks like there's a bug in the path parsing. In addition to the intended directory (cal46_newcal_rewritten.bcal), it made 5 other directories in the same location called cal46_newcal_rewritten.bcal::ANTENNA, cal46_newcal_rewritten.bcal::FIELD, cal46_newcal_rewritten.bcal::HISTORY, cal46_newcal_rewritten.bcal::OBSERVATION, and cal46_newcal_rewritten.bcal::SPECTRAL_WINDOW. I assume the "::" should just be "/" and that would fix this?

rlbyrne avatar Mar 22 '24 21:03 rlbyrne

@kartographer I'm just getting back to this after travel... I used write_ms_cal after running use_future_array_shapes . The function did produce a *.bcal file, but it looks like there's a bug in the path parsing. In addition to the intended directory (cal46_newcal_rewritten.bcal), it made 5 other directories in the same location called cal46_newcal_rewritten.bcal::ANTENNA, cal46_newcal_rewritten.bcal::FIELD, cal46_newcal_rewritten.bcal::HISTORY, cal46_newcal_rewritten.bcal::OBSERVATION, and cal46_newcal_rewritten.bcal::SPECTRAL_WINDOW. I assume the "::" should just be "/" and that would fix this?

Hey @rlbyrne -- I do remember seeing this error at some point, but I believe I've fixed in the last couple of weeks. Trying with both files you sent (which now should alert you when you aren't using future array shapes), I seem to get sensible looking files. Could I ask you to delete/re-pull the branch and try again to see if you see the same behavior?

kartographer avatar Mar 25 '24 13:03 kartographer

@kartographer I'm still getting the error. I'm on branch casa_gains and it is up to date.

rlbyrne avatar Mar 25 '24 22:03 rlbyrne

@rlbyrne -- can you check what version of casacore you have installed? Just want to rule out an external API issue...

kartographer avatar Mar 26 '24 15:03 kartographer

@kartographer I'm running casacore 3.4.0

rlbyrne avatar Mar 26 '24 21:03 rlbyrne

@rlbyrne -- ah, okay, that may explain what's going on (and is resonating with my prior recollection now). I had to up the requirement for casacore to 3.5.0, which has a change in the API which interprets the '::' as being a marker for a sub-table, which appropriately indexes the information in the main MS table (there are alternative ways of linking the tables, but it requires a pretty lengthy workaround per table, which gets to be particularly obnoxious with the gains tables). Can you try upgrading and see if the problem persists?

kartographer avatar Mar 26 '24 21:03 kartographer

@kartographer Ok that seems to have worked! Thank you.

rlbyrne avatar Mar 26 '24 22:03 rlbyrne

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 99.92%. Comparing base (a27a89b) to head (70f7fed).

Additional details and impacted files
@@             Coverage Diff             @@
##           prep_v3.0    #1391    +/-   ##
===========================================
  Coverage      99.92%   99.92%            
===========================================
  Files             39       41     +2     
  Lines          21486    22399   +913     
===========================================
+ Hits           21470    22383   +913     
  Misses            16       16            

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Apr 01 '24 19:04 codecov[bot]