pyuvdata icon indicating copy to clipboard operation
pyuvdata copied to clipboard

Partial Reading for Beamfits

Open steven-murray opened this issue 3 years ago • 2 comments

Description

I added three parameters to the read_beamfits method: freq_range, az_range and za_range. Each should be a 2-tuple of floats specifying the range to read in. This is meant to reduce peak memory usage. It is enabled by the FITS memmap feature.

Motivation and Context

Closes #1201

Types of changes

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [x] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] Documentation change (documentation changes only)
  • [ ] Version change
  • [ ] Build or continuous integration change

Checklist:

  • [x] I have read the contribution guide.
  • [x] My code follows the code style of this project.

New feature checklist:

  • [x] I have added or updated the docstrings associated with my feature using the numpy docstring format.
  • [ ] I have updated the tutorial to highlight my new feature (if appropriate).
  • [x] I have added tests to cover my new feature.
  • [x] All new and existing tests pass.
  • [x] I have updated the CHANGELOG.

steven-murray avatar Aug 25 '22 21:08 steven-murray

@bhazelton and @mkolopanis : this is essentially working (I will write some tests soon). My non-unit-test way of testing this was to run the following script:

import os
import numpy as np
from pyuvdata import UVBeam
from pyuvdata.data import DATA_PATH


filenames = ["HERA_NicCST_150MHz.txt", "HERA_NicCST_123MHz.txt"]
cst_folder = "NicCSTbeams"
cst_files = [os.path.join(DATA_PATH, cst_folder, f) for f in filenames] * 5

def make_cst_beam(beam_type):
    """Make the default CST testing beam."""
    extra_keywords = {
        "software": "CST 2016",
        "sim_type": "E-farfield",
        "layout": "1 antenna",
        "port_num": 1,
    }

    beam = UVBeam()
    beam.read_cst_beam(
        cst_files,
        beam_type=beam_type,
        frequency=np.linspace(120e6, 170e6, 10),
        telescope_name="HERA",
        feed_name="Dipole",
        feed_version="1.0",
        feed_pol=["x"],
        model_name="Dipole - Rigging height 4.9 m",
        model_version="1.0",
        x_orientation="east",
        reference_impedance=100,
        history=(
            "Derived from https://github.com/Nicolas-Fagnoni/Simulations."
            "\nOnly 2 files included to keep test data volume low."
        ),
        extra_keywords=extra_keywords,
    )
    return beam


if __name__=="__main__":
    from pathlib import Path

    fname = Path("tempbeam.fits")

    if not fname.exists():
        beam = make_cst_beam('efield')
        beam.write_beamfits("tempbeam.fits")
        del beam
    
    uvb= UVBeam()
    uvb.read_beamfits(
        'tempbeam.fits',  
#         za_range=(0, 95), 
#         freq_range=(120e6, 145e6),
       freq_range=(0, np.inf),       
    )

(Notice that the make_cst_beam function is just out of the conftest.py). Then I switched the last line to pass different combinations of za_range and freq_range. I ran the script through memory-profiler. I'm attaching the (redacted) outputs:

freq_selector.txt full_selector.txt no_selector.txt za_selector.txt

The things to notice are:

  1. If you specify the ranges such that everything gets read in, there's no difference to the memory compared to not specifying any ranges.
  2. The reason the downselect statements don't take any memory is because data is a memmap array, and we're doing simple indexing. This relies on the regular-spaced frequencies/az/za that beamfits requires.
  3. There is something a little weird (not particular to this PR) -- for the full array, the size should be about 39 MB, however it seems to be using about double that memory when creating the complex array. I presume this is because its loading the memory into data, and then creating the complex array, then deleting data. So the peak memory is ~twice the original data. I feel sure this is something that can be gotten around...

steven-murray avatar Aug 25 '22 22:08 steven-murray

Codecov Report

Merging #1205 (6ff1e57) into main (5a6b3b6) will increase coverage by 0.00%. The diff coverage is 100.00%.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #1205   +/-   ##
=======================================
  Coverage   99.90%   99.90%           
=======================================
  Files          32       32           
  Lines       17358    17405   +47     
=======================================
+ Hits        17342    17389   +47     
  Misses         16       16           
Impacted Files Coverage Δ
pyuvdata/uvbeam/uvbeam.py 99.62% <ø> (ø)
pyuvdata/uvbeam/beamfits.py 100.00% <100.00%> (ø)
pyuvdata/uvdata/uvdata.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 5a6b3b6...6ff1e57. Read the comment docs.

codecov[bot] avatar Aug 31 '22 08:08 codecov[bot]

@mkolopanis and @bhazelton I believe this PR is ready.

steven-murray avatar Sep 07 '22 21:09 steven-murray