TreeCorr
TreeCorr copied to clipboard
Support SACC output format
LSST DESC is planning to use SACC files for passing around two-point data. Would be nice to have TreeCorr output to this format natively.
cf. https://github.com/LSSTDESC/sacc
Should we also do this for the DES 2pt data format?
If there is one other than merely fits catalogs, sure. Just point me to the data format you'd prefer. It would be easy to add it as an option.
I'm not sure how high a priority this should be, since we already output to sacc from the TXPipe pipeline using the APIs of the two codes. But in case it's useful, the sacc API is now stable can be used like this:
s = sacc.Sacc()
data_type = sacc.standard_types.galaxy_shear_xi_plus
# or one of:
# galaxy_shear_xi_minus, galaxy_shear_xi_imagPlus, galaxy_shear_xi_imagMinus
# galaxy_density_xi, galaxy_shearDensity_xi_t, galaxy_shearDensity_xi_x
# for each data point, create a window function object:
window = sacc.TopHatWindow(theta_min, theta_max)
# or if we have a weight function: window = sacc.Window(thetas, weights)
# and add to the data set. tracers_later=True means we don't have n(z) values yet.
s.add_data_point(sacc.standard_types.galaxy_density_xi, ('bin1', 'bin2'), xi_value,
tracers_later=True, theta=mean_theta_in_arcmin, window=window)
once all data points are added, save with
s.save_fits(filename)
Thanks Joe.
So a given SACC file only has one of xi+ or xi-? And no estimate of the variance? I'm a little surprised at that. I would have thought all of these would go together in one file.
Also, the TreeCorr bins are not (normally) tophats in theta space. They are tophats in log(theta) space. (For the typical case of bin_type='Log' anyway.) Is there a corresponding sacc function I should use for that?
Even better, if I understand the meaning correctly, we could even treat the bins as overlapping trapezoids (still in log space) to account for a non-zero bin_slop. Not sure if this is possible in SACC, or even if it would matter for any use cases, but it might be nice to get that right if it's an option.
Hi Mike,
So a given SACC file only has one of xi+ or xi-?
Didn't mean to imply that - you loop through data points, both within a type and across different types.
And no estimate of the variance?
Forgot to include that in my example - added below.
They are tophats in log(theta) space
Thanks for this - it's about two lines to add, so I can make a PR for that today.
treat the bins as overlapping trapezoids
Not 100% sure what you mean by that - the different windows can always overlap. Would this be covered by a general weight function? Or something more than that?
Here's an extended example:
s = sacc.Sacc()
xip = sacc.standard_types.galaxy_shear_xi_plus
xim = sacc.standard_types.galaxy_shear_xi_minus
bin_pairs = [('bin1', 'bin1'), ('bin2', 'bin2')] # etc
v = []
for bin_pair in bin_pairs:
# calculate theta_min, theta_max, xi_plus, xi_minus,
# varxip, varxim for this pair of bins, e.g. for 10 angular bins
...
for i in range(10):
# Window function - I will add a LogTopHat very shortly (trivial change)
window = sacc.TopHatWindow(theta_min[i], theta_max[i])
# add xiplus data point. Data point order is arbitrary, as long as the
# covariance order supplied later matches it. Order is changed to a
# canonical one when the file is saved
s.add_data_point(xip, bin_pair, xi_plus[i],
tracers_later=True, theta=theta_mean[i], window=window)
# build up variances for the end.
v.append(varxip[i])
# same for xi minus
s.add_data_point(xim, bin_pair, xi_minus[i],
tracers_later=True, theta=theta_mean[i], window=window)
v.append(varxim[i])
# sacc deduces from the fact that this is a 1D array that you are supplying the
# diagonal of the covariance matrix
v = np.array(v)
sacc.add_covariance(v)
# once all data points and covariances are added, save with:
s.save_fits(filename)
I've updated the code to add sacc.LogTopHatWindow - version 0.2.3 in pip.