
zero-size array error when including magnitudes

Open jerrybonnell opened this issue 3 years ago • 10 comments

Greetings,

We are trying to use nway to match some catalogs in a project we are working on. We would like to include magnitudes during the match, but nway raises a "ValueError: zero-size array" exception, which is cryptic and gives no hint about what might be wrong with the input catalogs.

Following is the Python incantation we are using and the resulting output trace. We are also attaching the source files here so that the example is reproducible. In our exploration, the problem persists in cases where we can confirm counterparts in the secondary catalog and values in the magnitude columns (i.e., we are not sending in an empty column full of missing values).

We would appreciate any guidance the team can provide in helping us understand where we are going wrong.

nway_test.zip

$ python3 nway.py d14.fits 0.2 d25.fits 0.5 --out=tmp1.fits --radius 5 --mag D14:V_mag_nwayTemporary auto

NWAY arguments:
    catalogues:  d14.fits, d25.fits
    position errors/columns:  0.2, 0.5
      from catalogue "D14" (1591), density gives 2.19e+03 on entire sky
      from catalogue "D25" (16), density gives 8.25e+01 on entire sky
    magnitude columns:  D14:V_mag_nwayTemporary

matching with 5.000000 arcsec radius
matching:  25456 naive possibilities
matching: hashing
    using RA  columns: RA, RA
    using DEC columns: DEC, DEC
merging in 275 columns from input catalogues ...
100%|████████████████████████████████████████████████████| 275/275 [00:00<00:00, 706.38it/s]
    adding angular separation columns
matching:   1591 matches after filtering by search radius

Computing distance-based probabilities ...
  finding position error columns ...
    Position error for "D14": using fixed value 0.200000
    Position error for "D25": using fixed value 0.500000
  finding position columns ...
  building primary_id index ...
  computing probabilities ...
      correcting for unrelated associations ... not necessary

Incorporating magnitude biases ...
    magnitude bias "D14:V_mag_nwayTemporary" ...
    magnitude histogram of column "D14_V_mag_nwayTemporary": 1591 secure matches, 1591 insecure matches and 0 secure non-matches of 1591 total entries (1591 valid)
Traceback (most recent call last):
  File "../nway-master/nway.py", line 494, in <module>
    bins, hist_sel, hist_all = magnitudeweights.adaptive_histograms(mag_all[mask_others], mag_sel[mask_sel], weights=mag_sel_weights[mask_sel])
  File "/home/jbonnell/nway-master/nwaylib/magnitudeweights.py", line 100, in adaptive_histograms
    lo, hi = numpy.nanmin(mag_all), numpy.nanmax(mag_all)
  File "<__array_function__ internals>", line 6, in nanmin
  File "/home/jbonnell/.pyenv/versions/3.7.3/lib/python3.7/site-packages/numpy/lib/nanfunctions.py", line 319, in nanmin
    res = np.fmin.reduce(a, axis=axis, out=out, **kwargs)
ValueError: zero-size array to reduction operation fmin which has no identity

@alessandropeca @hscshane

jerrybonnell avatar Jun 01 '21 18:06 jerrybonnell

Your issue is here:

1591 secure matches, 1591 insecure matches and 0 secure non-matches

It did not find any secure non-matches that it could use for building a histogram of "field" sources.

I agree, the error message could be better.

JohannesBuchner avatar Jun 01 '21 18:06 JohannesBuchner
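
For anyone hitting the same traceback, here is a minimal sketch of why the call fails. It is not nway's actual code: the `safe_range` helper and the sample magnitudes are hypothetical and only illustrate the mechanism. When the mask for secure non-matches selects nothing, `numpy.nanmin` receives an empty array and raises the error shown above; a size check beforehand would allow a clearer message.

```python
import numpy as np

def safe_range(values, label="magnitude sample"):
    """Return (min, max) ignoring NaNs, with a readable error if the sample is empty.

    Hypothetical helper for illustration; nway does not provide it.
    """
    values = np.asarray(values, dtype=float)
    if values.size == 0:
        raise ValueError(f"{label} is empty: no sources passed the selection")
    return np.nanmin(values), np.nanmax(values)

mags = np.array([18.2, 19.5, 20.1])
no_nonmatches = np.zeros(len(mags), dtype=bool)  # a mask that selects nothing

# Same call pattern as in the traceback: nanmin on an empty selection.
try:
    np.nanmin(mags[no_nonmatches])
except ValueError as err:
    raw_message = str(err)

# The guarded version fails too, but names the empty sample.
try:
    safe_range(mags[no_nonmatches], label="secure non-matches")
except ValueError as err:
    clear_message = str(err)

print(raw_message)
print(clear_message)
```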

Okay, I had the same issue and this makes sense, but I am still confused about what this means for including magnitude priors. Does this mean that, based on our nway output before including the priors, we do not need to do anything else? Is our current output fine?

b291c571 avatar Sep 20 '21 18:09 b291c571

Everything is fine with the way it is run. However, when running the match with positions only, you will see that none or very few of the matches are secure. If you want to use automatically built prior distributions, you need some secure matches to build that histogram.

Alternatively, you can still provide a prior file manually, or produced from a different run (this is described in the manual).

JohannesBuchner avatar Sep 20 '21 18:09 JohannesBuchner

How do we distinguish secure matches?

b291c571 avatar Sep 20 '21 19:09 b291c571

It is set by the --mag-auto-minprob parameter and defaults to 0.9. This threshold is applied to the dist_post output column.

JohannesBuchner avatar Sep 20 '21 19:09 JohannesBuchner
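
As a rough sketch of what such a threshold does (not nway's implementation): the 0.9 default and the `dist_post` column come from the reply above, while the `split_by_dist_post` function, the strict inequalities, and the `1 - minprob` cut for secure non-matches are assumptions made for illustration.

```python
import numpy as np

def split_by_dist_post(dist_post, minprob=0.9):
    """Split candidates into secure matches, secure non-matches and ambiguous ones.

    Hypothetical helper. The cut for non-matches (dist_post < 1 - minprob)
    is an assumption, not taken from nway.
    """
    dist_post = np.asarray(dist_post, dtype=float)
    secure_match = dist_post > minprob
    secure_nonmatch = dist_post < 1.0 - minprob  # assumed complementary cut
    ambiguous = ~(secure_match | secure_nonmatch)
    return secure_match, secure_nonmatch, ambiguous

dist_post = np.array([0.99, 0.95, 0.60, 0.45, 0.02])
match, nonmatch, ambiguous = split_by_dist_post(dist_post)

print("secure matches:    ", int(match.sum()))
print("secure non-matches:", int(nonmatch.sum()))
print("ambiguous:         ", int(ambiguous.sum()))
```

If either secure group comes out empty, there is nothing to build the corresponding histogram from, which is the situation in the original report.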

Hi, since the matching results are given as probabilities, why do we have to distinguish secure matches and non-matches? Could the probabilities be used as weights of candidates to construct the prior histograms?

sfzastro avatar Feb 25 '22 02:02 sfzastro

Yes, it would be possible to create histograms with the probabilities and 1-prob as weights.

However, objects that are unclear (probability ~ 0.5) would enter both histograms. They are usually the most numerous, so they would dilute the signal-to-noise and thus the distinguishing power when applying the histogram ratios.

JohannesBuchner avatar Feb 25 '22 08:02 JohannesBuchner
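
To make the dilution concrete, here is a small sketch of the weighted-histogram idea using `numpy.histogram` with `weights`. The function name, the magnitudes, and the bin edges are made up for illustration; this is not how nway's auto mode builds its histograms.

```python
import numpy as np

def weighted_histograms(mag, prob, bins):
    """Histogram magnitudes twice: weighted by prob and by 1 - prob.

    Hypothetical helper illustrating the suggestion in the thread.
    """
    mag = np.asarray(mag, dtype=float)
    prob = np.asarray(prob, dtype=float)
    hist_match, edges = np.histogram(mag, bins=bins, weights=prob)
    hist_field, _ = np.histogram(mag, bins=edges, weights=1.0 - prob)
    return edges, hist_match, hist_field

mag = np.array([18.0, 18.5, 21.0, 21.5, 19.8, 20.2])
prob = np.array([0.98, 0.95, 0.03, 0.05, 0.50, 0.50])  # last two are ambiguous
bins = np.array([17.0, 19.0, 20.0, 21.0, 22.0])

edges, hist_match, hist_field = weighted_histograms(mag, prob, bins)
print("match-weighted:", hist_match)
print("field-weighted:", hist_field)
```

The two ambiguous objects add the same weight to both histograms in the middle bins, so the ratio there carries no information; with many such objects the contrast in the other bins is washed out as well.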

Do you think this can be put in an iterative scheme? It could start from the distance-based results and refine the histograms in each iteration. Eventually the input prior would be identical to the posterior-weighted histograms, and the results would converge in a self-consistent manner.

sfzastro avatar Feb 25 '22 17:02 sfzastro

If you have specific demands, you can run a position-based match and use the NWAY outputs to create a histogram, which you can then feed into another NWAY run.

Some people prefer building the magnitude prior from different data, because it should be prior information to the data at hand. An iterative scheme may seem unconvincing to them.

JohannesBuchner avatar Feb 26 '22 09:02 JohannesBuchner
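
A minimal sketch of that workflow, assuming the magnitudes and `dist_post` values from a position-only run are already loaded as arrays. The `build_prior_table` function, the choice of treating every non-selected source as "field", the bin edges, and the four-column layout (lower edge, upper edge, selected density, others density) are all assumptions; check the manual for the file format nway actually expects.

```python
import numpy as np

def build_prior_table(mag, dist_post, bins, minprob=0.9):
    """Build normalised magnitude histograms for selected and other sources.

    Hypothetical helper. Returns rows of (lo, hi, selected, others);
    this layout is an assumption, not nway's documented format.
    """
    mag = np.asarray(mag, dtype=float)
    dist_post = np.asarray(dist_post, dtype=float)
    valid = np.isfinite(mag)
    selected = valid & (dist_post > minprob)
    others = valid & ~selected  # assumption: the rest serves as the field sample
    if not selected.any() or not others.any():
        raise ValueError("need at least one selected and one other source")
    hist_sel, edges = np.histogram(mag[selected], bins=bins, density=True)
    hist_others, _ = np.histogram(mag[others], bins=edges, density=True)
    return np.column_stack([edges[:-1], edges[1:], hist_sel, hist_others])

mag = np.array([18.0, 18.4, 18.9, 20.5, 21.0, 21.6, np.nan])
dist_post = np.array([0.97, 0.93, 0.95, 0.20, 0.05, 0.10, 0.99])
bins = np.array([17.0, 19.0, 20.0, 22.0])

table = build_prior_table(mag, dist_post, bins)
print(table)
```

The resulting array can be written out with `numpy.savetxt` once the expected column layout has been confirmed.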

> If you have specific demands, you can run a position-based match and use the NWAY outputs to create a histogram, which you can then feed into another NWAY run.

Thanks for the reply. I will try this.

> Some people prefer building the magnitude prior from different data, because it should be prior information to the data at hand. An iterative scheme may seem unconvincing to them.

I only meant the auto method, not that with a user-supplied prior.

sfzastro avatar Feb 26 '22 19:02 sfzastro