dwave-system icon indicating copy to clipboard operation
dwave-system copied to clipboard

majority_vote introduces undocumented bias in sample set

Open jackraymond opened this issue 3 years ago • 1 comments

Description

dwave.embedding.chain_breaks.majority_vote() documentation error and feature request:

For chains assignments such as [-1 -1 +1 +1] or [0 0 1 1] there is no majority. This function assigns all such chains to a value +1. Current documentation states "for chains without a majority, an arbitrary value" , whilst this is not technically in error the statement should be modified for clarity. A user may otherwise expect this function to follow standard practice in the literature which is to break ties uniformly at random. Breaking ties uniformly at random may have some advantages as a default, or as a feature.

To Reproduce

Expected behavior The correct behaviour can be more clearly documented. "for chains without a majority, the value 1 is assigned" An option to allow either deterministic (as programmed), or uniform random, assignment would be ideal. With both options available, assuming no performance cost, uniform random would be a preferred default.

Environment:

  • OS: [Ubuntu 16.04.4 LTS]
  • Python version: [e.g. 3.7.0]

Additional context Providing an option to break ties randomly may be useful. If chain voted samples are to be used as part of applications that depend on the sampling distribution it is important that a bias between +1 and -1 is not introduced by this type of post-processing, especially since this subroutine forms part of the default path for working with embedding problems. On the other hand, amongst good samples for well embedded problems tie breaks should be a rare occurrence and have a small impact on distributional properties except for pathological models.

jackraymond avatar Mar 02 '21 22:03 jackraymond

❤️

davidmerwin avatar Mar 18 '22 00:03 davidmerwin