
Pathological performance of decimals() with very large exponents or precision

Open Zac-HD opened this issue 8 years ago • 3 comments

After merging #789, which fixed several bugs affecting previous uses of the decimals() strategy, I started poking around at the behaviour for very large exponents.

  • Decimals with very large exponents (up to 10**18 - 1) are perfectly valid inputs to the decimals() strategy... but casting the bounds to integers or Fractions will consume a great deal of memory and hang for a very long time, and even if that works, the Conjecture buffer is then too small to produce an example.

  • [ ] We can take advantage of the floating-point nature of Decimals: cancel out exponents of the bounding values before we construct the underlying strategy, and adjust the drawn value accordingly. This is fiddly but doable, if we are careful about the precision context and places argument.

  • The decimal precision context can be set so high that operations simply never complete - as in, would require 10**18 - 1 digits. Try Context(prec=MAX_PREC).divide(1, 3) as an example, but be prepared to kill the process 😉

  • [ ] I think it is reasonable to limit the precision of the bounding values to 10**4 digits each. If more is desired, the error can direct users to the fractions() strategy for truly unlimited-precision arithmetic. Internally, we can then clip precision to at most 10**4 digits without risking bounds errors, if we also choose a rounding mode that can't take us outside of the bounds (also fiddly but possible).

Motivating example: decimals('10E9999999999999998', '10E9999999999999999').example() hangs forever.

  • [ ] When casting string arguments to decimal, we should use a context that traps InvalidOperation and reraises it as InvalidArgument with a useful message. (See #814 for discussion)
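A rough sketch of the exponent-cancelling and error-trapping items above, using only the standard library. The helper names (parse_bound, cancel_exponents, restore_exponent) are hypothetical and not Hypothesis internals, ValueError stands in for InvalidArgument, and the sketch assumes finite bounds whose exponents are reasonably close together. It ignores the precision context and places argument entirely.

```python
from decimal import Decimal, InvalidOperation, localcontext


def parse_bound(value):
    """Convert a bound to Decimal, failing loudly on malformed input."""
    with localcontext() as ctx:
        # Trap InvalidOperation even if the caller's context has it disabled,
        # so a malformed string raises instead of quietly becoming NaN.
        ctx.traps[InvalidOperation] = True
        try:
            return Decimal(value)
        except InvalidOperation:
            # Stand-in for hypothesis.errors.InvalidArgument (assumption).
            raise ValueError(f"cannot interpret {value!r} as a Decimal") from None


def _rescale(d, delta):
    """Return d with its exponent moved by delta; exact, no context rounding."""
    sign, digits, exponent = d.as_tuple()
    return Decimal((sign, digits, exponent + delta))


def cancel_exponents(lo, hi):
    """Shift both finite bounds by their smaller exponent.

    Returns (shifted_lo, shifted_hi, shift). NaN and infinities are not
    handled: their as_tuple() exponent is a string, not an int.
    """
    shift = min(lo.as_tuple().exponent, hi.as_tuple().exponent)
    return _rescale(lo, -shift), _rescale(hi, -shift), shift


def restore_exponent(drawn, shift):
    """Undo cancel_exponents on a value drawn between the shifted bounds."""
    return _rescale(drawn, shift)


# Smaller exponents than the motivating example, same shape of problem.
lo, hi = parse_bound("10E9998"), parse_bound("10E9999")
small_lo, small_hi, shift = cancel_exponents(lo, hi)

# The shifted bounds are cheap to cast to int or Fraction.
drawn = Decimal(42)  # pretend the underlying strategy drew this
value = restore_exponent(drawn, shift)
```

If the two bounds have wildly different exponents, the shifted upper bound is still enormous, so this only helps when the exponents are close.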

Zac-HD, Sep 07 '17 01:09

This also includes decimals with very small exponents. On my machine,

from decimal import Decimal
from hypothesis.strategies import decimals

decimals(max_value=Decimal(0).next_minus()).example()

takes minutes (on Python 3.5.2, Hypothesis 3.25.0) and returns a value with (literally) a million decimal places. (I had expected getcontext().prec of them.)

(I am trying to generate decimals from half-open intervals and thought that this way I might get around filtering the bounds.)

lmshk, Sep 12 '17 13:09

Yeesh, that's the bug alright 😞. Thanks for the report and I'm sorry you ran into it!

The immediate cause is that Decimal(0).next_minus() gives Decimal('-1E-1000026'), which certainly surprised me. FWIW I think of an exponent of 'negative one-and-a-bit million' as a large but negative number, and the correct behaviour (adjust the exponent, take advantage of the fact that it's a floating-point format) is the same as above.

As a workaround, I suggest picking a small bound and using that, e.g. Decimal(10) ** -30, optionally with the places= argument if you have a view on that.
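To make the difference concrete, here is a small standard-library check. It spells out the default context values (prec=28, Emin=-999999) explicitly, which is an assumption about the environment rather than something Hypothesis sets, and compares the exponent of next_minus() with the exponent of the suggested bound.

```python
from decimal import Context, Decimal

# Python's default context values, written out so the result does not
# depend on whatever context happens to be active.
ctx = Context(prec=28, Emin=-999999, Emax=999999)

# The largest representable value below zero sits at the subnormal limit,
# Etiny = Emin - prec + 1.
tiny = Decimal(0).next_minus(context=ctx)
print(tiny)  # -1E-1000026 under these settings
print(ctx.Etiny())

# The suggested workaround: a small but tame negative upper bound.
bound = -(Decimal(10) ** -30)
print(tiny.adjusted(), bound.adjusted())
```

The bound would then be passed as decimals(max_value=bound), optionally with places=; that call is not run here.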

Zac-HD, Sep 12 '17 13:09

Filtering out the bounds is also a perfectly reasonable thing to do, I think: filter is only really problematic if the rejected event is common, and hitting the exact bounds shouldn't be.

DRMacIver, Sep 12 '17 13:09

...since nobody seems to have needed this in the, uh, six-and-a-half years since I opened the issue, I'm going to close it again 😅

Zac-HD, Mar 12 '24 08:03