mathics-core icon indicating copy to clipboard operation
mathics-core copied to clipboard

Memory exhaustion calculating very large integers

Open davidar opened this issue 1 year ago • 7 comments

Is your feature request related to a problem? Please describe. The following program from https://oeis.org/A000643 causes mathics to use over 10GB of RAM, and I end up having to kill it to avoid it from exhausting all the memory on my machine:

Off[General::ovfl];
a={1,0};
Flatten[Prepend[Table[s=Plus@@a;a=2^(RotateLeft[a]);a[[ -1]]=s,{n,8}],
                Table[0,{m,Length[a]-1}]]]

To be fair, the program attempts to calculate an integer with over 5 million digits (inside the list a), which is why I'm not calling this a bug. But at the same time WMA handles this program completely fine, and gives a result almost immediately (it just takes a long time if you try to print the final value of a, but doesn't appear to use any substantial amount of memory)

Describe the solution you'd like Better memory efficiency for arbitrary precision arithmetic.

Describe alternatives you've considered Don't attempt to calculate numbers that are that big ;)

davidar avatar Nov 03 '24 01:11 davidar

It seems that the problem appears when it tries to calculate SystemPower[2, 35184372088832]`

Power[2, 351843] is calculated without problems, but SystemPower[2, 3518430] crashes the evaluation. The problem happens at the level of the Python interpreter: 2**35184372088832 produces the same issue.

mmatera avatar Nov 03 '24 02:11 mmatera

Huh, I wonder where that power is coming from, it should only be calculating $2^{16777496}$, which python can handle fine (but just doesn't want to print by default):

>>> 2**16777496
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Exceeds the limit (4300 digits) for integer string conversion; use sys.set_int_max_str_digits() to increase the limit

The other power is too big for WMA even, but at least it prints an error message instead:

In[4]:= Power[2, 35184372088832]

General::nomem: The current computation was aborted because there was insufficient memory available to
    complete the computation.

Throw::sysexc: Uncaught SystemException returned to top level. Can be caught with Catch[…,
    _SystemException].

Out[4]= SystemException[MemoryAllocationFailure]

davidar avatar Nov 03 '24 03:11 davidar

I found where the evaluation stucks by wrapping the expression with TraceEvaluation.

The collapse of the evaluation does not happen if you use arbitrary precision Real numbers instead of integers. My guess is that WMA switch automatically the representation when it founds a so large number.

mmatera avatar Nov 03 '24 11:11 mmatera

@davidar, if in Python the computation works, and fails in Mathics, it could be also related to our cache/memoize mechanism. Maybe the problem is in trying to build a hash for such a large number. Something to check is to see what happens if we disable it.

mmatera avatar Nov 07 '24 22:11 mmatera

@davidar, if in Python the computation works, and fails in Mathics, it could be also related to our cache/memoize mechanism. Maybe the problem is in trying to build a hash for such a large number. Something to check is to see what happens if we disable it.

Interesting...

rocky avatar Nov 08 '24 10:11 rocky

I looked at this today. Basically, SymPy's Pow function can't handle the exponent it is given: 35184372088832

If you use TraceEvaluation, you'll see that exponent.

Try this from Python:

>>> from sympy.core.power import Pow
>>> sympy.core.power.Pow(2, 35184372088832)
[hangs here]

mpmath can raise to 2 to this power:

>>> from mpmath import power
>>> power(2, 35184372088832)
mpf('9.9528676175816035e+10591551377340')

but if you try to convert it into a MachineReal, this fails:

>>> from mathics.core.convert.mpmath import from_mpmath
>>> from_mpmath(power(2, 35184372088832))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    raise OverflowError
OverflowError
...

I am at a loss for what to do. Recently a change made by @mmatera made breaking the computation possible. Do you know how WMA does what it does?

I suppose we could forward this problem to the mpmath or SymPy community to see how they suggest how to handle?

rocky avatar Nov 17 '24 17:11 rocky

WMA can do its own checks about the size of a problem, and decide to limit the time and memory that the computation used.

When we send the computation to Sympy, we lose the control over the computation.

What I did to allow the use of TimeConstrain was to put that evaluation in a thread, and control each certain time if the evaluation continues. If the evaluation takes more time than the established, the thread is detached and the main evaluation loop continues. However, the Sympy computation continues in the second plane. Eventually, if it finishes after a certain time, the result is just discarded.

The problem is that even if we get disattached from the thread, the thread continues consuming resources, and for cases like the one in the example, eventually raises a core dump.

mmatera avatar Nov 17 '24 18:11 mmatera