Transition tolerance specification to log domain
Describe the bug
The range of positive JavaScript number values limits the tolerance parameters for equality testing to roughly 1e-320, whereas when computing with high-precision BigNumbers, a smaller tolerance would be useful. For example, the available digits for some transcendental functions are limited by the tolerance rather than precision, since the iterative algorithms can no longer function once the incremental changes are within the equality tolerance.
To Reproduce
math.config({relTol: 1e-500}) -- oops, results in relative tolerance 0, which is a valid setting but a very different behavior than an actual relative tolerance of 1e-500.
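The underflow can be seen without mathjs at all, since it happens when the literal is parsed. A minimal illustration in plain JavaScript (the variable name is only for this example):

```javascript
// The literal underflows before any library code ever sees it.
const requested = 1e-500
console.log(requested === 0)   // true: below the smallest positive double
console.log(Number.MIN_VALUE)  // 5e-324, the smallest positive number value
console.log(1e-320 > 0)        // true: still representable, as a subnormal
```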
Discussion
The recommended resolution, which has been discussed before although I can't find where, is to transition to two new configuration parameters, which would simply give the digits of tolerance (i.e. be -log_10 of the current values). To avoid breaking change, setting relTol and absTol could still work, with a deprecation warning, much like epsilon, which should then be retired altogether in the next breaking change.
All that's really needed to move forward is a choice of the new parameter names:
- absoluteDigits and relativeDigits?
- absTolDigits and relTolDigits?
- logAbsTol and logRelTol?
None of these are super euphonious. @josdejong should just pick something, one of these or something better he thinks of.
Some more ideas:
- We could also extend the existing parameters with a new notation, like supporting both { relTol: 1e-320 } and something like { relTol: {digits: 320} }.
- We could add support for BigNumber, like { relTol: math.bignumber('1e-320') }.
It occurred to me that it's also worth thinking about the use side of the parameter. It would be best if the code that needs to respect the tolerances can directly obtain the information necessary for the calculations, without having to repetitively perform some sort of translation or taking of cases, etc. After all, a tolerance is typically used many times more often than it is set. So any sort of canonicalization should occur at setting time, rather than using time.
To me the logical consequences of this principle are:
- Whatever config parameter will be read in the using code is the most semantically transparent for clients of the library to set. We can certainly provide convenience and/or backwards-compatibility setting options, but I would recommend also making it possible to set the tolerances directly as used.
- The config parameter for use in code should be an integer plain JavaScript number, call it T, with the corresponding tolerance being $10^{-T}$. Such a value is uniform in type, and reasonably easy to use for any data type (basically take the exponent of the small value and compare it to the parameter), and covers all possible use cases because the max integer number is way larger than any practical number of digits of tolerance.
- Since that is a different semantics than the current parameter for use in code, it should have a different name, just for safety/clarity in transition.
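To make the "compare the exponent to the parameter" idea concrete, here is a rough sketch for plain numbers. The helper `withinAbsDigits` is hypothetical and not part of mathjs, and it uses `Math.log10` for brevity where a real implementation might read the exponent more cheaply:

```javascript
// Hypothetical helper, not part of mathjs: absolute-tolerance test where the
// tolerance is 10^(-digits) and `digits` is a plain integer.
function withinAbsDigits (a, b, digits) {
  if (a === b) return true
  const diff = Math.abs(a - b)
  if (!Number.isFinite(diff)) return false // NaN or infinite difference
  // Compare exponents instead of materializing 10^(-digits), which would
  // underflow to 0 once digits goes beyond the range of number values.
  return Math.log10(diff) <= -digits
}

console.log(withinAbsDigits(1, 1 + 1e-13, 12)) // true
console.log(withinAbsDigits(1, 1 + 1e-10, 12)) // false
console.log(withinAbsDigits(1, 1 + Number.EPSILON, 500)) // false
```

Note that a setting like 500 is handled without special cases here: for plain numbers it simply means that only identical values compare equal.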
Therefore, I'd strengthen my opinion to a recommendation that we just go with a pair of new names, and translate the old names/values to the new ones as a backwards-compatibility measure. If you agree, then please just pick the new names. If I have not persuaded you and you would like to keep the old names but with different values, please just specify which of the new notations you would like, and what should be set for the sake of the using code, and we can make it so :). Thanks!
(In case it wasn't clear, to me the above considerations weigh against allowing the tolerances to be a BigNumber, because then number comparisons, which will inevitably happen even when BigNumbers are primarily in use, will be slowed by the need to convert and/or compare BigNumbers to numbers.)
@nycos62 asks in #3532:
just a remark but why relTol and absTol are not computed directly from precision when you specify a precision ?
{ relTol: { digits: precision-4} }
{ absTol: { digits: precision-1} }
I think the main reason is that config.precision affects only BigNumbers whereas relTol and absTol affect all computations. So note, for example, when one is operating with BigNumbers at high precision and also sets relTol to a value over 17, then two ordinary numbers will generally only be considered equal when they have exactly the same bit representation, ~~and two Fractions will only be considered equal when the denominator of the difference has more than that many digits,~~ etc. CORRECTION: Fractions ignore tolerance and are only considered equal when they are precisely equal. Is that what we want? If so, it's not consistent with the top line documentation for equal -- it says "tests whether two scalar values are nearly equal". So also as part of acting on this issue, we should either have Fraction respect tolerance, or we should change the top documentation of equal to say "tests whether two scalar values are equal, or nearly equal in the case of floating-point representations of numbers." I think the rationale is that Fraction computations can in most cases reasonably be considered exact, so therefore comparisons should also be exact, and so we should just change the documentation to match the actual behavior.
And so I will also raise again here the suggestion that has been made elsewhere: each numeric type should have its own independent tolerances, which would default to some global tolerance. In that world, setting precision should quite reasonably default to setting BigNumber tolerances if they are not otherwise set.
I am not necessarily advocating for or against per-type tolerances; they clearly have some advantages, but they also significantly complicate the configuration system. But likely we should decide on the question of per-type tolerances, at least for the time being, before acting on this issue.
Good discussion, thanks.
- One more idea (maybe a bad idea): defining a tolerance as an integer with just the exponent like 320 results in numeric values far out of range of the current configuration like 1e-320. So we could keep using the same configuration names (relTol and absTol), and just check if the value is an integer. If not, assume that it is the old notation like 1e-320 and extract the exponent.
- "It would be best if the code that needs to respect the tolerances can directly obtain the information necessary for the calculations, without having to repetitively perform some sort of translation or taking of cases"

  Yes, totally agree! Looking at the usages of config.relTol and config.absTol, I have the feeling that the current definition matches how the values are used throughout the codebase. Will specifying the tolerances as just their exponent (like 320 instead of 1e-320) require more conversions before they can be used? If needed we could separate the way we publicly configure values for abs/rel tolerance from some internal representation, but that may be counterintuitive.
- Agree, Fraction should respect tolerances too.
- Indeed, the different data types should ideally have their own configuration, so you can configure a precision of about 12 digits for numbers and say 60 digits for BigNumbers. I think we discussed that idea somewhere already.
- All in all, I think we could cater for some flexible notation that allows, for example, configuring tolerances and precision per data type, but also a convenience notation that allows setting one tolerance for all data types if that is what you need. And maybe a way to configure tolerances relative to the configured precision.
- I am not against having absTol and relTol be integers but for backwards compatibility allow them to be specified as values between 0 and 1 in which case at setting time the -log10 is taken. If you want to choose that convention, let's go with it.
- I don't think integers are bad to use. It's easy to extract the integer exponent part from numbers and bignumbers (much cheaper than a log) and then you just compare -- you don't care about the mantissa, since we are only allowing tolerances in powers of 10. Well oops the exponent in numbers is base 2, not 10, but we have a log2 constant to convert. The natural tolerances for number would be base 2 but I don't think it's worth complicating the system that much to have different bases for different types. For fractions you will need to raise 10 to the specified power.
- I agree for consistency it should be possible to have tolerances for fractions, but I do think typically one wants zero tolerance since rational-number computation is inherently exact. So I think the only alternative if we go down that path is per-type tolerances.
- Ok time to bite that bullet.
- Will think on what would be a convenient notation.
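The backwards-compatible convention from the first bullet above (integers are digit counts, values between 0 and 1 are legacy tolerances converted once at setting time) could look roughly like this. The function name `toToleranceDigits` is made up for illustration and is not an existing mathjs function:

```javascript
// Hypothetical setter-side normalization; the name is illustrative only.
function toToleranceDigits (value) {
  if (typeof value !== 'number' || Number.isNaN(value)) {
    throw new TypeError('tolerance must be a number')
  }
  if (value > 0 && value < 1) {
    // Legacy notation such as 1e-12: take -log10 once, at setting time.
    return Math.round(-Math.log10(value))
  }
  if (Number.isInteger(value)) {
    return value // new notation: already a digit count
  }
  throw new RangeError('tolerance must be an integer or a value between 0 and 1')
}

console.log(toToleranceDigits(1e-12)) // 12
console.log(toToleranceDigits(320))   // 320
```

Two things this sketch glosses over: a legacy value that is not a power of ten loses its mantissa in the rounding, and a legacy setting of exactly 0 is read here as a digit count of 0, so that case would need its own decision.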
Ok notation proposal: math.config({'number.relTol': 15, bignumber: {absTol: 49, precision: 50}}). In other words, you can either put a dot in the key, or use a subobject as the value with the type name as key. Use type name any to set for all types. And for backwards compatibility, top-level "precision" sets "bignumber.precision" and top-level "relTol" and "absTol" set all types except fraction and bigint, since you likely want those exact. Note this system should allow e.g. 'bigint.absTol': -6 to only compare bigints to within a million.
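As a rough sketch of how the backwards-compatibility rules in this proposal might expand into per-type settings: the type list, the function name `expandConfig`, and the order of precedence are all assumptions made for illustration, and only the nested (subobject) form is handled, not dotted keys.

```javascript
// Illustrative only: the type list and function name are assumptions.
const ALL_TYPES = ['number', 'bignumber', 'fraction', 'bigint']
const EXACT_BY_DEFAULT = new Set(['fraction', 'bigint'])

function expandConfig (options) {
  const result = {}
  for (const type of ALL_TYPES) result[type] = {}

  // Legacy top-level tolerances skip the types that are usually exact.
  for (const key of ['relTol', 'absTol']) {
    if (key in options) {
      for (const type of ALL_TYPES) {
        if (!EXACT_BY_DEFAULT.has(type)) result[type][key] = options[key]
      }
    }
  }
  // Legacy top-level precision only concerns BigNumber.
  if ('precision' in options) result.bignumber.precision = options.precision

  // `any` applies to every type; explicit per-type settings win over it.
  for (const type of ALL_TYPES) {
    Object.assign(result[type], options.any, options[type])
  }
  return result
}

const expanded = expandConfig({
  relTol: 12,
  precision: 50,
  bignumber: { absTol: 49 },
  bigint: { absTol: -6 }
})
console.log(expanded.number.relTol)   // 12
console.log(expanded.fraction.relTol) // undefined: fractions stay exact
console.log(expanded.bigint.absTol)   // -6
```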
Thoughts?
Oh, as far as setting tolerances relative to precision, we could I suppose make setting top-level "precision" a convenience for setting bignumber precision, with bignumber relTol set to a few less and bignumber absTol to one less. Not sure if it's worth it, and it would make this a breaking change, whereas the scheme immediately above is actually non-breaking. So I would recommend just being explicit, personally.
- Yeah I like the idea of re-using the existing absTol and relTol. We can keep it backward compatible and log a deprecation warning.
- I expect both approaches are straightforward, just a little different. I don't expect huge differences in performance or anything, so I think we can just use numbers rather than -log10 and see how it goes.
- :thumbsup:
- :sweat_smile:
(5) About the notation:
I'm personally not a fan of supporting {'number.relTol': 15} as a convenience for {number: {relTol: 15}}. To me the latter is more straightforward, unambiguous (it prevents cases like {'number.relTol': 15, number: {relTol: 12}}), easier to extend (for example if you later want to specify a nested absTol too), and also easier to implement (no need for analyzing and splitting keys into nested objects).
It makes sense to specify the data type as root key. We need to think, though, about how to deal with the existing number config (which can be 'number', 'BigNumber', 'bigint', or 'Fraction'). Either we change that config too in some way, or we could put all precision and tolerance related config under a single config. I'm not sure if it is needed to rethink the existing precision option, which only applies to BigNumbers. Also, I think it makes sense to keep the casing of the name "BigNumber" consistent with the value it has in the config option number. Some ideas:
// Idea 1
// apply to all data types
const config1 = {
number: 'BigNumber',
precision: 12,
relTol: 15,
absTol: 12
}
// apply different tolerances per data type
const config3 = {
number: 'BigNumber',
precision: 64,
relTol: { number: 12, BigNumber: 64 },
absTol: { number: 15, BigNumber: 62 }
}
Other approach:
// Idea 2
// apply to all data types
const config2 = {
number: 'BigNumber',
precision: 64,
tol: {
rel: 64,
abs: 62
}
}
// apply different tolerances per data type
const config4 = {
number: 'BigNumber',
precision: 64,
tol: {
number: { rel: 15, abs: 12 },
BigNumber: { rel: 64, abs: 62 }
}
}
The "Idea 1" example looks a bit odd; it feels better to me to group relTol and absTol together per data type, as in Idea 2 and your proposal. On the other hand, if we do not expect more items to be added, it may be acceptable. It is quite close to the existing config, which may make it easier to implement.
In case of Idea 2, we could move precision inside the nested tol.BigNumber config. I'm not sure though if that would help.
For my part I am not a fan of triple nesting in Idea 2, but I definitely prefer the top key to be the type, with the attributes (relTol, absTol, and for BigNumber, precision) as the inner keys. The only impediment is number being taken. Possible ways around this:
- I think we have already discussed moving config.number to config.parse.number (see #3420). That would make a lot of sense to me, since it is essentially all about parsing. And it would free up config.number to be an object of configuration options for the number type. But it would be a very noticeable breaking change.
- Just punt to a slightly inconsistent set of keys: {BigNumber: {relTol: 48}, Number: {absTol: 14}} even though the official mathjs name of the type for JavaScript numbers is number, not Number. Or instead of Number, numberType.
- When number is set to a string, interpret it as currently and issue a warning. When set to an object, interpret absTol and relTol keys in that object as discussed. And just like the BigNumber configuration subobject will have an extra key precision to control the precision used, the number configuration object will have extra keys readType and readFallback or parseType and parseFallback or something like that, controlling the behavior when strings are read in as numeric values.
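For the third option, the string-or-object handling might be sketched as follows. The function name is hypothetical, and `parseType` is just one of the candidate key names floated above, not a settled API:

```javascript
// Hypothetical normalization for the third option. `parseType` is only one
// of the candidate key names mentioned above.
function normalizeNumberOption (value, warn = console.warn) {
  if (typeof value === 'string') {
    // Legacy form: keep the current meaning, but tell the user about it.
    warn('config.number given as a string; treating it as number.parseType')
    return { parseType: value }
  }
  if (value !== null && typeof value === 'object') {
    return { ...value } // new form: per-type options for plain numbers
  }
  throw new TypeError('config.number must be a string or an object')
}

console.log(normalizeNumberOption('BigNumber').parseType) // 'BigNumber'
console.log(normalizeNumberOption({ relTol: 15, absTol: 12 }).relTol) // 15
```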
(You can also see my strong desire to move precision into BigNumber as I think that will remove significant possible confusion that somehow precision controls mathjs as a whole, rather than just computations with BigNumber.)