Any suggestions for numerical tolerancing?

Open nigels-com opened this issue 7 years ago • 6 comments

We're using cram quite successfully for a collection of command line tools concerned with processing geospatial data. In some cases we match numerical data such as XYZ position, the distance between two locations, or the size of files written. I've started looking around for tools that might help in the case that we'd like to tolerate some relative or absolute variation without failing the test.

http://www.nongnu.org/numdiff/ numdiff needs a reference input file and seems to apply the same criteria to every number in the output.

http://www.math.utah.edu/~beebe/software/ndiff/ ndiff also needs a pair of input files

It seems to me that cram might support something similar to the (re) suffix for the purpose of testing numerical aspects of the matched output. Does anyone have a suitable solution for this kind of situation?

As an example, say we are monitoring a web service. We fetch a page and check HTTP headers, including the size. We know the size will vary, but if it's within some sane range, we ought to consider it a positive test result.

nigels-com avatar Nov 28 '16 09:11 nigels-com

My original thinking on this sort of thing is that if there's an existing command line tool that can do this, that'd be preferable to adding new syntax. That said, you aren't the first person to request this functionality, and as you've pointed out, there might not be a good tool for this that reads well in a Cram test.

I think I might be open to adding a new matching mode for this, but I haven't given it a ton of thought as to how it'd work, or how it would combine with the existing (re)/(glob) matchers. Do you have any ideas as to what kind of syntax might work well for you?

Another route we could go down is adding plugin support to Cram. I've seen this approach work well for Mercurial, which also supports extensions, but I do worry that perhaps allowing the syntax to be customized could lead to Cram tests being hard to decipher between code bases. Though maybe that wouldn't be a big deal if the extension support is purely limited to adding extra matchers. Thoughts?

Yet another thing we could do is actually make a completely separate tool for this sort of thing, totally standalone from Cram, and mention it in Cram's docs. That might be the most Unix-y way of solving this problem.

Anyway, do you have a preferred approach here? Any thoughts on what might work best for you?

aiiie avatar Nov 28 '16 21:11 aiiie

Yes, I'll continue browsing for separate tools that can filter for numbers in text and reduce the number of significant digits, for example. Typically it's either one particular number that we'd like to range check, or a whole table of numbers that we'd need to deal with on a column-by-column basis.
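As an illustration of the "reduce the number of significant digits" idea, here is a minimal Python sketch. It is not an existing tool and not part of cram; the name `round_numbers` and the regular expression are assumptions made for this example.

```python
import re

# Matches decimals, bare fractions, and integers, with an optional
# sign and exponent (for example -1.5e-3). This pattern is an assumption
# of the sketch and may need tuning for real tool output.
NUMBER = re.compile(r"[-+]?(?:\d+\.\d+|\.\d+|\d+)(?:[eE][-+]?\d+)?")


def round_numbers(text, digits=3):
    """Rewrite every number found in text to `digits` significant digits."""
    def repl(match):
        # The 'g' format keeps `digits` significant digits.
        return f"{float(match.group()):.{digits}g}"
    return NUMBER.sub(repl, text)
```

Applied to each line of a command's output before cram compares it, a filter like this hides variation below the chosen precision. Two caveats: values that sit close to a rounding boundary can still round differently from run to run, and the `g` format switches large values to exponent notation.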

nigels-com avatar Nov 28 '16 23:11 nigels-com

You could probably use the test command (look for it in your $PATH if you can't use the shell builtin) with its algebraic equality flags to compare numbers. If you have prose or other text mixed with numbers, you can use regex magic with something like sed or awk to isolate the numbers and pass them into test.

nickmccurdy avatar Nov 29 '16 01:11 nickmccurdy

Using awk / sed usually works, but I now need to add something like this to every line, which isn't pretty:

  mycommand | tr -d '\n' | awk '{print ($1-42.0)^2 < 0.01}'

When dealing with many test cases it would be great if I could use some macro type syntax for comparing the output with a custom function.
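One way to avoid repeating the awk expression is to move the comparison into a small reusable helper. The following is a sketch only: the function name `within` is hypothetical, and it assumes the value of interest is the first whitespace-separated token of the output, as `$1` is in the awk version.

```python
import math


def within(text, expected, abs_tol):
    """Return True if the first token of text is within abs_tol of expected."""
    value = float(text.split()[0])
    # rel_tol=0.0 makes this a purely absolute comparison.
    return math.isclose(value, expected, rel_tol=0.0, abs_tol=abs_tol)
```

The awk test `($1-42.0)^2 < 0.01` corresponds roughly to `within(output, 42.0, 0.1)`, apart from the boundary case where the difference is exactly the tolerance. Wrapped in a script that reads standard input, the helper could be called from each test line with only the expected value and the tolerance as arguments.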

r-chris avatar Nov 29 '16 01:11 r-chris

I can see some cram support for testing small amounts of numerical output mixed with text being very useful, as it allows the expected program output to exist as reference data in the test script. This is in contrast to piping the output through a regex and then into a separate program like numdiff. In the same way that (re) works, this would allow for fast debugging of the error output as it immediately focuses attention on the line containing the error.

For testing large numeric tables, I think a tool like numdiff is the right thing for the job: a failure in the middle of a 1000-line table of reference numeric data is unlikely to be meaningful in the cram output, and the user will need to dig deeper in any case. In general, numeric comparisons are truly tricky and require arbitrary computation which must be handed off to an external tool. For example, if you're comparing a matrix, which matrix norm are you using?
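For the table case, an external check with a separate tolerance per column might look like the Python sketch below. It is only an illustration, not how numdiff works; the name `compare_tables` and the whitespace-separated table layout are assumptions.

```python
import math


def compare_tables(actual, expected, abs_tols):
    """Return (row, column) index pairs whose values differ by more
    than the absolute tolerance given for that column."""
    failures = []
    rows = zip(actual.splitlines(), expected.splitlines())
    for r, (a_line, e_line) in enumerate(rows):
        # zip() stops at the shorter input, so a real tool should also
        # compare row and column counts; that check is omitted here.
        for c, (a, e) in enumerate(zip(a_line.split(), e_line.split())):
            if not math.isclose(float(a), float(e),
                                rel_tol=0.0, abs_tol=abs_tols[c]):
                failures.append((r, c))
    return failures
```

Reporting only the failing positions keeps the test output short even when the reference table is large.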

Anyway, for testing a small amount of mixed text and numeric data a cram builtin seems really useful. Could extra validation be somehow applied to the regex match groups? For example:

  $ echo 1.2345
  ((\d+\.\d+)) (re isapprox($1, 1.23))

The idea here is that the (re) is extended by a predicate into which we interpolate the match groups ($1 here).
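To make the proposal concrete, here is a rough Python sketch of what such a predicate check could do internally. This is hypothetical: cram has no such matcher, the name `match_approx` and its default tolerance are assumptions, and Python's `math.isclose` stands in for the `isapprox` in the example.

```python
import math
import re


def match_approx(pattern, line, expected, rel_tol=1e-2):
    """Return True if line fully matches pattern and match group 1,
    parsed as a float, is close to expected."""
    m = re.fullmatch(pattern, line)
    if m is None:
        # The text itself did not match, so there is nothing to compare.
        return False
    return math.isclose(float(m.group(1)), expected, rel_tol=rel_tol)
```

With a relative tolerance of 1%, the output `1.2345` would be accepted against the expected value `1.23`. A failure can come either from the text not matching or from the number being out of tolerance, and a real matcher would probably want to report which of the two happened.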

c42f avatar Nov 29 '16 02:11 c42f

By the way, Julia's isapprox is a pretty good model for flexible numeric comparisons; see http://docs.julialang.org/en/release-0.5/stdlib/math/#Base.isapprox.

c42f avatar Nov 29 '16 02:11 c42f