Documenter.jl doctest filters have to match both output and expected output

Here is an MWE:

using Documenter

module MWE
"""

```jldoctest; filter = r"#"
julia> 2
2    # 1 + 1
```
"""
function dummy end
end

doctest(nothing, [MWE])

I would expect this to pass, but it doesn't. Doing ENV["JULIA_DEBUG"] = "Documenter" reveals that the filter is not (?) applied:

│ ```jldoctest; filter = r"#"
│ julia> 2
│ 2    # 1 + 1
│ ```
│
│ Subexpression:
│
│ 2
│
│ Evaluated output:
│
│ 2
│
│ Expected output:
│
│ 2    # 1 + 1
│
│ 1 doctest filter was applied:
│
│   r"#"
│
│ Evaluated output (filtered):
│
│ 2    # 1 + 1
│
│ Expected output (filtered):
│
│ 2    # 1 + 1

(note the confusing "Evaluated output (filtered)", fixed by PR #2276)

Sep 21 '23 10:09 fingolfin

For context, this comes from https://github.com/JuliaMath/Primes.jl/pull/140

Sep 21 '23 10:09 fingolfin

OK maybe this is a hint: if I use r"2" then this "works":

┌ Debug: Verifying doctest at REPL[17]
│
│ ```jldoctest; filter = r"2"
│ julia> 2
│ 2    # 1 + 1
│ ```
│
│ Subexpression:
│
│ 2
│
│ Evaluated output:
│
│ 2
│
│ Expected output:
│
│ 2    # 1 + 1
│
│ 1 doctest filter was applied:
│
│   r"2"
│
│ Evaluated output (filtered):
│
│     # 1 + 1
│
│ Expected output (filtered):
│
│     # 1 + 1
└ @ Documenter ~/.julia/packages/Documenter/Meee1/src/doctests.jl:329

But if I use r"1" then it also doesn't work as I'd expect it:

┌ Debug: Verifying doctest at REPL[20]
│
│ ```jldoctest; filter = r"1"
│ julia> 2
│ 2    # 1 + 1
│ ```
│
│ Subexpression:
│
│ 2
│
│ Evaluated output:
│
│ 2
│
│ Expected output:
│
│ 2    # 1 + 1
│
│ 1 doctest filter was applied:
│
│   r"1"
│
│ Evaluated output (filtered):
│
│ 2    # 1 + 1
│
│ Expected output (filtered):
│
│ 2    # 1 + 1
└ @ Documenter ~/.julia/packages/Documenter/Meee1/src/doctests.jl:329

Sep 21 '23 10:09 fingolfin

So, this is non-obvious, but caused by the all here:

https://github.com/JuliaDocs/Documenter.jl/blob/e50da5d0744d6bc57e48cb7fc494d2701638fd9a/src/doctests.jl#L283

Both output and expected output have to match before filtering is applied. I am unsure of the usecase, but it was intentionally added as a bugfix: https://github.com/JuliaDocs/Documenter.jl/pull/588

Minimally, this should be documented though.

Sep 21 '23 11:09 mortenpi

So I was thinking about this a bit. We have 4 cases that can happen when we apply a filter to a doctest:

Both evaluated output and expected output match.
Neither the evaluated output nor expected output match.
Evaluated output matches, but expected output doesn't.
Expected output matches, but evaluated output doesn't.

For (1) we apply the filtering, and everyone is happy. For (2), the filtering would have no effect anyway, so it doesn't really matter if we apply it or not. The question is whether we should also apply the filter in cases (3) and (4). Currently we don't.

Let's suppose we have output like FOO:1234, with the number being random/changing, so we want to filter it with r"[0-9]+".

In case (3), where we would filter when the evaluated output matches the filter, but reference doesn't, you would run into the case where following doctest would pass:

# Expected output:
FOO:
# Evaluated output:
FOO:1234

This is not what the user would expect -- they clearly expect the part after FOO: to be empty. So we should not filter here.

One argument here could be then that, since the expected output doesn't have the pattern anyway, then why are you even filtering this? But this case can happen when you use global doctest filters (e.g. DocTestFilter in at-meta) -- you will be applying filters to doctests where you don't expect them to actually filter anything (because it doesn't exist in the reference). I feel that this is the test case in #588.

Case (4), where expected output matches, but evaluated doesn't, would lead to the opposite doctest passing:

# Expected output:
FOO:1234
# Evaluated output:
FOO:

Here I think the user definitely wants the doctest to fail if the number after FOO: has gone missing, since they clearly expect a number to be there. So we should not filter here either. This is the comments case however

Some closing thoughts:

I think using optional regex matches (i.e. r"([0-9]+)?" should work), although this may indeed have (performance) problems, since you're sorta matching empty strings.
If the optional regex doesn't work, maybe we can mark a filter in a way that it would still apply in case (4)? Opt-in for this seems fine.
Maybe we want some sort of "doctest preprocessing" functionality, where you can run filters on the expected output, before it goes into matching? This seems a bit complex, so I am not sure it's worth it though.

Sep 21 '23 23:09 mortenpi

Presumably this is the same bug, but this surprised me: in the doc markdown files, I have

```jldoctest; filter=[r"1\.9999\d+" => "2.0", r"0\.49999\d+" => "0.5", r"[ -]\d\.\d+e-\d\d" => "0.0", r"0\.9999\d+" => "1.0", r"1\.0000\d+" => "1.0"]
julia> from_points = [[0, 0], [1, 0], [0, 1]];

julia> to_points   = [[1, 1], [3, 1], [1.5, 3]];

julia> AffineMap(from_points => to_points)
AffineMap([2.0 0.5; 0.0 2.0], [1.0, 1.0])
```

and it seems to work:

julia> replace("AffineMap([1.9999999999999996 0.4999999999999999; -5.551115123125783e-16 2.0], [0.9999999999999999, 1.0000000000000002])", [r"1\.9999\d+" => "2.0", r"0\.49999\d+" => "0.5", r"[ -]\d\.\d+e-\d\d" => "0.0", r"0\.9999\d+" => "1.0", r"1\.0000\d+" => "1.0"]...)
"AffineMap([2.0 0.5; 0.0 2.0], [1.0, 1.0])"

and yet

~The insidious thing about that example is that it borks any ability to handle differences among CPUs wrt rounding.~ I was able to fix it by writing the filters as

filter=[r"(2\.0|1\.9999\d+)" => "2.0", r"(0\.5|0\.49999\d+)" => "0.5", r"(0\.0|[ -]\d\.\d+e-\d\d)" => "0.0", r"(1\.0(?!0)|1\.0000\d+|0\.9999\d+)" => "1.0"]

but that's pretty non-obvious.

Jan 08 '25 12:01 timholy

Can't your

FOO:1234

case be handled simply by writing the filter more carefully, e.g., r"FOO:\d+" => "FOO:"?

Jan 08 '25 13:01 timholy

I don't find the reasoning

Here I think the user definitely wants the doctest to fail if the number after FOO: has gone missing, since they clearly expect a number to be there.

for the example

# Expected output:
FOO:1234
# Evaluated output:
FOO:

very compelling. I could just as well argue "clearly the user expected 1234 to occur there, or "the user expected a number starting with 1 there". And what is with this example, which I expect will pass since the regex applies to both outputs?

# Expected output:
FOO:1234:1234
# Evaluated output:
FOO::1234

Overall for global regex filters I am sure it will be frequent that they get applied to things which the author did not have in mind -- I actually think that's a (difficult) design issue with the entire regex filter system. Which is not meant as a complaint, this is after all a difficult problem and I certainly don't have a great alternative in mind. I merely want to point out that this is IMHO a systematic issue.

Anyway, I don't expect to be able to change minds, and it may be too late anyway (perhaps even breaking people's existing tests).

So I hope it can at least be documented at https://documenter.juliadocs.org/stable/man/doctests/#Filtering-Doctests (and perhaps illustrated with an example). (Perhaps I can find some time to write something but I don't want to make promises I can't keep, sorry)

Jan 08 '25 13:01 fingolfin

Personally I'd say this is worth fixing in the long run (Documenter 2.0), but certainly documenting this limitation in the short run makes sense.

Jan 08 '25 13:01 timholy