Halide: incorrect results with NaN and halide.maximum on a reduction domain

I am using TargetFeature.OpenCL with TargetFeature.CLDoubles.

Expected: halide.maximum behaves like np.max, so halide.maximum over an array of NaN values should return NaN.
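
For reference, here is a small NumPy-only sketch of the behavior I am expecting. It does not involve Halide at all; the variable names are just for illustration, and np.fmax is included only as a contrast because it prefers the non-NaN operand.

```python
import math

import numpy as np

# Same input as the reproduction: two NaNs stored as float64.
all_nan = np.asarray([np.nan] * 2, dtype=float)
mixed = np.asarray([1.0, np.nan], dtype=float)

# np.max propagates NaN: a single NaN in the input makes the result NaN.
all_nan_max = np.max(all_nan)
mixed_max = np.max(mixed)

# np.fmax ignores a NaN operand when the other operand is a number.
mixed_fmax = np.fmax.reduce(mixed)

print(math.isnan(all_nan_max), math.isnan(mixed_max), mixed_fmax)
```

The first two results are what I would expect halide.maximum to match.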

Actual: the LLVM codegen returns -inf and the C codegen returns 0. Both are incorrect. By LLVM codegen I mean using the object file produced by Output.object; by C codegen I mean compiling the Output.c_source file myself.

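from pathlib import Path

import numpy as np
from halide import (Buffer, Func, Output, RDom, Range, TailStrategy, Target,
                    TargetFeature, Var, get_host_target, maximum)

def find_gpu_target() -> Target: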
    host_target = get_host_target()
    return Target(host_target.os, host_target.arch, host_target.bits,
                  [TargetFeature.OpenCL, TargetFeature.CLDoubles])

i = Var()
block, thread = Var(), Var()

nan_buffer = Buffer(np.asarray([np.nan] * 2, dtype=float))
nan_max = Func("nan_max")

rdom = RDom([Range(0, 2)])
nan_max[i] = maximum(nan_buffer[rdom.x + i])
nan_max.gpu_tile(i, block, thread, 16, tail=TailStrategy.GuardWithIf)

root_dir = Path.cwd()
(root_dir / "include").mkdir(parents=True, exist_ok=True)

nan_max.compile_to(outputs={
    Output.c_header: str(root_dir / "include" / "reprod.h"),
    Output.c_source: str(root_dir / "reprod.cc"),
    Output.llvm_assembly: str(root_dir / "reprod.ll"),
    Output.object: str(root_dir / "reprod.o"),
    Output.stmt: str(root_dir / "reprod.txt"),
}, arguments=[], fn_name="consumer", target=find_gpu_target())

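#include <iostream>

#include "HalideBuffer.h"
#include "reprod.h"  // generated by Output.c_header above

int main() {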
    Halide::Runtime::Buffer<double> out(1);
    consumer(out);
    out.copy_to_host();
    std::cerr << *out.begin() << std::endl;
}

Apr 08 '22 13:04 knzivid

Are you compiling with strict_float? If not, Halide defaults to the equivalent of -ffast-math, which allows LLVM to assume that NaN and Inf never exist.
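
To illustrate why that assumption matters, here is a plain-Python model of a max reduction. This is only a sketch, not Halide's generated code: the function names are hypothetical, and starting the accumulator at -inf is an assumption chosen to match the -inf reported above. Every ordered comparison against NaN is false, so a max built from a bare comparison can silently drop NaN inputs.

```python
import math


def max_by_comparison(values):
    """Max reduction using only '>' (hypothetical model, not Halide code)."""
    acc = float("-inf")  # assumed starting value of the reduction
    for x in values:
        # 'x > acc' is False whenever x is NaN, so NaN never replaces acc.
        acc = x if x > acc else acc
    return acc


def max_propagating_nan(values):
    """Max reduction that explicitly keeps NaN once it has been seen."""
    acc = float("-inf")
    for x in values:
        if math.isnan(x) or x > acc:
            acc = x
    return acc


nan = float("nan")
dropped = max_by_comparison([nan, nan])
kept = max_propagating_nan([nan, nan])
print(dropped, kept)
```

With the comparison-only version the NaNs are skipped and the starting value comes back unchanged, while the explicit check returns NaN the way np.max does.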

Apr 08 '22 17:04 steven-johnson

Good catch. I was not using it. When I add TargetFeature.StrictFloat, I still get the same output.

Apr 08 '22 17:04 knzivid