hpkfft.com comments

Results: 23 comments by hpkfft.com

Specify correct rounding for sqrt

By default, the CUDA compiler sets `-prec-div=true`, `-prec-sqrt=true`, and `-ftz=false` (see https://docs.nvidia.com/cuda/floating-point/index.html#compiler-flags). The CuPy library is compiled with `-ftz=true`, overriding the default for that one flag. Thanks, @leofang, for this info.

Specify correct rounding for sqrt

Note that an IEEE square root can never produce a subnormal result, so a hardware mode that flushes subnormal results to zero is irrelevant.
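
A quick way to check this claim is to take the square root of the smallest positive subnormal number and compare it with the smallest normal number. The sketch below uses Python doubles (binary64) as a stand-in, on the assumption that `math.sqrt` maps to a correctly rounded IEEE square root on the platform. The same argument applies to binary32 with exponents -149 and -126.

```python
import math
import sys

# Smallest positive subnormal double: 2**-1074.
smallest_subnormal = math.ldexp(1.0, -1074)

# Smallest positive normal double: 2**-1022.
smallest_normal = sys.float_info.min

# sqrt(2**-1074) is exactly 2**-537, far above the subnormal range.
root = math.sqrt(smallest_subnormal)

# Square root is monotonic, so no positive input can give a smaller result.
print(root >= smallest_normal)
```

Because the smallest possible positive input already yields a normal result, flushing subnormal results to zero has nothing to act on.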

Minor clarification on allowed rounding mode

By the way, I would support a recommendation that array libraries offer directed rounding as an option. One way to try to assess the effects of roundoff upon a floating-point...
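
As one possible illustration of what directed rounding offers, the sketch below runs the same computation twice, once rounding every operation toward negative infinity and once toward positive infinity, and uses the two results to bracket the exact answer. It uses Python's `decimal` module with a deliberately small precision as a stand-in for hardware rounding modes. The function name `sum_of_thirds` and the choice of computation are hypothetical, picked only for the example.

```python
from decimal import Decimal, Context, ROUND_FLOOR, ROUND_CEILING

def sum_of_thirds(ctx, n=10):
    """Add 1/3 to itself n times, rounding every step as ctx dictates."""
    third = ctx.divide(Decimal(1), Decimal(3))
    total = Decimal(0)
    for _ in range(n):
        total = ctx.add(total, third)
    return total

# Six significant digits make the roundoff easy to see.
round_down = Context(prec=6, rounding=ROUND_FLOOR)
round_up = Context(prec=6, rounding=ROUND_CEILING)

lower = sum_of_thirds(round_down)
upper = sum_of_thirds(round_up)

# All operands are positive, so the exact value 10/3 lies in [lower, upper].
print(lower, upper)
```

The width of the interval gives a rough upper bound on how much roundoff the computation accumulated. This bracketing only works directly for monotone computations like this one; in general, each operation needs its own choice of rounding direction.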

Is there a 3.13 nogil container?

Given the significant improvements to the free-threaded mode in Python 3.14, I would suggest prioritizing #1052 over providing a container for free-threaded 3.13.

Return value policy gray zone

A quick thought (that might appear simpler to developers) would be `rvp::lookup | rvp::move` instead of `return_existing_or_move`.

Return value policy gray zone

Is it useful to have both `copy` and `move`? I haven't had the occasion to use this aspect of nanobind much, so I may be missing something obvious. However, I'm...

[BUG]: Runtime is not as expected

Just some quick thoughts: 1. Are you using `cmake -DCMAKE_BUILD_TYPE=Release`? 2. If you are (and you should be), I think the default is `O3`, so you probably don't need `add_compile_options(-O3)`. 3....

Misspelled freethreading in 3.14.0b3

I'm OK with any decision that is made; I just want my own documentation to be correct and professional. If Python wants to declare that one should use `freethreading` as...

[Bug] - Include file omp.h not found by clang-18

Note that I have installed `libomp-devel`. The only version I see available is `15.0.7-5.amzn2023.0.1`. If it's helpful for testing, here's a sample program, `test.cpp` ``` #include #include int main() {...

[Bug] - Include file omp.h not found by clang-18

Yes, thank you, or use the compiler flag `-I/usr/lib64/clang/15.0.7/include`. I just thought the Amazon Linux packaging team would like to know about this issue.