root icon indicating copy to clipboard operation
root copied to clipboard

Adding LinSpace, LogSpace, and Arange prototypes

Open edfink234 opened this issue 9 months ago • 21 comments

This Pull request:

Changes or fixes:

Adds linSpace, logSpace, and arange functions to ROOT::VecOps namespace

Checklist:

  • [x] tested changes locally
  • [x] updated the docs (if necessary)

Fixes https://github.com/root-project/root/issues/17855

edfink234 avatar Mar 30 '25 21:03 edfink234

Ah, and also, the equality of double precision floating point is not working too well because of the very meaning of equality of floating point ;-)

dpiparo avatar Mar 31 '25 09:03 dpiparo

Test Results

    22 files      22 suites   3d 15h 57m 59s ⏱️  3 692 tests  3 687 ✅ 0 💤 5 ❌ 79 277 runs  79 272 ✅ 0 💤 5 ❌

For more details on these failures, see this check.

Results for commit 7fd6a3b8.

:recycle: This comment has been updated with latest results.

github-actions[bot] avatar Mar 31 '25 18:03 github-actions[bot]

Hi @dpiparo

Thank you for the quick response and feedback!

For this next commit, I’ve updated the functions as follows:

  • LinSpace, LogSpace, and Arange are now templated to deduce a common arithmetic type that defaults to double when necessary
  • I’ve adjusted parameter naming to match VecOps conventions by using a lowercase 'n' with a default power-of-two value (128).
  • The Doxygen documentation has been expanded to fully detail the template parameters, function behavior, usage examples, and to clarify key implementation details such as the check against LLONG_MAX.
  • I’ve updated the tests to use approximate (tolerance-based) comparisons for floating-point values with a CheckNear function

edfink234 avatar Apr 01 '25 08:04 edfink234

Dear @edfink234 thanks. I am reviewing the code. One thing which we will need is to provide a clean commit, properly formatted with all the changes - it is fine to have some "history" when crafting PRs but we try to keep the history of the repo as clean as possible.

dpiparo avatar Apr 01 '25 08:04 dpiparo

Thanks for the quick response @dpiparo! I will get to these hopefully today (Tuesday PST) if not at some point this week. It's almost 2 am where I am atm 😅...

edfink234 avatar Apr 01 '25 08:04 edfink234

@dpiparo I made another commit to address the new comments and modified the tests a bit to hopefully have them pass.

edfink234 avatar Apr 02 '25 01:04 edfink234

Thanks @edfink234 . Let's see how this goes.

dpiparo avatar Apr 02 '25 19:04 dpiparo

Thanks a lot for this!

I have some suggestions, since I think it would be good to mimick python numpy more:

  • Linspace/Logspace should accept function argument const bool endpoint = false/true; // for lin/log, see https://numpy.org/doc/2.1/reference/generated/numpy.linspace.html https://numpy.org/doc/2.1/reference/generated/numpy.logspace.html
  • when dtype=int, we should template-specialize, ie not static_cast the result, but rather "round towards -inf", as explained in Numpy docu.
  • Could Lin/Arange's default value of Common_t be automatically inferred based on the input args, as done in Numpy?. This would make it easier to do for(unsigned int i : Linspace<unsigned int>(0, 10, 11)) instead of currently for(unsigned int i : Linspace<unsigned int, unsigned int, unsigned int>(0, 10, 11)). If this is not possible, I would add a doxygen \note saying that this is different from Numpy.
  • Related to the previous point, in the Doxy docu, we could say that with C++23, you could just do a python-like enumerate for loop:
for (auto const [index, val] : std::views::enumerate(ROOT::VecOps::Linspace(6,10,16)))

ferdymercury avatar Apr 06 '25 08:04 ferdymercury

Thanks a lot for this!

I have some suggestions, since I think it would be good to mimick python numpy more:

  • Linspace/Logspace should accept function argument const bool endpoint = false/true; // for lin/log, see https://numpy.org/doc/2.1/reference/generated/numpy.linspace.html https://numpy.org/doc/2.1/reference/generated/numpy.logspace.html
  • when dtype=int, we should template-specialize, ie not static_cast the result, but rather "round towards -inf", as explained in Numpy docu.
  • Could Lin/Arange's default value of Common_t be automatically inferred based on the input args, as done in Numpy?. This would make it easier to do for(unsigned int i : Linspace<unsigned int>(0, 10, 11)) instead of currently for(unsigned int i : Linspace<unsigned int, unsigned int, unsigned int>(0, 10, 11)). If this is not possible, I would add a doxygen \note saying that this is different from Numpy.
  • Related to the previous point, in the Doxy docu, we could say that with C++23, you could just do a python-like enumerate for loop:
for (auto const [index, val] : std::views::enumerate(ROOT::VecOps::Linspace(6,10,16)))

I tried my best to address these, though for the second and third points, I don't know how to cleanly add a specialization without incurring an ambiguous compiler error wrt being unable to deduce the correct overload.

edfink234 avatar Apr 07 '25 09:04 edfink234

I tried my best to address these, though for the second and third points, I don't know how to cleanly add a specialization without incurring an ambiguous compiler error wrt being unable to deduce the correct overload.

Not sure, maybe it's easier to use just a ternary operator such as:

double result = std::is_floating_point ? static_cast : floor ;

ferdymercury avatar Apr 07 '25 11:04 ferdymercury

@dpiparo @vepadulano @ferdymercury I made two commits. The first one was changing temp.reserve(n); temp.push_back(val) in Linspace/Logspace/Arange temp(n); temp[i] = val; as I ran numerous tests to confirm that the latter is more efficient and easier for the compiler to optimise. The second commit is a separate commit that merges T1, T2, T3 into one template type T in case you prefer this version (which I gather may be the case); of course for this separate commit the test cases have been modified accordingly.

edfink234 avatar Apr 20 '25 22:04 edfink234

Thanks a lot!! From my side: this would be the only change left to do:

long double step = std::is_floating_point_v<CommonT> ? (end - start) / (n - endpoint) : static_cast<long double>(end - start) / (n - endpoint);
temp[i]=(std::is_floating_point_v<CommonT> ? static_cast<Ret>(start + i * step) : std::floor(start + i * step));

Since step must be floating even if CommonT is int. And for high N, we avoid extrapolation errors by always choosing doubles (or long doubles) even if CommonT is float.

ferdymercury avatar Apr 21 '25 06:04 ferdymercury

Thanks for the comments! I will get to these this weekend.

Just as an aside, below are some quickbench results for comparing temp.reserve(n); temp.push_back(val) vs temp(n); temp[i]=val:

edfink234 avatar Apr 25 '25 17:04 edfink234

Hi @dpiparo @vepadulano @ferdymercury

I added the changes suggested by @vepadulano and @ferdymercury. I'm not sure how I feel about the suggestion of @ferdymercury for step if I understand the need for all these casts and what the floor does exactly, but ok, I changed it and checked that it yields the expected output.

edfink234 avatar Apr 27 '25 22:04 edfink234

Hi @ferdymercury

I just pushed the new commit with these changes!

edfink234 avatar May 02 '25 00:05 edfink234

Hi @ferdymercury @dpiparo @vepadulano! I made a commit with the most recent changes suggested by @ferdymercury.

edfink234 avatar May 11 '25 21:05 edfink234

@ferdymercury @vepadulano @dpiparo I made the updates!

I also just wanted to share an updated benchmark using the actual ROOT RVec type instead of std::vector. Running the attached cpp file with the compilation directive g++ -std=c++20 -o linspaceTestAlternateRVec linspaceTestAlternateRVec.cpp -O2 -fno-inline-functions $(root-config --libs --cflags)

RVec

Linspace benchmark Time elapsed = 0.108864
Logspace benchmark time elapsed = 0.853611
Arange benchmark time elapsed = 0.0218328

This is better than I get with std::vector:

Linspace benchmark Time elapsed = 0.541573
Logspace benchmark time elapsed = 1.34327
Arange benchmark time elapsed = 0.0794047

vs Python numpy (python linspaceTest.py):

Linspace time elapsed = 2.097444 seconds
Logspace time elapsed = 4.336540 seconds
Arange time elapsed = 0.135273 seconds

LinspaceLogspaceArangeNumpyVsRVec.zip

edfink234 avatar May 19 '25 00:05 edfink234

Thanks a lot! LGTM, awesome work.

As a final comment, it's inconsistent that endpoint is const but not n, start and end. Same when you define n in Arange function within the function code. I think they should be either all const or none. (I guess neither choice will affect your nice benchmark speed if compiled with -O2).

Thanks for the comment. I was looking into marking pass-by-value parameters as const vs non-const and stumbled upon this lengthy Stack Overflow post with mixed opinions... I looked at the RVec.hxx file and it seems like the convention is to only mark parameters with default arguments as const and just leave regular pass-by-value parameters without the const specifier...

edfink234 avatar May 25 '25 22:05 edfink234

@vepadulano @dpiparo Any updates? I wasn't sure from the discussion about std::views if it was wanted to delete this documentation about it...

edfink234 avatar Jun 01 '25 22:06 edfink234

@vepadulano @dpiparo Any updates? I wasn't sure from the discussion about std::views if it was wanted to delete this documentation about it...

edfink234 avatar Jun 08 '25 21:06 edfink234

@vepadulano @dpiparo Any updates? I wasn't sure from the discussion about std::views if it was wanted to delete this documentation about it.

edfink234 avatar Jun 15 '25 21:06 edfink234

@vepadulano @dpiparo Any updates? I wasn't sure from the discussion about std::views if it was wanted to delete this documentation about it.

edfink234 avatar Jun 22 '25 21:06 edfink234

@vepadulano @dpiparo Any updates? I wasn't sure from the discussion about std::views if it was wanted to delete this documentation about it.

edfink234 avatar Jul 06 '25 21:07 edfink234

@vepadulano @dpiparo Any updates? I wasn't sure from the discussion about std::views if it was wanted to delete this documentation about it.

edfink234 avatar Jul 13 '25 21:07 edfink234

@vepadulano @dpiparo Any updates? I wasn't sure from the discussion about std::views if it was wanted to delete this documentation about it.

edfink234 avatar Jul 20 '25 21:07 edfink234

@vepadulano @dpiparo Any updates? I wasn't sure from the discussion about std::views if it was wanted to delete this documentation about it.

edfink234 avatar Jul 27 '25 21:07 edfink234

@vepadulano @dpiparo Any updates? I wasn't sure from the discussion about std::views if it was wanted to delete this documentation about it.

edfink234 avatar Aug 03 '25 21:08 edfink234

Is this PR still being considered for merging in the near future? It seems to me that this PR has gone a bit stale...

edfink234 avatar Sep 14 '25 21:09 edfink234

Ok, I believe I've addressed this comment, thanks @vepadulano

edfink234 avatar Oct 23 '25 06:10 edfink234

thanks again for this great functionality!

dpiparo avatar Oct 28 '25 16:10 dpiparo