ndarray icon indicating copy to clipboard operation
ndarray copied to clipboard

Provide element-wise math functions for floats

Open KmolYuan opened this issue 4 years ago • 20 comments

Closes #992.

Add a new module impl_float_maths under std feature.

If there are missing anything you like, or something should not comes here, any suggestion are welcome!

For integer things, maybe it belongs to another PR.

Three types of operators done by three local macros:

  • boolean_op: Returns a boolean array, join them together by *_any methods.
  • unary_op: Unary functions.
  • binary_op: Binary functions.

New functions I made:

  • is_nan_any: A "any" operation under is_nan.
  • is_infinite_any: A "any" operation under is_infinite_any.
  • square: A x*x shortcut.
  • clip: Same as Numpy clip, done by using Float::max and Float::min. This function has a doctest, copied from Numpy.

Ported functions: (here is the Float trait)

  • is_nan
  • is_infinite
  • floor
  • ceil
  • round
  • trunc
  • fract
  • abs
  • signum
  • recip
  • powi
  • powf
  • sqrt
  • exp
  • exp2
  • ln
  • log
  • log2
  • log10
  • abs_sub
  • cbrt
  • sin
  • cos
  • tan
  • to_degrees
  • to_radians

KmolYuan avatar Jul 13 '21 03:07 KmolYuan

Looks to be in the right direction, that's nice - I'll be a bit unreachable during the summer, but back in a bit

bluss avatar Jul 19 '21 00:07 bluss

Could you actually implement the Float trait for an ndarray? You seem to have provided implementations for most if not all of the required methods.

The main reason is that these methods need to import Float trait, otherwise they are not allowed to be used. The Float trait is still has some requirement, I think this is not easy to solve, such as the Num trait.

Although these are not part of the Float trait, do you think there is room for greater_than, less_than, and the rest of the family for a scalar operand? ie array![1, 2].greater_than(1) -> array![false, true]

I'm willing to add more useful methods as you mentioned, if they are compatible with the requirements.

KmolYuan avatar Aug 19 '21 14:08 KmolYuan

The Float trait is still has some requirement, I think this is not easy to solve, such as the Num trait.

Ah yes you're right, implementing Float requires implementing Num, which requires from_str_radix(str: &str, radix: u32) -> Result<Self, Self::FromStrRadixErr> which doesn't make sense for an array. It also has to implement NumCast which doesn't make sense either.

I'm willing to add more useful methods as you mentioned, if they are compatible with the requirements.

What requirements are you talking about?

multimeric avatar Aug 19 '21 14:08 multimeric

What requirements are you talking about?

Oh, I mean the functions should be implemented with the Float trait.

KmolYuan avatar Aug 19 '21 14:08 KmolYuan

Looks like that won't be a problem if we don't bother with actually implementing Float.

multimeric avatar Aug 19 '21 14:08 multimeric

Additionally, I think there should be add a #[must_use] notation, same as the standard library.

#[must_use = "method returns a new array and does not mutate the original value"]

KmolYuan avatar Aug 20 '21 01:08 KmolYuan

I haven't added my general perspective, I think.

We have avoided these methods in the past for two reasons:

  • Focus on general ndarray features, not specific ones. mapv covers all of these. But I agree that these are useful, if we have contributors that want to maintain them
  • Focus on performance. These methods are not efficient, because they create new arrays - allocation and copying - for all operations. But they are convenient, and a lot of times the user doesn't need to care. But it's something to be aware of and keep in mind. We need to explain this to users as well, and we try in various places in the documentation. We provide things like Zip that allows the user to modify and transform elements in-place.

bluss avatar Nov 01 '21 19:11 bluss

These methods are not efficient, because they create new arrays - allocation and copying - for all operations. But they are convenient, and a lot of times the user doesn't need to care.

Could we not solve this by having in-place versions of all these methods? It would also force users to decide which one they need in each case instead of always picking the simpler code. Should be fairly simple to add by updating the macros.

multimeric avatar Nov 02 '21 00:11 multimeric

In-place versions are just a band aid, not a solution . Actual solutions does most the needed operations in one pass, not one by one (how do you handle chaining operations). I'd avoid in-place methods for now. 🙂

bluss avatar Nov 02 '21 06:11 bluss

Is this to do with loading the elements into CPU registers or something? Because otherwise I can't see why processing the whole array in one pass and then again in a second pass is any worse than applying the same transformations to each element sequentially. I would have thought that avoiding a new array allocation would dwarf this optimisation.

In any case if that would be a significant optimisation could we do some kind of chain pattern like:

arr.chain().powi(2).exp().eval()

multimeric avatar Nov 02 '21 06:11 multimeric

Is this to do with loading the elements into CPU registers or something?

With contemporary systems, moving the data from memory into caches and registers often dominates the runtime of numerical algorithms depending on how many arithmetic operations they perform relative to each memory access. Fusing operations from multiple passes into a single pass increases this ratio and thereby the likelihood of efficient CPU utilisation.

adamreichold avatar Nov 02 '21 07:11 adamreichold

@multimeric Good that we got to explain processor caches. This will help a lot when working with Rust.

Your point about eval is close to what I'm thinking of too. Not in particular that we do that here, but that we lay the groundwork for a "language", where you can call .exp() etc on arrays.

Then a shim/layer can be inserted that has the same or sameish interface but lazy evaluation, like in your sketch! Both the in-place instead of allocating and batching operations instead of one per loop ideas are important here. There's some prior art around this. I don't know eigen (C++) very well but I think they do something like this. And the very old Rust experiment algebloat also tried this.

bluss avatar Nov 03 '21 17:11 bluss

It would be exciting if

  1. We traitifed ndarray - not so many specific "Array", "ArrayView" in the signatures, instead just an impl NdArray<Elem=f64> or something would be enough.
  2. A lazy expression could implement the same trait, i.e you could pass array.lazy().exp() in place of an array.

I don't know what kinds of techniques can be used in Rust to make this possible. We also don't want to make the type system situation too hard to understand, unfortunately.

bluss avatar Nov 03 '21 17:11 bluss

Okay I'll move the discussion of this lazy eval into a new issue for now so we can get this PR merged.

multimeric avatar Nov 04 '21 01:11 multimeric

I wanted to say, if you can, please use rebase and not merges when updating the PR branch, we don't want merges crossing. There are of course exceptions to every rule, but in general this is what we want to do. I've force-pushed the branch with a rebase, since it was easy to do. (This is possible when maintainers are allowed to edit.) Feel free to squash together commits to clean up history if you'd like.

bluss avatar Nov 08 '21 19:11 bluss

In the spirit of issue #415 we might prefer to expose these methods with a bound like A: 'static + Float. 'static is for the moment the key to making ad-hoc type specialization (this is how we do it for matmul). I.e if we need to explicitly special case f32, f64 etc.

bluss avatar Nov 12 '21 22:11 bluss

Just to add this also solves https://github.com/rust-ndarray/ndarray/issues/1047 I think.

emmatyping avatar Nov 16 '21 12:11 emmatyping

fwiw, no rush, this PR is waiting on updates from the author

bluss avatar Dec 07 '21 17:12 bluss

Any update of this PR?

killme2008 avatar May 17 '22 10:05 killme2008

This would be a very useful addition.

Is there any more work that needs to happen so this can be merged?

fgsch avatar Sep 21 '22 14:09 fgsch

Looking forward to this...

danjenson avatar Nov 27 '22 18:11 danjenson

What's the status in this PR? Would be great to have all those methods implemented...

AnotherCoolDude avatar Nov 17 '23 09:11 AnotherCoolDude

Same as last time: "this PR is waiting on updates from the author"

Since this MR has been discussed and reviewed, I wouldn't mind merging it if @KmolYuan updates this MR.

nilgoyette avatar Nov 17 '23 15:11 nilgoyette

Same as last time: "this PR is waiting on updates from the author"

Since this MR has been discussed and reviewed, I wouldn't mind merging it if @KmolYuan updates this MR.

Sure, if the API & docs are accepted, I agree to merge this PR.

KmolYuan avatar Nov 20 '23 09:11 KmolYuan

@KmolYuan What about the clamp function? Shouldn't we use the one from std?

clamp is stable since 1.50 and the minimum Rust version in this project is 1.51.

nilgoyette avatar Nov 24 '23 16:11 nilgoyette

@KmolYuan What about the clamp function? Shouldn't we use the one from std?

clamp is stable since 1.50 and the minimum Rust version in this project is 1.51.

Thanks, I'll use num_traits::clamp for generic floating number types.

KmolYuan avatar Nov 26 '23 12:11 KmolYuan

@adamreichold Any idea why the CI tests fail? You repaired them in the last merged MR, so I would have thought that the tests would still be ok. It's still using 1.51 and the errors are not related to this MR.

nilgoyette avatar Dec 02 '23 01:12 nilgoyette

@adamreichold Any idea why the CI tests fail? You repaired them in the last merged MR, so I would have thought that the tests would still be ok. It's still using 1.51 and the errors are not related to this MR.

These errors are due to new lints introduced into rustc itself since then. It does check pub use in addition to use now and checks against collision with possibly upcoming but still unstable names. I think we have to fix the former and suppress the latter for now, c.f. #1337

adamreichold avatar Dec 02 '23 06:12 adamreichold

Thank you @KmolYuan, and sorry for the delay.

nilgoyette avatar Dec 06 '23 02:12 nilgoyette

I am sorry to reopen this discussion. For some reason, my project doesn't pick up the merged changes. I would expect a version bump that tells cargo to update the package. I could probably refer to the commit specifically, but that doesn't feel right. Is that a mistake on my side?

AnotherCoolDude avatar Dec 12 '23 12:12 AnotherCoolDude