num-integer icon indicating copy to clipboard operation
num-integer copied to clipboard

Implement Power10 trait, with first method is_power_of_ten

Open mohrezaei opened this issue 6 years ago • 20 comments

Here is the first method in the new trait. Just sending it across for a quick review before I start adding more.

Do you prefer

  • one pull request, one commit, all methods
  • one pull request, one commit per method
  • one pull request per method?

mohrezaei avatar Nov 15 '18 20:11 mohrezaei

I think I need help with fixing the u128 case. I can't seem to hide the static array from the compiler with a cfg attribute.

I don't understand why it works with 1.20, which doesn't have u128, but not older...

mohrezaei avatar Nov 15 '18 20:11 mohrezaei

Looks like I managed to hide u128 :smirk: If there is a better way, let me know.

mohrezaei avatar Nov 15 '18 21:11 mohrezaei

FYI, this is making things a bit difficult.

mohrezaei avatar Nov 16 '18 18:11 mohrezaei

I prefer one PR with all planned methods, so we can treat it as ready-to-publish once merged, without further breaking changes. Adding more methods on a trait is breaking unless there's a default impl.

I don't care how many commits are in your PR, though I do prefer a clean working progression over seeing useless "fix it" kind of commits.

I don't understand why it works with 1.20, which doesn't have u128, but not older...

I believe the reason is that 1.20 has unstable support for u128, so it at least knows how to parse the big literals, where earlier versions didn't. Hiding it in a macro allowed those earlier versions to discard it without trying to tokenize the contents. An alternative workaround would be to use a const expression of smaller literals, like 10000000000 * 10000000000.

cuviper avatar Nov 16 '18 21:11 cuviper

Hmm, actually it looks like I haven't done a crate-wide fmt here before, so that would be a lot of noise in this PR. I need to get to that later...

cuviper avatar Nov 16 '18 21:11 cuviper

So I need a little help/sounding board here. After looking more closely at the performance, I have two different 64-bit implementations (perfect_hash, lz) that behave like so:

RUSTFLAGS="" perfect_hash: 9000 ns/iter. lz: 7300 ns/iter.

RUSTFLAGS="-C target-cpu=native" perfect_hash: 4100 ns/iter lz: 5000 ns/iter.

I think perfect_hash vectorizes better. I can't make up my mind about which one is the right one for a general purpose library :frowning_face: help?

mohrezaei avatar Nov 16 '18 22:11 mohrezaei

Those numbers look close enough that I'd lean toward whichever will be easier to maintain.

The vectorization of your microbenchmark may or may not represent how well it can be optimized in real use, where you're more likely to have other computations involved too. So try not to extrapolate too much from these results.

cuviper avatar Nov 16 '18 23:11 cuviper

I agree that vectorization is not an important consideration. When I turn off vectorization, lz benefits from native, but perfect_hash does not. I was also worried about 32bit platforms, but a quick check shows that lz with target-cpu=native is better, but worse without native at least on i686.

On balance, I think I'll go with lz for u64. Gotta run the numbers for u32.

Thanks.

mohrezaei avatar Nov 16 '18 23:11 mohrezaei

This should be a good basis for a full review. I did a cargo fmt just on the new files. The arrays became quite long, but I left them as fmt wanted. I'll squash again after review edits.

A few things:

  • There is a bit of a naming issue right now: Ten vs 10. Power10, floor_log10 vs. next_power_of_ten, is_power_of_ten
  • I modeled the wrapping/checked/next_power_of_ten the same way as std, but floor_log10 just returns 0 for 0 (no alternate methods).
  • All methods take &self. That makes a lot of sense for a trait that may be used for things other than primitives. std uses self for similar methods, which I feel is problematic for a Power2 trait down the road. I don't intend to implement Power2.

mohrezaei avatar Nov 19 '18 16:11 mohrezaei

@cuviper ping, just in case this got lost in the holiday shuffle :smile:

mohrezaei avatar Nov 28 '18 14:11 mohrezaei

Hi! This is still in my inbox to review, but I'll at least thrown opinions at your design questions. :)

  • There is a bit of a naming issue right now: Ten vs 10. Power10, floor_log10 vs. next_power_of_ten, is_power_of_ten

I think the method names are OK. The floats give a precedent for log10, and integer power_of_two gives a precedent for _ten, so we're on par with std inconsistency here. I could accept either way on the trait name, Power10 or PowerTen.

  • I modeled the wrapping/checked/next_power_of_ten the same way as std, but floor_log10 just returns 0 for 0 (no alternate methods).

I would expect 0 to be a panic case, because log(0) in any base is undefined.

You can also consider that log(1) is exactly 0, and log is strictly increasing, so floor_log(1) should be the least input which returns 0 -- leaving us with no answer for floor_log(0).

Aside on naming - I'm inclined to just call this log10 instead of floor_log10. I feel the flooring aspect is implicit when we're talking about unsigned integers, just like division.

  • All methods take &self. That makes a lot of sense for a trait that may be used for things other than primitives. std uses self for similar methods, which I feel is problematic for a Power2 trait down the road. I don't intend to implement Power2.

&self is appropriate, and I don't think it's a problem for Power2. Everything we would implement through std methods are Copy types, so they can just dereference to get self by value.

cuviper avatar Dec 01 '18 01:12 cuviper

Renamed floor_log10 to log10 and added checked/unchecked variants (wrapping didn't make sense?). Also made sure tests pass in release mode.

BTW, interesting curiosity: 0f64.log10().floor() == 0 (would've made a good excuse for the method name and zero behavior :sweat_smile: )

mohrezaei avatar Dec 07 '18 21:12 mohrezaei

BTW, interesting curiosity: 0f64.log10().floor() == 0

I get -inf: playground

And even that only makes from the positive side, lim{x->0+} log(x)

cuviper avatar Dec 07 '18 21:12 cuviper

Right, sorry, this got me confused: 0f64.log10().floor() as u32 == 0 playground

mohrezaei avatar Dec 07 '18 23:12 mohrezaei

It seems we ought to rather have a general Power trait with a method is_power_of(self, other) -> bool which checks whether self is a power of other.

strake avatar Dec 12 '18 19:12 strake

I suggested generic powers in #14 too, but I think it's still OK to specialize common bases.

cuviper avatar Dec 12 '18 19:12 cuviper

The last commit should address the latest review comments. Let me know if there is more to do (I'll squash the commits once the review is done).

Oh, I should clarify one thing: the f64 stuff was mostly a joke. It's so hard to get that through in writing (sigh... :sweat:)

mohrezaei avatar Dec 14 '18 22:12 mohrezaei

The benchmarks are failing on nightly:

failures:

---- benchl10_u32_up_to_10000 stdout ----
thread 'benchl10_u32_up_to_10000' panicked at 'undefined value for log of zero', src/power10.rs:763:1
note: Run with `RUST_BACKTRACE=1` for a backtrace.

---- benchl10_u64_up_to_10000 stdout ----
thread 'benchl10_u64_up_to_10000' panicked at 'undefined value for log of zero', src/power10.rs:769:1

cuviper avatar Dec 14 '18 23:12 cuviper

It was the zero panic. Pushed a fix. BTW, the always panic behavior has a 50% cost in performance for u32 (from 0.8 ns per call to 1.2 ns per call).

mohrezaei avatar Dec 14 '18 23:12 mohrezaei

BTW, the always panic behavior has a 50% cost in performance for u32

Well, microbenchmarks are hard to extrapolate into real use cases, mixed with other computations. For instance, if the optimizer has enough range information from context to know the input can't be zero, then the check will probably be removed altogether.

cuviper avatar Dec 15 '18 00:12 cuviper