arrow icon indicating copy to clipboard operation
arrow copied to clipboard

[C++][Compute] Add quotient and modulo kernels

Open asfimport opened this issue 4 years ago • 10 comments

Add a pair of binary kernels to compute the:

  • quotient (result after division, discarding any fractional part, a.k.a integer division)

  • mod or modulo (remainder after division, a.k.a % / %% / modulus).

    The returned array should have the same data type as the input arrays or promote to an appropriate type to avoid loss of precision if the input types differ.

Reporter: Ian Cook / @ianmcook

Related issues:

PRs and other links:

Note: This issue was originally created as ARROW-12755. Please see the migration documentation for further details.

asfimport avatar May 12 '21 14:05 asfimport

Antoine Pitrou / @pitrou: Perhaps this could be a single "divmod" kernel if that makes the computation more efficient?

(note "divmod" is the Python name for that functionality)

Should also choose what happens for negative divisors.

asfimport avatar Jun 10 '21 16:06 asfimport

yibocai#1: For the record, libdivide may be useful for optimal integer division performance. https://github.com/ridiculousfish/libdivide

NOTE: this is only for vector/scalar, when there's only one divisor.

asfimport avatar Jul 29 '21 01:07 asfimport

yibocai#1: I agree "divmod" to return quotient and remainder at once is more efficient. x86 idiv instruction always returns quotient together with remainder. In computation cost, div = mod = divmod. Arm sdiv instruction does division only. So div < mod = divmod. We can use the existing "division" kernel if we only want the quotient. https://godbolt.org/z/fvGh3vh3f

asfimport avatar Jul 29 '21 01:07 asfimport

Eduardo Ponce / @edponce: @cyb70289 thanks for the tips!

asfimport avatar Jul 29 '21 02:07 asfimport

Krisztian Szucs / @kszucs: Since the PR is in draft, I'm postponing to 8.0

asfimport avatar Jan 14 '22 14:01 asfimport

Apache Arrow JIRA Bot: This issue was last updated over 90 days ago, which may be an indication it is no longer being actively worked. To better reflect the current state, the issue is being unassigned per project policy. Please feel free to re-take assignment of the issue if it is being actively worked, or if you plan to start that work soon.

asfimport avatar Oct 13 '22 17:10 asfimport

Dr. Jan-Philip Gehrcke: Bonjour! I wanted to use the modulo operator today in a filter expression in a scanner. Antoine and Jacob pointed me here. https://github.com/apache/arrow/pull/11116 has stalled a bit seemingly. If I can, I would love to help land that.

asfimport avatar Dec 15 '22 12:12 asfimport

Eduardo Ponce / @edponce: Hi, [~jgehrcke]! I will be glad to share with you this PR and what is missing to get it over the finish line. As I understand this PR is mostly complete, but it requires to be fully rebased (I did some minor updates a few minutes ago) and we need to add more unit tests.

asfimport avatar Dec 15 '22 16:12 asfimport

A modulo kernel would also be nice for duration types (which are just integers under the hood). For instance, in time series a common task is autodetect the frequency at which time stamps are provided, which can be done by computing the GCD of all time deltas.

randolf-scholz avatar Apr 24 '24 14:04 randolf-scholz

Related discussion: [DISCUSSION][C++] Re-proposal of 'divmod' in Compute Functions https://lists.apache.org/thread/h3t6nz1ys2k2hnbrjvwyoxkf70cps8sh

kou avatar Jun 25 '25 21:06 kou