library-checker-problems icon indicating copy to clipboard operation
library-checker-problems copied to clipboard

[Problem proposal] Integer Sort

Open adamant-pwn opened this issue 1 year ago • 5 comments

Problem name: Integer Sort (Optional) Problem ID: sort_integers

Problem

Given $n$ integers $A_1,\dots,A_n$, sort them.

Constraint

  • $1 \leq n \leq 10^7$, $1 \leq A_i \leq 10^9$.

Solution / Reference

  • Any fast sorting really

(Optional) Input

n
A1 ... An

(Optional) Output

B1 ... Bn

where $B_1 \leq \dots \leq B_n$.

(Optional) Note / Disucussion

  • Should ints be up to $10^{18}$?
  • Should we use other $n$? Large is suggested to make sure time difference from sort is visible enough.
  • Is there a way to provide a meaningful non-integer input to force a comparison-based sort?

adamant-pwn avatar Nov 13 '24 16:11 adamant-pwn

For generic comparison-based sort, we also already have Sort Points by Argument, but the constraints seem not large enough for it to be a nice sorting benchmark :thinking:

adamant-pwn avatar Nov 13 '24 16:11 adamant-pwn

I also wonder whether it's best to just give one array, or a lot of arrays and guarantee their total size is up to $10^7$? So that both the case of one large array, and a lot of small arrays are covered.

adamant-pwn avatar Nov 14 '24 18:11 adamant-pwn

In my benchmarks, a small constant radix sort can sort a random array of length $10^7$ in just 0.08s. However, when I add input and output (using library_checker's fastio), the total time increases to 0.27s. This shows that reading n integers becomes a bottleneck.

Theoretically, The complexity of reading $N$ integers is $O(n \log V)$.

Code (the radix sort implementation is adapted from platelet's code):

#include <algorithm>
#include <boost/timer/timer.hpp> // for accurate cpu time
#include <iostream>

#include "fastio.hpp" // library checker's 

constexpr int N = 10'000'000;

unsigned int a[N], b[N];

void sort(unsigned int* a, int __n) {
    unsigned int n = __n;
    unsigned int buc[4][256] = {};
    for (unsigned int i = 0; i < n; i++) {
        buc[0][a[i] & 255]++;
        buc[1][a[i] >> 8 & 255]++;
        buc[2][a[i] >> 16 & 255]++;
        buc[3][a[i] >> 24]++;
    }
    for (unsigned int k = 0; k < 4; k++) {
        unsigned int offset = 0;
        for (unsigned int i = 0; i < 256; i++)
            std::swap(buc[k][i], offset), offset += buc[k][i];
    }
    for (unsigned int i = 0; i < n; i++) b[buc[0][a[i] & 255]++] = a[i];
    for (unsigned int i = 0; i < n; i++) a[buc[1][b[i] >> 8 & 255]++] = b[i];
    for (unsigned int i = 0; i < n; i++) b[buc[2][a[i] >> 16 & 255]++] = a[i];
    for (unsigned int i = 0; i < n; i++) a[buc[3][b[i] >> 24 & 255]++] = b[i];
}

int main()
{
  library_checker::Scanner sc(stdin);
  library_checker::Printer pr(stdout);

  int n = 0;
  sc.read(n);

  for (int i = 0; i < n; i++) sc.read(a[i]);

  {
    boost::timer::auto_cpu_timer t(std::cerr);
    sort(a, n);
  }

  for (int i = 0; i < n; i++) {
    if (i) pr.write(' ');
    pr.write(a[i]);
  }
  pr.writeln();

  return 0;
}

result:

/tmp/tmp.XN9SL063Zy via 🐍 v3.12.7 
❯ python gen.py > in.txt 

/tmp/tmp.XN9SL063Zy via 🐍 v3.12.7 took 4s 
❯ ls -lh in.txt 
-rw-r--r-- 1 emsger emsger 103M 11月 15 16:06 in.txt

/tmp/tmp.XN9SL063Zy via 🐍 v3.12.7 
❯ time ./test < in.txt > /dev/null 
 0.090000s wall, 0.080000s user + 0.000000s system = 0.080000s CPU (88.9%)

real    0m0.313s
user    0m0.269s
sys     0m0.042s

EarthMessenger avatar Nov 15 '24 08:11 EarthMessenger

Good point. Might be worthwhile to give input as a generator, similar to KSMALL? But then it might be difficult to add special cases, like breaking certain quicksorts, or giving already sorted array...

adamant-pwn avatar Nov 16 '24 12:11 adamant-pwn

Regarding output, if necessary, it is possible to output an appropriate hash. As for input, aside from randomly generated methods, it is unavoidable that near-fastest submissions will require handling of input using SIMD.

As an alternative, we can offer C++ (Function) style submissions (e.g., https://judge.yosupo.jp/problem/convolution_mod). In this case, it is possible to compare execution times excluding input and output among submissions of this language.

maspypy avatar Nov 18 '24 13:11 maspypy