
Add bigints to standard library

Open uninhm opened this issue 5 years ago • 87 comments

Summary

Add arbitrary precision integers to standard library

Solution

Add the bigints library (it is on nimble) to the standard library

https://github.com/def-/nim-bigints

uninhm avatar Jun 17 '20 03:06 uninhm

Yeah, maybe we should really do that. The compiler itself would benefit.

Araq avatar Jun 17 '20 06:06 Araq

Before sanctioning a particular implementation of bigints into the stdlib, some performance improvements are in order. There is also https://github.com/FedeOmoto/bignum, which is a wrapper around GMP, and if I recall correctly it is significantly faster (maybe things have improved lately). See also the discussion in https://forum.nim-lang.org/t/3677. The ideal would be to have a fast, pure Nim implementation of bigints.

andreaferretti avatar Jun 17 '20 07:06 andreaferretti

  • https://github.com/nim-lang/RFCs/issues/228 would allow nicer syntax 123'bigint instead of initBigInt"123" but can be done as a syntax upgrade so is not a blocker.
  • https://github.com/FedeOmoto/bignum hasn't been updated in 5 years and basic usage seems to fail (https://github.com/FedeOmoto/bignum/issues/2) /cc @FedeOmoto, but it could be revived; that said, since it's a wrapper, it may not work at CT unless using CT FFI

The ideal would be to have a fast, pure nim implementation of bigints

For something like bigint, I'd say performance matters more than a pure Nim implementation if I had to choose between the two, unless the performance difference is small enough. But it also has to work at CT, which requires one of: CT FFI, a pure Nim implementation fallback for the VM, or integration via vmops (which is not ideal, as it ties the compiler to a bignum library)

timotheecour avatar Jun 17 '20 07:06 timotheecour

For the Nim compiler I don't care much about the speed, it's much more important that it has no dependencies.

Araq avatar Jun 17 '20 07:06 Araq

def-/nim-bigints is maintained by contributors. It is not a good candidate.

planetis-m avatar Jun 17 '20 10:06 planetis-m

Why not? Sometimes software is finished.

Araq avatar Jun 17 '20 10:06 Araq

Why not? Sometimes software is finished.

If you look at the TODO, it's clear that this isn't finished yet. It's a good start though.

I think it's important to have a pure Nim BigInt library, rather than calling out to some potentially faster other (C) based library. IMO, Nim is a systems programming language, and if we can't have a Nim (maybe Nim + asm) library that's roughly as performant as a C + asm library, then Nim needs to be enhanced.

bpr avatar Jun 18 '20 00:06 bpr

Personally I'd prefer stint due to its maturity, but it appears that the library depends on parts of nim-stew...

Also, shouldn't something like this be added to fusion instead of the stdlib?

alaviss avatar Jun 18 '20 02:06 alaviss

I think varints can be replaced by a new bigints on stdlib, then stdlib remains kinda constant in size.

juancarlospaco avatar Jun 18 '20 03:06 juancarlospaco

I think varints can be replaced by a new bigints on stdlib, then stdlib remains kinda constant in size.

Er uh, I think that's not how it works. ;-)

Araq avatar Jun 18 '20 05:06 Araq

I was thinking varints was supposed to be a bigint kinda thing but was never completely implemented, sorry.

juancarlospaco avatar Jun 18 '20 14:06 juancarlospaco

GMP is a big no-no due to the license and not working at compile-time

I don't mind contributing a bigint from scratch, it's not like I wrote those 3 times already:

  • Stint: https://github.com/status-im/nim-stint
  • Stint current refactor: https://github.com/status-im/nim-stint/pull/104
  • Constantine: https://github.com/mratsim/constantine/blob/master/constantine/arithmetic/limbs.nim

Note that the Constantine backend and the Stint refactor range from 2x faster than GMP on modular arithmetic to sometimes slower, for bigints below 1024 bits (I didn't test much beyond that, and didn't test non-modular arithmetic)

The main reasons behind stew dependencies are threefold:

  • Nim's endians doesn't work at compile-time and calls things integers that are actually raw bytes
  • No utilities for seq[byte] and byte arrays in the stdlib
  • bitops noUndefined returning 0 instead of the bitwidth of the type forces everyone to reimplement countLeadingZero or countTrailingZero: https://github.com/nim-lang/Nim/blob/234f4a27e1ff30100f446beb0ec3cb33d7e7b36a/lib/pure/bitops.nim#L397-L399. I am unaware of any use-cases where 0 is the correct answer.
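To make the last point concrete: the convention being argued for is that a leading-zero count of an all-zero word should return the type's full bit width, not 0. A hypothetical Python sketch of that semantics:

```python
def count_leading_zeros(x: int, bit_width: int = 64) -> int:
    """Count leading zero bits of a bit_width-wide word.

    Returns the full bit width for x == 0 -- the convention the
    comment above argues for, rather than returning 0.
    """
    n = 0
    for i in range(bit_width - 1, -1, -1):
        if x & (1 << i):
            break
        n += 1
    return n

assert count_leading_zeros(0) == 64       # full width, never 0
assert count_leading_zeros(1) == 63
assert count_leading_zeros(1 << 63) == 0
```

With this convention, `bit_width - count_leading_zeros(x)` gives the bit length of `x` with no special case for zero, which is exactly what limb-normalization code in a bigint library needs.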

In terms of BigInt usage, I would need to know more about the use-cases:

  • What kind of API: out-of-place like integers proc +(a, b: BigInt): BigInt or in-place like GMP
  • Do we target embedded? BigInts require dynamic memory management, so the API would have to be GMP-like instead of "integer-like", i.e. only in-place operations
  • Do we want it thread-safe?
  • Do we assume that people might do BigInt operation in a loop, in that case if we use the out-of-place API we might want a caching allocator as memory allocation will be a bottleneck
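As a rough illustration of the trade-off in the first and last bullets, here is a hypothetical Python sketch (the class and method names are invented for illustration, not a proposed API):

```python
class BigInt:
    """Toy bigint backed by a Python int; a real library would use limbs."""
    def __init__(self, v: int = 0):
        self.v = v

    # Out-of-place, "integer-like" style: every call allocates a result.
    def __add__(self, other: "BigInt") -> "BigInt":
        return BigInt(self.v + other.v)

    # In-place, GMP-like style: the caller provides the destination,
    # so a hot loop can reuse a single allocation.
    def add_into(self, a: "BigInt", b: "BigInt") -> None:
        self.v = a.v + b.v

a, b = BigInt(2), BigInt(3)
c = a + b                 # ergonomic, but allocates on every iteration
acc = BigInt()
acc.add_into(a, b)        # allocation-free inner loop
assert c.v == acc.v == 5
```

The GMP-like style is what makes embedded targets and allocation-heavy loops viable; the integer-like style is what users expect ergonomically, which is why a library may want both layers.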

mratsim avatar Jun 18 '20 14:06 mratsim

If the compiler needs big integers, shouldn't it finally start to use Nimble packages? stint, as it exists right now, is a mature implementation of fixed-size integers. Duplicating it wouldn't be beneficial for anyone. The notion that the standard library is what the compiler needs to use is a damaging policy that prevents the Nim team from benefiting from collaboration with any other team. It wouldn't be that hard to add a bunch of git submodules and -p:<path> directives in the compiler nim.cfg file and solve the problem in an afternoon.

zah avatar Jun 18 '20 15:06 zah

@zah Agreed. However, I'm in no rush in adding big integer support for Nim and busy with other work considered more important.

Araq avatar Jun 18 '20 15:06 Araq

If the compiler needs big integers, shouldn't it finally start to use Nimble packages? stint, as it exists right now, is a mature implementation of fixed-size integers. Duplicating it wouldn't be beneficial for anyone. The notion that the standard library is what the compiler needs to use is a damaging policy that prevents the Nim team from benefiting from collaboration with any other team. It wouldn't be that hard to add a bunch of git submodules and -p:<path> directives in the compiler nim.cfg file and solve the problem in an afternoon.

But in competitive programming bigints are important, and you can't use external libraries

uninhm avatar Jun 18 '20 15:06 uninhm

What is competitive programming? Sites like HackerRank? IOI competitions? No libraries allowed there? Can't bring your own hard drive?

zah avatar Jun 18 '20 16:06 zah

If it helps competitive programming, that's a side effect, but I don't think Nim should fall into competitive-programming-driven development, unless those people doing competitive programming have demonstrated that they stick around in the Nim community so we can rely on them to maintain the code in the standard library (which is the main blocker for any module there).

mratsim avatar Jun 18 '20 16:06 mratsim

Maybe some inspiration can come from D's stdlib BigInt? https://dlang.org/phobos/std_bigint.html

I think competitive programming is an example use case, but maybe others benefit too, like Scientific Nim projects.

juancarlospaco avatar Jun 18 '20 16:06 juancarlospaco

If the compiler needs big integers, shouldn't it finally start to use Nimble packages?

I agree; what this would require is:

  • locking down all recursive dependencies to avoid a situation in which some unrelated change breaks bootstrap
  • introducing fork mirrors regularly updated for each recursive dependency to avoid a situation in which a bad force push breaks bootstrap

timotheecour avatar Jun 18 '20 18:06 timotheecour

@timotheecour, tracking the dependencies as git submodules solves both problems. You only have to do a little bit of manual work in adding the transitive dependencies by yourself.

zah avatar Jun 18 '20 18:06 zah

@zah since you brought this up, can you please create either an issue or an RFC so it can be discussed exclusively and separately in a dedicated location instead of here (it's more general than bigint)? I'll follow up there

timotheecour avatar Jun 18 '20 19:06 timotheecour

proposal

mini-gmp (subset of https://gmplib.org/) could be a good candidate to port/wrap:

  • single header, single source, no dependency
  • API-compatible with GMP, so users could link against -lgmp for GMP performance
  • could be used at CT using vmops
  • most importantly, performance is better than other alternatives apart from gmp itself:

The performance target for mini-gmp is to be at most 10 times slower than the real GMP library, for numbers of size up to a few hundred bits.

the only potential concern is licensing; as I understand it, dynamic linking to gmp shouldn't cause issues under the LGPL; if that's a problem (e.g. if static linking is needed), an alternative (slower but binary-compatible) implementation could be bundled with Nim, and users could opt into mini-gmp or gmp for performance as needed

GMP is a big no-no due to the license and not working at compile-time

vmops should solve the problem of working at CT (that's how the new hashes work in the VM, for example); and there's also --experimental:compiletimeFFI

benchmark: computing factorial(100) 1 million times

obviously a very partial benchmark but it's a start, feel free to suggest improvements/corrections

(all tests with clang -O2 or similar; on a recent mac)

# times in seconds
gmp: 1.48
EDIT: kaushalmodi/bignum (nimble install bignum): 1.50
mini-gmp: 2.24
python: 11s
std.bigint (D): 12s
nim using pkg/bigints: 14s
nim using pkg/stint: 41s
Big-Integer-C: 45s
tiny-bignum-c: 2260s

D:

for D's std.bigint (https://dlang.org/phobos/std_bigint.html) with:

ldmd2 -of/tmp/z05 -O -inline -release $timn_D/tests/bigints/z04.d
version   1.23.0 (DMD v2.093.1, LLVM 9.0.1)
import std.bigint;
import std.stdio;
void test1(bool isEcho){
  BigInt n = "100";
  BigInt ret = BigInt("1");
  while(n!=0){
    ret = ret * n;
    n = n-1;
  }
  if(isEcho) writeln(ret);
}

void main(){
  for(int i=0;i<1000000;i++){
    test1(false);
  }
  test1(true);
}

Big-Integer-C:

see test here: https://github.com/ilia3101/Big-Integer-C/pull/3

python

def test1():
  n = 100
  ret = 1
  while(n!=0):
    ret = ret * n
    n = n-1
  # print(ret)

def main():
  for i in range(0,1000000): test1()
main()

nim + pkg/bigints

import pkg/bigints

proc test1(isEcho: bool) =
  let n = 100
  var n2 = n.initBigInt
  var ret = 1.initBigInt
  while n2 != 0:
    # ret = ret * n2
    ret *= n2
    n2.dec
  if isEcho:
    echo ret

proc main=
  for i in 0..<1000000:
    test1(isEcho=false)
  test1(isEcho=true)

main()

nim + pkg/stint

https://github.com/status-im/nim-stint

import pkg/stint

proc test1(isEcho: bool)=
  const d = 1024
  var n = 100.stint(d)
  var ret = 1.stint(d)
  while n != 0.stint(d):
    ret = ret * n
    n = n-1.stint(d)
  if isEcho:
    echo ret

proc main=
  for i in 0..<1_000_000:
    test1(isEcho=false)
  test1(isEcho=true)
main()

EDIT: nim + pkg/bignum

this option (https://github.com/kaushalmodi/bignum) wasn't considered in the above thread; it's a fork of https://github.com/FedeOmoto/bignum that got a few updates to make it work (unfortunately under the same nimble name, see https://github.com/FedeOmoto/bignum/issues/3). It (unsurprisingly) gives the same performance as GMP, being a high-level wrapper around Nim's gmp wrapper; test file used in the above benchmark:

when true:
  import pkg/bignum
  proc test1(n,ret: var Int, isEcho: bool)=
    discard n.set 100
    discard ret.set 1
    while n != 0:
      ret *= n
      n.dec(1)
    if isEcho:
      echo ret

  proc main=
    var n = 100.newInt
    var ret = 1.newInt
    for i in 0..<1_000_000:
      test1(n, ret, isEcho=false)
    test1(n, ret, isEcho=true)
  main()

links

alternative not evaluated: openssl bn

see https://www.openssl.org/docs/man1.0.2/man3/bn.html, its license should not cause concern

timotheecour avatar Oct 05 '20 01:10 timotheecour

Hard no for GMP in the compiler. The license would be a nightmare.

Coincidentally, I've started to export Constantine's bigint internals into a self-contained variable-size bigint library: https://github.com/SciNim/megalo

Regarding Stint perf, the blocker is this: https://github.com/status-im/nim-stint/issues/87 which leads to quadratic bad behavior. A complete rewrite of the backend is WIP: https://github.com/status-im/nim-stint/pull/104

mratsim avatar Oct 05 '20 09:10 mratsim

considering that there are 5 options already for bigints in this thread and new ones keep popping up, and that the standard library is a graveyard for unmaintained code given how few people care to keep it up to date, perhaps it's wiser to let the dust settle before adding yet another library?

arnetheduck avatar Dec 11 '20 14:12 arnetheduck

Dear Nim-devs,

About two years ago I wrote a C++ constexpr bigint library (https://github.com/niekbouman/ctbignum)

Recently, I gave Nim a try and translated a small part of the library to Nim. Maybe it could help as inspiration for a standard library bignum implementation.

kind regards, Niek

template doubleWidthType(x: type): type =
  when x is uint8:  
      uint16
  elif x is uint16:  
      uint32
  elif x is uint32:  
      uint64

template bitWidth(x: type): int =
  when x is uint8:  
      8
  elif x is uint16:  
      16
  elif x is uint32:  
      32
  elif x is uint64:  
      64
  
assert doubleWidthType(uint8) is uint16  
assert bitWidth(uint8) == 8 

func addIgnoreLastCarry[N: static int; T](a: array[N, T], b: array[N, T]): array[N, T] = 
  var carry = T(0)
  for i in 0..<N:
    let aa: T = a[i]
    let sum: T = aa + b[i]
    let res: T = sum + carry
    carry = T((sum < aa) or (res < sum))
    result[i] = res

func mul[M, N: static int; T](u: array[M, T], v: array[N, T]): array[M + N, T] =
  type TT = doubleWidthType(T)
  for j in 0..<N:
    var k = T(0)
    for i in 0..<M:
      var t : TT  = TT(u[i]) * TT(v[j]) + result[i + j] + k
      result[i + j] = cast[T](t)
      k = T(t shr bitWidth(T))
    result[j + M] = k

func partialMul[M, N: static int; T](L: static int, u: array[M, T], v: array[N, T]): array[L, T] =
  type TT = doubleWidthType(T)
  for j in 0..<N:
    var k = T(0)
    var m = min(M, L - j)
    for i in 0..<m:
      var t : TT  = TT(u[i]) * TT(v[j]) + result[i + j] + k
      result[i + j] = cast[T](t)
      k = T(t shr bitWidth(T))
    if j + M < L:  
      result[j + M] = k

func toDigitArray(t: type, str: static string): array[str.len(), t] =
  for i, c in str:
    let x = int(c) - 48
    assert 0 <= x and x < 10
    result[i] = t(x)

func tightLen[N: static int; T](x : array[N,T]) : int = 
  for i in countdown(x.len - 1 ,0):
    if x[i] != 0:
      return i + 1
  return 0

func parseBigintHelper(T: type, x: static string): auto =
  const l = x.len;
  const N = int(1 + (10 * l) / (3 * bitWidth(T)))
  let digits = toDigitArray(T,x)
  var num : array[N, T]
  var powOfTen : array[N, T]
  powOfTen[0] = T(1)
  for i in countdown(l - 1, 0):
    num = addIgnoreLastCarry(num, partialMul(N,[digits[i]], powOfTen))
    powOfTen = partialMul(N,[T(10)], powOfTen)
  return num

func parseBigint(T: type, x: static string): auto =
  const num = parseBigintHelper(typeof T, x)
  const l = tightLen(num)
  var result : array[l, T]
  for i in 0..<l:
    result[i] = num[i]
  return result  
  
template bn32(x: static[string]): auto =  
  parseBigint(uint32, x)    

const example = bn32"4294967296"
echo example
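The double-width schoolbook multiplication in `mul` above can be cross-checked against a host language's native big integers. A hypothetical Python sketch of the same algorithm, using 32-bit limbs in little-endian order:

```python
MASK32 = (1 << 32) - 1

def mul_limbs(u, v):
    """Schoolbook multiplication of little-endian 32-bit limb lists,
    mirroring the `mul` routine above; result has len(u)+len(v) limbs."""
    result = [0] * (len(u) + len(v))
    for j in range(len(v)):
        k = 0  # carry limb
        for i in range(len(u)):
            t = u[i] * v[j] + result[i + j] + k   # double-width product
            result[i + j] = t & MASK32            # keep the low limb
            k = t >> 32                           # high limb carries over
        result[j + len(u)] = k
    return result

def to_int(limbs):
    """Recombine little-endian 32-bit limbs into a Python int."""
    return sum(l << (32 * i) for i, l in enumerate(limbs))

a, b = [0xFFFFFFFF, 0x1], [0x12345678]
assert to_int(mul_limbs(a, b)) == to_int(a) * to_int(b)
```

This kind of differential check against the host's arbitrary-precision integers is a cheap way to validate a limb implementation before fuzzing it.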

niekbouman avatar Dec 23 '20 21:12 niekbouman

@niekbouman judging from https://github.com/niekbouman/ctbignum/issues/39

With ctbignum, I am focusing on compile-time computations and run-time computations with integers whose limb size are known at compile time, and modular arithmetic with a modulus known at compile-time. I specifically target on operands with a few hundred bits (say, two to four 64-bit limbs), whereas mp++ seems to focus on implementing a "small-vector optimization" for big integers of arbitrary size.

In Nim, for CT, performance-sensitive code (like bigint/bignum) would best be handled via vmops instead of relying on the VM (e.g., see hashVmImplByte), so the CT aspect is not a primary concern; performance at RT is IMO the primary concern. Do you have links to benchmarks comparing ctbignum to mp++ (https://github.com/bluescarni/mppp)? Even better if they distinguish between use cases (e.g. bigints of fixed size vs bigints of arbitrary size)

Maybe it could help as inspiration for a standard library bignum implementation.

help welcome; wrapping might be easier than porting, at least in the meantime

@mratsim

Hard no for GMP in the compiler. The license would be a nightmare.

What about porting and/or wrapping https://github.com/bluescarni/mppp instead:

  • its licensing is much better (https://github.com/bluescarni/mppp/blob/master/COPYING)
  • much faster than GMP: https://bluescarni.github.io/mppp/integer_benchmarks.html
  • more modern codebase

timotheecour avatar Dec 23 '20 22:12 timotheecour

mppp wraps GMP

Built on top of the GNU multiprecision stack (GMP, MPFR, MPC)

I looked into the implementation and it's only a wrapper, which means the same licensing woes.

Besides the implementations in cryptographic libraries (which don't support unsigned), the fastest runtime C++ library today is likely https://github.com/fabhaas/xenonis, with the following limitations:

  • C++ requires the C++ backend, which doesn't work at compile-time or on the C, JavaScript, or Wasm targets.
  • While wrapping ttmath (another bigint with fixed size), we had regular C++ bugs due to "flexible array member in union", destructors, or something in that vein; easily a day every two weeks per person was lost on C++/Nim interop issues between C++ value types and Nim sequences.
    • https://github.com/status-im/nim-ttmath/issues/10
    • https://github.com/status-im/nim-ttmath/issues/14
  • C sources are not always recompiled if the object file is still present. This is a minor problem in C, but with C++ we often had templates that were not reinstantiated. And xenonis uses templates (https://github.com/fabhaas/xenonis/blob/5ef0ab8/include/bigint.hpp#L32), meaning manual nimcache cleanup before compilation.

Alternatively, you can lift LLVM's APInt: https://llvm.org/doxygen/classllvm_1_1APInt.html

So, my take is, it's fine to wrap any of those libraries but in the standard library we should have a pure Nim bigint library that doesn't have C++ gotchas.

Unfortunately for about 40 days my main computer was off just after I started Megalo and now I'm also trying to push CPS forward https://github.com/nim-lang/RFCs/issues/295.

That said, Megalo should be in good enough shape that someone can pick up the code and finish the non-compile-time, non-assembly implementation:

  • No unknown about the high-level API: in-place mutation for performance (like GMP) + out-of-place "int-like" for ergonomics
  • Differential fuzzing vs GMP is ready: https://github.com/SciNim/megalo/blob/9e7effb/tests/t_fuzz_vs_gmp.nim#L92-L97
  • Serialization/Deserialization only supports hex but decimal is straightforward, see @niekbouman code or Stint: https://github.com/status-im/nim-stint/blob/9e49b00/stint/io.nim#L242-L261
  • The low-level is abstracted away under a common interface with pure Nim and uses compiler int128 or intrinsics when available: https://github.com/SciNim/megalo/tree/9e7effb/megalo/primitives (that doesn't prevent GCC from generating horrible code though: https://gcc.godbolt.org/z/2h768y)

So what's left?

  • Ensuring that addition handles the messy matrix of negative/non-negative and a > b / a < b cases:
    • It is a big mess in xenonis as well: https://github.com/fabhaas/xenonis/blob/5ef0ab8/include/bigint.hpp#L32
    • (optional) To reduce clutter, I swap the inputs, but I have to use unsafeAddr, which prevents the code from working at compile-time. It may be that the new {.byaddr.} pragma helps: https://github.com/SciNim/megalo/blob/9e7effb/megalo/op_addsub.nim#L41
  • When I suspended my work, IIRC subtraction still had a bug linked to the negative/non-negative / a > b / a < b cases
  • Basic multiplication is done and Karatsuba-Comba can be added later; the only thing left is properly setting the negative flag.
  • Modulo and division need to be implemented. There is no sign problem (https://github.com/status-im/nim-stint/blob/9e49b00/stint/private/int_div.nim#L54-L58) but the algorithms are more complex.
  • Conversion to/from decimal strings.
  • adding CI
  • adding documentation (though BigInt are pretty much self documenting)
  • publish on nimble

I think this is less work than figuring out the intricacies of wrapping C++, and the last 3 items have to be done anyway for a wrapper.

Later, optimizing to reach decent performance at minimum requires:

  • copy-paste assembly for addition/subtraction
  • add Karatsuba (divide-and-conquer) and dispatch the base 2, 3, 4, 5 and 6 sub-limbs case to constantine assembly: https://github.com/mratsim/constantine/blob/e89429e/constantine/arithmetic/assembly/limbs_asm_mul_x86.nim#L112-L141
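For reference, the Karatsuba scheme mentioned here trades four half-size multiplications for three, at the cost of a few extra additions. A toy Python sketch (the cutoff and structure are illustrative, not Constantine's implementation):

```python
def karatsuba(x: int, y: int, cutoff_bits: int = 64) -> int:
    """Divide-and-conquer multiplication of non-negative integers:
    3 recursive half-size multiplies instead of 4."""
    if x.bit_length() <= cutoff_bits or y.bit_length() <= cutoff_bits:
        return x * y  # base case: schoolbook / hardware multiply
    h = max(x.bit_length(), y.bit_length()) // 2
    x1, x0 = x >> h, x & ((1 << h) - 1)   # split x = x1*2^h + x0
    y1, y0 = y >> h, y & ((1 << h) - 1)
    z2 = karatsuba(x1, y1)
    z0 = karatsuba(x0, y0)
    z1 = karatsuba(x1 + x0, y1 + y0) - z2 - z0  # the Karatsuba trick
    return (z2 << (2 * h)) + (z1 << h) + z0

a, b = 3 ** 300, 7 ** 250
assert karatsuba(a, b) == a * b
```

In a limb-based library the small fixed sub-limb cases (2-6 limbs) are exactly where hand-written or generated assembly pays off, which is what the dispatch to Constantine's assembly above is about.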

mratsim avatar Dec 24 '20 10:12 mratsim

I recently learned from Jonathan Protzenko (Microsoft Research) that one of his colleagues, Marina Polubelova, is working on an F* (formally verified) implementation of bignum arithmetic, which compiles to C. That could also serve as a good basis. (They release their work as Apache 2)

https://github.com/project-everest/hacl-star/tree/polubelova_bignum

niekbouman avatar Dec 24 '20 12:12 niekbouman

  • much faster than GMP: https://bluescarni.github.io/mppp/integer_benchmarks.html

BTW, those benchmarks are a bit cherry-picked, as they only show 1- and 2-limb sizes.

Although it was already concluded in this discussion that GMP is unsuitable for licensing reasons, it could still be interesting (for the sake of inspiration) to have a look at the 'generic' (non-asm) implementations of the mpn API (https://gmplib.org/manual/Low_002dlevel-Functions), and at this post by Fredrik Johansson: https://gmplib.org/list-archives/gmp-devel/2018-April/004853.html. He basically says that mpn's mpn_mul is slow for few limbs because it goes through many (runtime) case distinctions. In Nim, such case distinctions could of course be handled at CT with when. The way out (when using the mpn API) is to use mpn_sec_mul (schoolbook only) or the internal (not part of the public API) mpn_mul_basecase.

niekbouman avatar Dec 24 '20 13:12 niekbouman

Haskell comes with BigInt now.

juancarlospaco avatar Dec 31 '20 13:12 juancarlospaco