haxe icon indicating copy to clipboard operation
haxe copied to clipboard

Add BigInteger type

Open flashultra opened this issue 3 years ago • 30 comments

This pull request add BigInteger type in Haxe . The implementation is created by Chuck Batson and I have his permission to relicense his code under the Haxe license ( thank you @cbatson ! ).

Some things about the current implementation. I compare three BigInteger implementations - on Chuck ( https://github.com/cbatson/hxBitcoin/tree/master/com/fundoware/engine/bigint) , in haxe-crypto lib ( https://github.com/Jens-G/haxe-crypto/blob/master/src/com/hurlant/math/BigInteger.hx ) and littleBigInt ( https://github.com/maitag/littleBigInt/blob/master/src/BigInt.hx ).

The Chuck implementation give about 5x better performance and also is better than native Java implementation ( 3x times better !).

As a additional info - many languages have a BigInteger , such as Java, C#, Javascript

flashultra avatar Jul 07 '22 06:07 flashultra

Thanks, this would be useful to have in the standard library indeed. There are a couple of high-level comments I have (I haven't yet looked through all of the implementation):

  • For targets that support big integers natively (which you mention at the end of your description), can we use the native implementation? If it is in fact slower, maybe we can have a define to switch between the two versions, similar to the JSON define?
  • There should be tests, for all the API methods.
  • We might need other types to be aware of BigInt. What about the seraliser? What about additional methods to read/write in Input/Output?
  • Could @cbatson himself comment on whether the permission was indeed given to relicense the code?

Aurel300 avatar Jul 07 '22 10:07 Aurel300

Thank you for reviewing this pull request. About the comments:

  • Well, this is possible, but different implementation can expose different methods, so some methods can exist only in Java, but not in Javacript , c# or to have a different names ( example isProbablePrime, modPow, ModPo ..). So that change will require to match any specific implementation for each language or to expose only main methods which exist in all implementations. Not sure how feasible is that solution .As I mentioned for some targets the current implementation give a better results.
    However , in the current implementation are missing some important methods such as pow , modPow and etc. , so I can try to add it.

  • Yes, I should add test. Maybe after adding the missing methods .

  • Yes, I had not thought for that, but that is valid point. At the moment is possible to convert BigInt to haxe.io.Bytes , String , Vector<Int> , which give some possibilities for serialisation .

  • His permission is public and available here : https://github.com/cbatson/hxBitcoin/issues/4#issuecomment-1176441329 I'm sure he can comment too when have a good internet connectivity

flashultra avatar Jul 07 '22 11:07 flashultra

I think this pull request can be reviewed and merged as I added the missing methods ( abs, pow, modPow, isProbablePrime, getLowestSetBit and bitLength) and all the tests for them. Regarding the other questions : 1) BigInt native implementation Currently, only three targets supports BigInt type natively - Java, C# and Javascript . For the first two it contains many methods ( pow, modpow, prime ...) , but for Javascript it's just an empty class that will required other methods to be implemented using javascript's native BigInt type .I'm not sure how feasible it is since the current implementation stores data in a Vector<Int>. For C# and Java - they do not support Muttable BigInt type (this PR adds immutable and mutable types - BigInt and MutableBigInt) . It might be possible to add only muttable as native type, but maybe in the future if someone is interested in that. 2) Serializer There are several solutions for that - for example, like adding BigInt as a separate type to the Serializer class, perhaps similar to StringMap, IntMap, ObjectMap, which would require a change to the Serializer class, or simply serializing the BigInt as a class or bytes (there is a method to convert to bytes) . This may be a separate PR in the future, and as I mentioned, it is currently possible to serialize a BigInt to Bytes. 3) Input/ Output Internally BigInt keep the data in Vector<Int>. This could be ease converted to Bytes. So I think it's possible someone to get the data and use it in Input/Output methods . Another solution could be adding new methods writeBigInt / readBigInt ( with keeping the size of the Bigint ) , which will require change in Input /Output class in separate PR.

flashultra avatar Jul 16 '22 13:07 flashultra

Yes, this does indeed have my blessing. Hope it's a useful addition. 👍

Thank you @flashultra for doing this work!

cbatson avatar Jul 18 '22 05:07 cbatson

Php tests fails on another test. I think the bigint tests should pass the Php target.

flashultra avatar Jul 22 '22 19:07 flashultra

All tests are green except PHP ( and mac-cpp which failed with other reason) . For php test I tried to run on my server , but I've got the following error which is not related with BigInt. I think it's safe to be merge this pull request.

Stack trace:
#0 /var/www/lib/Array_hx.php(11): php\Boot::php\{closure}()
#1 /var/www/index.php(9): include_once('...')
#2 /var/www/lib/unit/Test.php(629): {closure}()
#3 /var/www/lib/unit/Test.php(644): unit\Test::__hx__init()
#4 /var/www/index.php(9): include_once('...')
#5 /var/www/lib/unit/TestBigInt.php(23): {closure}()
#6 /var/www/index.php(9): include_once('...')
#7 /var/www/lib/BigIntTest.php(16): {closure}()
#8 /var/www/index.php(14): BigIntTest::main()
#9 {main} in /var/www//lib/Array_hx.php on line 11
PHP Fatal error:  During inheritance of JsonSerializable: Uncaught ErrorException: Return type of Array_hx::jsonSerialize() should either be compatible with JsonSerializable::jsonSerialize(): mixed, or the #[\ReturnTypeWillChange] attribute should be used to temporarily suppress the notice in /var/www/lib/Array_hx.php:267

flashultra avatar Jul 23 '22 05:07 flashultra

The above error is when I test with PHP 8.1.
I ran a new test with PHP 7.3 and all tests passed. This PR is ready to be merged.

flashultra avatar Jul 23 '22 13:07 flashultra

Two things can be improved in the future :

  1. Better GCD algorithm (using Lehmer's GCD)
  2. Faster detection of prime numbers ( for isProbablePrime() and nextProbablePrime() methods)
  3. Faster modPow implementation

I won't be adding any more methods , so this is the final version of this PR for now.

flashultra avatar Jul 26 '22 13:07 flashultra

It would be nice to be credited in the source as the original author. But if that's not possible due to license reasons, then I'm ok with that. Thank you.

cbatson avatar Jul 31 '22 06:07 cbatson

As I'm not a main contributor in this repository I can't comment if this is possible or not.
For all new files I put the standard license available in the current source code. Maybe someone else could shed some light of that ?

flashultra avatar Jul 31 '22 08:07 flashultra

Regarding attribution, there is an example of this in some of the map implementations, e.g. IntMap on C#. I don't know if the "copyright goes to" language makes sense seeing as the library should be MIT licensed, but mentioning the original author is fine.

Also @flashultra, about native implementations: I understand that the various targets' APIs differs from the exact API you are creating with this BigInteger type. But this can be solved by an abstract, which by default just wraps a Haxe implementation (as in this PR), but on some targets wraps the native type. Do you see what I mean?

// by default
abstract BigInteger(BigIntegerImpl) {
  public function foo():Void { this.foo(); }
  public function bar():Void { this.bar(); }
}

// override for Java
abstract BigInteger(java.math.BigInteger) {
  public function foo():Void { this.someOtherNameForFoo(); }
  public function bar():Void { /* Haxe implementation of bar that uses methods of Java's BigInteger */ }
}

Anyway, this kind of thing really only makes sense if such (semi-)native implementations would outperform the Haxe version.

Aurel300 avatar Jul 31 '22 09:07 Aurel300

Thank you @Aurel300 for your comment. If @cbatson wants, I can add in each file something like /*Thanks to Charles H. Batson III , the original author of this implementation */ after the import ( similiar to IntMap) or could be something else ?

About the native implementation, there definitely make sense for Java as my implementation for modPow() is about 20 times slower than the native one , and gcd() is about 3 times slower. Maybe C# has similar performance. However some other operations are faster on the current implementation than the native one. If modPow() is improved in the future (semi-)native implementations would not be needed

flashultra avatar Jul 31 '22 10:07 flashultra

Ok thank you @Aurel300 and @flashultra. I don't want/need to claim a copyright. Something like what @flashultra proposes or even a simple "Original code courtesy Chuck Batson (github.com/cbatson)" will be very much appreciated!

cbatson avatar Jul 31 '22 23:07 cbatson

@flashultra for the php job to succeed you may need to specify -d memory_limit=-1 here:

https://github.com/HaxeFoundation/haxe/blob/614362f2e090d1ad90dd59656f6cbed77ab2c60d/tests/runci/targets/Php.hx#L88-L95

sebthom avatar Mar 22 '23 17:03 sebthom

@sebthom Thank you for your help. Cppia still fails. ( https://github.com/HaxeFoundation/haxe/actions/runs/4497491663/jobs/7913155711#step:11:2791 ) . Do you have any idea what the problem could be?

flashultra avatar Mar 23 '23 19:03 flashultra

"core dumped" sounds like a segmentation fault.

To be honest, I had no good experiences with Cppia myself. A lot of random crashes with it while things worked fine on all other targets. I gave up trying to support it in my projects.

maybe @hughsando has some ideas?

sebthom avatar Mar 23 '23 20:03 sebthom

There is some problem with CI. Since I can't cancel the CI job, please someone to cancel CI #4960 as the new one (CI #4962) has been set.

Some thoughts on this pull request. At the moment multiplication is very slow because it uses 16-bit operations. Replacing it with 32-bit requires using Int64, but that's even slower in the tests I've done Proper(native) Int64 for each target is required ( many targetd have such type - C,C++,C#, Java , JS(bigint) etc.). The new modPow() method gives a better result, but is still slow (for the reason mentioned above).

Any futher optimization will require a native Int64 type and/or Toom-Cook / Schonhage-Strassen implementation.

flashultra avatar Jan 07 '24 17:01 flashultra

CI is just completely broken at the moment it seems.

Simn avatar Jan 07 '24 17:01 Simn