
convert to and from binary

Open thestick613 opened this issue 7 years ago • 8 comments

Where did this functionality go? I'm aware of gmpy2.from_binary and gmpy2.to_binary, but I need the gmpy behavior. I want to speed up some RSA sign and verify code.

In [1]: from gmpy import mpz

In [2]: n = mpz(12345)

In [3]: n.binary()
Out[3]: '90'

In [4]: mpz('90', 256)
Out[4]: mpz(12345)

In [5]: from gmpy2 import mpz

In [6]: n = mpz(12345)

In [7]: import gmpy2

In [8]: gmpy2.to_binary(n)
Out[8]: '\x01\x0190'

In [9]: gmpy2.from_binary('\x01\x0190')
Out[9]: mpz(12345)

In [10]: mpz('90', 256)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-10-6d09b478991e> in <module>()
----> 1 mpz('90', 256)

ValueError: base for mpz() must be 0 or in the interval 2 ... 62

thestick613 avatar Sep 28 '16 13:09 thestick613
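
The gmpy output in the session above is consistent with a little-endian byte string that gains one extra zero byte whenever the top bit of the last byte would otherwise be set (12345 is 0x3039, and the bytes 0x39 0x30 are the characters '9' and '0'). Below is a minimal pure-Python sketch of that layout. It is inferred only from the examples in this thread, not from the gmpy source; the helper names are hypothetical, negative values are not handled, and the encoding chosen for zero is an assumption.

```python
def to_legacy_binary(n: int) -> bytes:
    """Encode a non-negative int in the layout the gmpy examples suggest."""
    if n < 0:
        raise ValueError("this sketch only covers non-negative values")
    # bit_length() // 8 + 1 leaves room for a zero byte when the top bit is set.
    # Assumption: zero is encoded as a single zero byte.
    length = n.bit_length() // 8 + 1
    return n.to_bytes(length, "little")


def from_legacy_binary(data: bytes) -> int:
    """Decode a little-endian byte string; trailing zero bytes do not matter."""
    return int.from_bytes(data, "little")


print(to_legacy_binary(12345))          # b'90'
print(to_legacy_binary(127))            # b'\x7f'
print(to_legacy_binary(128))            # b'\x80\x00'
print(from_legacy_binary(b"\x80\x00"))  # 128
```

The result can be passed to gmpy2.mpz() if an mpz is needed for the arithmetic that follows.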

A couple comments and then a question or two...

gmpy had a confusing (at least to me) variety of ways to convert values to/from a neutral binary format. Each type had a .binary() method. There were three different module level functions: gmpy.binary(), gmpy.qbinary(), and gmpy.fbinary(). To reconstruct a value, the program needed to already know the resultant type to call the proper factory function.

Since gmpy2 would remove the mpf type and add the mpfr and mpc types, I developed to_binary() and from_binary() to be the standard tools for serializing/deserializing the gmpy2 types.

gmpy2 still has functions that can read the old binary formats.

>>> gmpy2.mpz_from_old_binary('90')
mpz(12345)

But this won't help if you need to convert to the legacy format.

I'm trying to understand exactly where your input originates. Are you reading a sequence of bytes from a data source and then converting them to an mpz?

If yes, would access to the GMP library functions mpz_import and mpz_export work? Those functions convert arbitrary data to/from the mpz type. The data can be structured by 1, 2, 4, or 8 byte chunks, with different endian formats, etc.

I've had another request to add those two functions. Since the old gmpy binary format can be trivially handled by mpz_import/mpz_export, I'd rather add those two functions.

casevh

casevh avatar Sep 29 '16 04:09 casevh
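
To make the chunk-size and byte-order options of mpz_import and mpz_export concrete, here is a pure-Python model of their semantics. It is a sketch under stated assumptions, not gmpy2 API: the function names are hypothetical, only the `order` and `endian` values 1 and -1 are modeled (GMP's native-endian value 0 and the `nails` parameter are left out), and only non-negative values are handled.

```python
def import_words(data: bytes, size: int = 1, order: int = 1, endian: int = 1) -> int:
    """Model of mpz_import: build an int from fixed-size words."""
    if order not in (1, -1) or endian not in (1, -1):
        raise ValueError("order and endian must be 1 or -1 in this sketch")
    if len(data) % size:
        raise ValueError("data length must be a multiple of the word size")
    words = [data[i:i + size] for i in range(0, len(data), size)]
    if order == -1:
        words.reverse()  # put the most significant word first
    byteorder = "big" if endian == 1 else "little"
    result = 0
    for word in words:
        result = (result << (8 * size)) | int.from_bytes(word, byteorder)
    return result


def export_words(n: int, size: int = 1, order: int = 1, endian: int = 1) -> bytes:
    """Model of mpz_export: split a non-negative int into fixed-size words."""
    if order not in (1, -1) or endian not in (1, -1):
        raise ValueError("order and endian must be 1 or -1 in this sketch")
    if n < 0:
        raise ValueError("this sketch only covers non-negative values")
    bits = 8 * size
    count = -(-n.bit_length() // bits)  # ceiling division; zero yields no words
    byteorder = "big" if endian == 1 else "little"
    mask = (1 << bits) - 1
    words = []
    for _ in range(count):
        words.append((n & mask).to_bytes(size, byteorder))  # least significant first
        n >>= bits
    if order == 1:
        words.reverse()
    return b"".join(words)


print(import_words(b"90", size=1, order=-1))  # 12345
print(export_words(12345, size=1, order=-1))  # b'90'
```

With one-byte words and least-significant-word-first order, the model reproduces the '90' string from the opening post, which matches the remark that the old format maps directly onto these two functions.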

I just read some binary data and need it converted to an integer. I'm trying to speed up some RSA code from dkimpy, especially int2str and str2int.

thestick613 avatar Sep 29 '16 04:09 thestick613

mpz_import and mpz_export would be perfect for that use. I'll try to get them added soon.

casevh avatar Sep 29 '16 04:09 casevh
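
For reference, the conversion being asked for can be sketched with the standard library alone. This assumes the helpers are plain big-endian octet-string conversions, as is usual for RSA; that assumption has not been checked against dkimpy, and the signatures below are illustrative rather than dkimpy's own. The key is a toy textbook value used only to show the round trip.

```python
def str2int(data: bytes) -> int:
    """Interpret a byte string as a big-endian unsigned integer."""
    return int.from_bytes(data, "big")


def int2str(value: int, length: int) -> bytes:
    """Write an unsigned integer as exactly `length` big-endian bytes."""
    return value.to_bytes(length, "big")


# Toy RSA parameters (n = 61 * 53); far too small for real use.
n, e, d = 3233, 17, 2753

message = b"A"
signature = pow(str2int(message), d, n)            # "sign" with the private exponent
signature_bytes = int2str(signature, 2)            # n fits in two bytes
recovered = pow(str2int(signature_bytes), e, n)    # "verify" with the public exponent

print(int2str(recovered, 1) == message)  # True
```

The integers returned by str2int can be wrapped in gmpy2.mpz() before the modular exponentiation, which is where a speedup from gmpy2 would be expected.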

Thanks. I can't use gmpy because of some inconsistencies. I'm pretty sure gmpy isn't maintained, and the documentation is rather incomplete in places.

In [1]: import gmpy

In [2]: gmpy.mpz(127).binary()
Out[2]: '\x7f'

In [3]: gmpy.mpz(128).binary()
Out[3]: '\x80\x00'

In [4]: import gmpy2

In [5]: gmpy2.to_binary(gmpy2.mpz(127))
Out[5]: '\x01\x01\x7f'

In [6]: gmpy2.to_binary(gmpy2.mpz(128))
Out[6]: '\x01\x01\x80'

In [9]: gmpy.mpz('\x80', 256)
Out[9]: mpz(128)

In [10]: gmpy.mpz('\x80\x00', 256)
Out[10]: mpz(128)

thestick613 avatar Sep 29 '16 06:09 thestick613

Is there any progress on the aforementioned mpz inconsistencies? I still see "\x01\x01" prepended to the output of to_binary...

JasonSome avatar Jan 01 '20 16:01 JasonSome

Based on the original request, can you try the following not fully tested code?

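import gmpy2

def str2int(s):
    # pack() combines the list of 8-bit values into a single mpz
    return gmpy2.pack(list(bytearray(s, "cp437")), 8)

def int2str(n):
    # unpack() splits the mpz back into a list of 8-bit values
    return str(bytearray(gmpy2.unpack(gmpy2.mpz(n), 8)), "cp437")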

casevh avatar Feb 04 '20 05:02 casevh

All those functions, gmpy2.to_binary(x) or int2str(x), are very slow, and their output still needs post-processing to be useful: stripping the header and padding to a fixed size. Even pure Python int(x).to_bytes(32, "big") is faster. We really need a fast mpz_export that takes an output size and endianness.

sfornengo avatar Jan 30 '23 10:01 sfornengo

@sfornengo, could you try the new to_bytes() method? It should be slightly faster than the int(x).to_bytes() equivalent. E.g.:

$ python -m timeit -r11 -s 'from gmpy2 import fac;a=fac(100)' 'int(a).to_bytes(66)'
200000 loops, best of 11: 1.16 usec per loop
$ python -m timeit -r11 -s 'from gmpy2 import fac;a=fac(100)' 'a.to_bytes(66)'
500000 loops, best of 11: 759 nsec per loop

skirpichev avatar Nov 28 '23 09:11 skirpichev
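
The same comparison can be run from a script. The sketch below is hedged in two ways: it falls back to a plain int when gmpy2 is not installed, and it falls back to int(x).to_bytes() when the installed gmpy2 has no mpz.to_bytes method. The function names are hypothetical, and the timings include Python call overhead from the wrappers, so they will not match the shell numbers above.

```python
import math
import timeit

try:
    from gmpy2 import fac
    a = fac(100)
except ImportError:  # gmpy2 not installed: use a plain int instead
    a = math.factorial(100)


def via_int(x, length=66):
    """Convert to int first, then use int.to_bytes."""
    return int(x).to_bytes(length, "big")


def via_method(x, length=66):
    """Call to_bytes on the object itself when it has one."""
    method = getattr(x, "to_bytes", None)
    return method(length, "big") if method else via_int(x, length)


# Both paths should yield the same 66-byte big-endian string for 100!.
assert via_int(a) == via_method(a)

for fn in (via_int, via_method):
    best = min(timeit.repeat(lambda: fn(a), number=20_000, repeat=5))
    print(f"{fn.__name__}: {best / 20_000 * 1e9:.0f} ns per call")
```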