
On-chain function to convert from Integer to ByteString, or to parse ByteString into Integer

Open longngn opened this issue 2 years ago • 15 comments

Area

  • [x] Plutus Foundation Related to the GHC plugin, Haskell-to-Plutus compiler, on-chain code
  • [ ] Plutus Application Framework Related to the Plutus application backend (PAB), emulator, Plutus libraries
  • [ ] Marlowe Related to Marlowe
  • [ ] Other Any other topic (Playgrounds, etc.)

Describe the feature you'd like

  • On-chain function to convert from Integer (or IsData instance) to ByteString
  • On-chain function to parse ByteString into Integer (or IsData instance)

Describe alternatives you've considered

A clear and concise description of any alternative solutions or features you've considered.

Additional context / screenshots

Add any other context or screenshots about the feature request here.

longngn avatar Jul 31 '21 09:07 longngn

Hi, thanks @longngn. This is actually already being worked on to allow for Marlowe scalability improvements. I'll link a PR once there's one ready.

catch-21 avatar Aug 02 '21 15:08 catch-21

> Hi, thanks @longngn. This is actually already being worked on to allow for Marlowe scalability improvements. I'll link a PR once there's one ready.

Thanks, James, for working on this feature! Will it only serialize/deserialize Integer, or any arbitrary IsData instance?

longngn avatar Aug 04 '21 13:08 longngn

This might also fix #4168.

Is there any update on this?

L-as avatar Oct 29 '21 14:10 L-as

Actually, I've written an Integer -> BS function for our internal contracts. I'm happy to write the reverse direction (BS -> Integer), add property-based tests, and create a PR.

-- Convert from an integer to its text representation. Example: 123 => "123"
{-# INLINEABLE integerToBS #-}
integerToBS :: Integer -> BuiltinByteString
integerToBS x
  -- 45 is ASCII code for '-'
  | x < 0 = consByteString 45 $ integerToBS (negate x)
  -- x is single-digit
  | x `quotient` 10 == 0 = digitToBS x
  | otherwise = integerToBS (x `quotient` 10) <> digitToBS (x `remainder` 10)
  where
    digitToBS :: Integer -> BuiltinByteString
    -- 48 is ASCII code for '0'
    digitToBS d = consByteString (d + 48) emptyByteString
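
A minimal sketch of the reverse direction (BS -> Integer) mentioned above. It assumes plain off-chain Haskell with Data.ByteString rather than the Plutus builtins, and the name bsToInteger is hypothetical; an on-chain version would need the corresponding builtin operations instead.

```haskell
import qualified Data.ByteString as BS
import Data.Word (Word8)

-- | Parse an optionally negative ASCII decimal number, e.g. "-123" => Just (-123).
-- Returns Nothing on empty input, a lone '-', or any non-digit byte.
-- Assumption: plain ByteString, not BuiltinByteString.
bsToInteger :: BS.ByteString -> Maybe Integer
bsToInteger bs = case BS.uncons bs of
  Just (45, rest) -> negate <$> digits rest -- 45 is ASCII code for '-'
  _               -> digits bs
  where
    digits :: BS.ByteString -> Maybe Integer
    digits ds
      | BS.null ds        = Nothing
      | BS.all isDigit ds = Just (BS.foldl' step 0 ds)
      | otherwise         = Nothing

    -- 48..57 are the ASCII codes for '0'..'9'
    isDigit :: Word8 -> Bool
    isDigit w = w >= 48 && w <= 57

    -- Shift the accumulator one decimal place and add the next digit.
    step :: Integer -> Word8 -> Integer
    step acc w = acc * 10 + fromIntegral (w - 48)
```

Returning Maybe makes malformed input explicit; a script that prefers to fail outright could pattern match on Nothing and error instead.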

longngn avatar Oct 29 '21 15:10 longngn

Thanks! Ah, you meant ASCII? I thought this issue was for getting a binary representation of it.

L-as avatar Oct 29 '21 15:10 L-as

+1 for binary representation as we're trying to enable efficient cryptographic behaviors.

Benjmhart avatar Nov 25 '21 14:11 Benjmhart

> Thanks! Ah, you meant ASCII? I thought this issue was for getting a binary representation of it.

As you point out here, there is no unique Integer -> ByteString mapping: you have to pick an encoding, so any builtin we added would be tied to one particular encoding.

michaelpj avatar Nov 25 '21 15:11 michaelpj

Isn't there only one obvious one? I.e. a base 256 representation.

L-as avatar Nov 25 '21 19:11 L-as

No, there are many:

  • Big/little-endian
  • MSB/LSB-first
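
To make the byte-order choice concrete, here is a minimal sketch of a base-256 encoding in both orders. It assumes plain off-chain Haskell with Data.ByteString, covers non-negative integers only, and leaves bit order within a byte at the usual convention; all function names are hypothetical.

```haskell
import qualified Data.ByteString as BS
import Data.Word (Word8)

-- | Base-256 digits of a non-negative integer, least significant byte first.
-- Zero encodes as the empty byte string; negative input is rejected.
toLittleEndian :: Integer -> Maybe BS.ByteString
toLittleEndian n
  | n < 0     = Nothing
  | otherwise = Just (BS.unfoldr step n)
  where
    step :: Integer -> Maybe (Word8, Integer)
    step 0 = Nothing
    step k = Just (fromIntegral (k `rem` 256), k `quot` 256)

-- | Same digits, most significant byte first.
toBigEndian :: Integer -> Maybe BS.ByteString
toBigEndian = fmap BS.reverse . toLittleEndian

fromLittleEndian :: BS.ByteString -> Integer
fromLittleEndian = BS.foldr (\w acc -> acc * 256 + fromIntegral w) 0

fromBigEndian :: BS.ByteString -> Integer
fromBigEndian = BS.foldl' (\acc w -> acc * 256 + fromIntegral w) 0
```

For example, 258 (0x0102) becomes the bytes [2, 1] in little-endian order and [1, 2] in big-endian order, so the two sides of any exchange have to agree on the choice.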

michaelpj avatar Nov 26 '21 09:11 michaelpj

Little-endian is probably best, since that's what most of us are used to dealing with.

L-as avatar Nov 26 '21 17:11 L-as

Perhaps we can use ZigZag encoding to encode variable-length signed integers, as in Plutus Core. Base-256 is only possible for unsigned integers.
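
A minimal sketch of the ZigZag idea, assuming plain off-chain Haskell with Data.ByteString. The 7-bit continuation-bit layout used for the variable-length part is an assumption for illustration and is not claimed to match Plutus Core's exact wire format; all names are hypothetical.

```haskell
import qualified Data.ByteString as BS
import Data.Bits ((.&.), (.|.))
import Data.Word (Word8)

-- | Map signed integers onto non-negative ones: 0, -1, 1, -2, ... => 0, 1, 2, 3, ...
zigZag :: Integer -> Integer
zigZag n
  | n >= 0    = 2 * n
  | otherwise = negate (2 * n) - 1

-- | Inverse of zigZag for non-negative input.
unZigZag :: Integer -> Integer
unZigZag m
  | even m    = m `quot` 2
  | otherwise = negate ((m + 1) `quot` 2)

-- | Encode a non-negative integer in 7-bit groups, least significant first,
-- setting the high bit on every byte except the last.
encodeVarint :: Integer -> BS.ByteString
encodeVarint = BS.pack . go
  where
    go :: Integer -> [Word8]
    go k
      | k < 128   = [fromIntegral k]
      | otherwise = (fromIntegral (k `rem` 128) .|. 0x80) : go (k `quot` 128)

-- | Decode the layout above. Assumes well-formed input: continuation bits
-- are masked off rather than validated.
decodeVarint :: BS.ByteString -> Integer
decodeVarint = BS.foldr (\w acc -> acc * 128 + fromIntegral (w .&. 0x7f)) 0

encodeSigned :: Integer -> BS.ByteString
encodeSigned = encodeVarint . zigZag

decodeSigned :: BS.ByteString -> Integer
decodeSigned = unZigZag . decodeVarint
```

Small magnitudes of either sign stay short under this scheme: -1 fits in a single byte, whereas a fixed-width two's-complement layout would need a width chosen up front.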

longngn avatar Nov 26 '21 18:11 longngn

> Describe the feature you'd like
>
> On-chain function to convert from Integer (or IsData instance) to ByteString

We now have

  1. serialiseData :: BuiltinData -> BuiltinByteString
  2. a Show Builtins.Integer instance that you can use together with encodeUtf8

I don't know if we have any parsing capabilities.

@zliu41 do you happen to have any input on this issue?

effectfully avatar Feb 22 '23 20:02 effectfully

Requests for adding new builtin functions should be discussed in https://github.com/cardano-foundation/CIPs.

For Plutus Tx library functions, my main concern is that converting integers to/from bytestrings without using new builtins, regardless of encoding, can be quite expensive (perhaps except serialiseData . mkI). I don't want to add a library function and make users think that they can just use it casually in their scripts (we do have the Show class, but it is intended for debugging). If we make it clear it is expensive, maybe it is OK.

zliu41 avatar Feb 22 '23 22:02 zliu41

To iterate on what @michaelpj and @zliu41 said, the choices that we have are:

  1. add new builtins (tailored ones, or just deserialiseData). This should be done via https://github.com/cardano-foundation/CIPs
  2. implement encoding/decoding as regular functions by making arbitrary choices regarding the encoding. Those functions aren't going to be efficient, and we're worried about having to handle complaints, optimize the functions and generally spend a lot of time on it. In particular, just looking at @longngn's implementation above (thank you @longngn for providing it!), I can spot a number of performance issues right away: lack of tail recursion; quadratic behavior due to <> and consByteString being linear (which we probably can't avoid); x `quotient` 10 computed twice; x < 0 unnecessarily checked at each step; maybe something else. On top of that, this direct encoding is nicely visual but not compact at all.
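
A minimal sketch of how the structural issues listed above might be addressed: the sign is checked once, each digit costs a single quotRem, and digits are collected in a tail-recursive accumulator. This assumes plain off-chain Haskell with Data.ByteString, and the name integerToBS' is hypothetical. It does not address the cost of the on-chain builtins: an on-chain version would still have to build the result with consByteString, which stays linear per call.

```haskell
import qualified Data.ByteString as BS
import Data.Word (Word8)

-- | Convert an integer to its ASCII decimal representation, e.g. 123 => "123".
-- Assumption: plain ByteString, not BuiltinByteString.
integerToBS' :: Integer -> BS.ByteString
integerToBS' x
  | x < 0     = BS.pack (45 : digits (negate x) []) -- 45 is '-'; sign handled once
  | otherwise = BS.pack (digits x [])
  where
    -- Tail-recursive: peels off the least significant digit and prepends it
    -- to the accumulator, so the final list is most significant digit first.
    digits :: Integer -> [Word8] -> [Word8]
    digits n acc =
      let (q, r) = n `quotRem` 10
          acc'   = fromIntegral (r + 48) : acc -- 48 is ASCII code for '0'
      in if q == 0 then acc' else digits q acc'
```

The output format is unchanged from the original function, so the result is still one byte per decimal digit and no more compact than before.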

So, given the lack of consensus and what appears to be a low-priority problem, I'm going to label the issue with "Low priority".

effectfully avatar Jun 20 '23 19:06 effectfully

> or just deserialiseData

Incidentally, here is some context regarding why we don't have this builtin. Credit for digging that out goes to @kwxm.

effectfully avatar Jun 20 '23 19:06 effectfully