plutus
plutus copied to clipboard
BuiltinByteString isn't opaque enough
Currently,
newtype BuiltinByteString = BuiltinByteString ByteString
The problem is that any user-written function that is strict in a BuiltinByteString (because of a bang pattern, seq, a strict fold, etc.) is likely to have the worker/wrapper transformation applied to it, tearing the ByteString into its constituents, which the Plutus compiler can't understand.
I believe the right solution is to make BuiltinByteString a datatype instead of a newtype:
data BuiltinByteString = BuiltinByteString ByteString
GHC might very well unpack the BuiltinByteString, but as long as it doesn't know that functions are strict in the underlying ByteString, it won't unpack that, and it should remain available to the Plutus compiler.
Aha, thanks, this is useful. I've been suggesting that people use -fno-strictness when WW rips up bytestrings, but this might help us avoid needing that so much.
I've started work on this, but I have a ways to go. To be really confident, I need to call GHC.Exts.lazy on each extracted ByteString.
To be really confident, I need to call GHC.Exts.lazy on each extracted ByteString.
Hmm, that sounds like it's going to be a pain. Where would we need to put such annotations?
It's not that bad. Just things like
BuiltinByteString (lazy -> bs1) <> BuiltinByteString (lazy -> bs2) = BuiltinByteString $! bs1 <> bs2
-- or
BuiltinByteString bs1 <> BuiltinByteString bs2 = BuiltinByteString $! lazy bs1 <> lazy bs2
I like the first way better because it's harder to make mistakes.
Hmm okay. In fact the only place we should need to do that is in the "fake" implementations of the builtin operations in Builtins.Internal. Any other code that looks inside BuiltinBytestring simply isn't usable on-chain (and those functions should get replaced anyway - the main point would be to prevent GHC from trying to get clever with them).
I also tried
newtype BuiltinByteString = UnsafeBuiltinByteString Any
approach, and it didn't work as then plutus-tx-compiler found Any instead of Addr#.
That would allowed keeping the representation the same.
That said, few comments:
A pattern synonym would make lazy stuff simpler.
data BuiltinByteString = BuiltinByteString' ByteString
pattern BuiltinByteString :: ByteString -> BuiltinByteString
pattern BuiltinByteString x <- (lazyUnwrap -> x)
where BuiltinByteString x = strictWrap x
lazyUnwrap :: BuiltinByteString -> ByteString
lazyUnwrap (BuiltinByteString' (lazy -> bs) = bs
strictWrap :: ByteString -> BuiltinByteString
strictWrap !bs = BuiltinByteString' bs
I noticed that Generic instance is used for NoThunks instance,
I don't like that. Writing NoThunks instance manually would be better.
(There's not much to write in this case).
While making this mistake, I got errors like:
src/Plutus/ChainIndex/Handlers.hs:140:22: error:
Not in scope: data constructor ‘BuiltinByteString’
Perhaps you want to add ‘BuiltinByteString’ to the import list
in the import of ‘PlutusTx.Builtins.Internal’
(src/Plutus/ChainIndex/Handlers.hs:63:1-79).
|
140 | getTxFromTxId (TxId (BuiltinByteString txId)) =
|
which makes me uneasy. BuiltinByteString seems to be in "core" datatypes
used by something where efficiency probably matters. Having an additional
box doesn't feel great. But I don't have a better option,
except Backpack to duplicate these types.
That might be very bad if Integer has to be wrapped too for the similar
reasons.
If I disable a worker wrapper transformation, I get
{ Data.ByteString.Internal.PS _ [Occ=Dead] _ [Occ=Dead]
_ [Occ=Dead] _ [Occ=Dead] ->
case matches. It would be an improvement if plutus-tx-plugin would figure out that bindings are unused (there is occurence hint!), so there is no need to compile ByteString data-type, nor its field types.
It would be nice if worker-wrapper could be disabled just for ByteString, but I don't think that GHC has such precise knob.
EDIT: and it seems that in some scenarios disabling worker wrapper transformation actually makes PLC smaller, or the increase in size is small-ish.
@phadej, I don't trust that Any technique; if GHC sees that it's coercing X -> Any -> X it'll simplify the coercion away. Using an explicit box and being lazy with its contents should be quite reliable; the outer box can be removed by worker/wrapper. The inner box should only affect off-chain code, and it shouldn't be too bad.
if GHC sees that it's coercing X -> Any -> X it'll simplify the coercion away.
Isn't that good, kind of what a successful worker-wrapper transformation would do?
But it doesn't matter, as it doesn't work.
If I can prevent the ByteString from being torn apart by worker-wrapper, what might I have to change in the Plutus compiler to let it work with either a BuiltinByteString or a ByteString? Or will that be hard? Here's another option that hopefully doesn't have that issue:
data BuiltinByteString
= BuiltinByteString {-# UNPACK #-} !ByteString
| Never !Void
To avoid fragility, we'll have to use -fno-worker-wrapper and/or -fno-cpr-anal in the Builtins module, but that's not nearly as bad as what we do now. I'm not 100% sure that the strict Void argument can't let GHC mess with us across module boundaries, but I haven't been able to get it to do so, even with aggressive constructor specialization.
We moved away from using ByteString precisely to make it harder for people to use functions on ByteStrings that won't work, and then they get horrible errors. We want an opaque type. But we also want to be able to run the functions that get compiled on-chain off-chain as well, which is why we do back it by a real ByteString and provide implementations of the primitives.
I've lost track a little bit, what went wrong with just making it data?
@michaelpj, the tricky thing about just making it data is that we'd then have to be prepared for the possibility that the constructor is unwrapped, and also the possibility that the constructor is not unwrapped. I just don't know how hard that will be to deal with. I'm about a minute from opening a PR that does it the sum type way. It'll be nice to see if that builds properly.
I also have a version that tries to do the "just change to data" thing, but I am less optimistic that that'll work without some modification to the Plutus compiler. I can upload it anyway if you like.
https://github.com/input-output-hk/plutus/pull/4393 aimed to fix this but still seeing problem in Vasil version https://github.com/input-output-hk/plutus/commit/a56c96598b4b25c9e28215214d25189331087244
Run cabal build with this broken branch to recreate issue when using !TokenName
See this StackExchange question for the compilation error.
Idea: What if we do
-- Maybe better if open? That way GHC might think it might have an instance?
type family Opaque :: Type -> Type where
{-# OPAQUE toOpaque #-}
toOpaque :: a -> Opaque a
toOpaque = unsafeCoerce
{-# OPAQUE fromOpaque #-}
fromOpaque :: Opaque a -> a
fromOpaque = unsafeCoerce
type BuiltinByteString = Opaque ByteString
Would this fix the issue?
@L-as
Idea: What if we do
That is more or less what I proposed with Any
(EDIT: you want a newtype wrapper so you can write instances)
I don't think we'd run into the same error, because the compiler would see Opaque ByteString instead of BuiltinByteString, which is usable while Any is not.
... but you want be able to write any instances for a type family application... you need a (newtype) wrapper.
You just do newtype BuiltinByteString = BuiltinByteString (Opaque ByteString) as you noted yourself? I don't see the issue.
@L-as and then you essentially get newtype BuiltinByteString = BuiltinByteString Any, though different "Any" for different builtin types. Maybe there's merit in that, but all e.g. @treeowl worries are still present (which I don't understand myself).
I caught the same error as described in the StackExchange question already linked above which seems related to this issue because of using exclamation signs (strict declaration) in my types.
I used cabal repl instead of cabal build which works fine with no issues and lets me write the script file. Shouldn't it be the same in sense of strictness?
Used plutus-apps tag "v1.0.0-alpha1"
Just an observation maybe it can help solve this issue.
I can implement my suggestion if you want @michaelpj
Is there a fix or workaround for this issue? I'm utilizing plutus-apps v1.0.0 tag.
@michaelpj what is the status of this issue? It seems like it's been bugging people for more than a year now, should we ask someone to take a deep look at it and at least summarize the options that we have?
Yes, it would be good for someone to take another look. I don't understand why our current solution isn't working and I'm reluctant to do something that makes it harder to work with :|
@michaelpj
Yes, it would be good for someone to take another look. I don't understand why our current solution isn't working and I'm reluctant to do something that makes it harder to work with :|
I've created a Jira ticket for that (PLT-5534).
@Fiftyw3bs
Is there a fix or workaround for this issue? I'm utilizing plutus-apps v1.0.0 tag.
Do you happen to have an MRE? I see that we have one linked from Stack Exchange, but not entirely sure if it's still reproduces the problem.
At the very least we should replace all the NOINLINE pragmas in the Plutus Tx builtins code with OPAQUE ones as that is a way more reliable way to prevent GHC from exposing the internals of a definition.