cadence
cadence copied to clipboard
Remove access to fields by index
Issue to be solved
Accessing fields by index (instead of by name) can create dependency on sort order.
For example, issue https://github.com/onflow/cadence/issues/2558:
Some client programs are accessing fields by index rather than name. This created a dependency on the sort order of fields which surfaced when we switched from JSON-CDC to CCF for encoding events because CCF explicitly sorts as specified in CCF specification for deterministic encodings.
PR https://github.com/onflow/cadence/pull/2559 resolved issue 2558 by updating CCF Codec to make CCF Determinisitic Encoding optional. It was a stop-gap solution to avoid abruptly breaking compatibility for encoded events.
For exported values, not allowing access to fields by index can help prevent Cadence developers from unintentionally creating dependency on sort order. This can prevent unintended consequences like blocking deterministic encoding, etc.
However, this issue can be viewed as broader topic than CCF, sort order, or access by field index.
Current Status
CCF encoding is not currently sorting the fields (in order to be compatible with old programs that access fields by index).
-
CCF specs (onflow/ccf) defines how to sort and makes it optional, so each CCF-based format can balance trade-offs by specifying which CCF options they use:
CCF defines "Deterministic CCF Encoding Requirements" and makes it optional.
-
CCF codec (onflow/cadence) offers sorting as non-default option, but sorting isn't being used (e.g. for events encoding) due to compatibility reasons mentioned in:
- #2558
- #2559
I didn't find any programs using the sort option in CCF codec (as of Nov 28, 2023).
Suggested Solution
Programming languages before 1.0 can benefit from taking away the ability for programmers to (sometimes unintentionally) create dependencies on implementation details of compilers and interpreters.
For example, a few months before Go 1.0 (~12 years ago), https://codereview.appspot.com/5285042/ added a random offset for map iteration and changed Go language specification.
Index: doc/go_spec.html
===================================================================
--- a/doc/go_spec.html
+++ b/doc/go_spec.html
@@ -4085,7 +4085,8 @@
</li>
<li>
-The iteration order over maps is not specified.
+The iteration order over maps is not specified
+and is not guaranteed to be the same from one iteration to the next.
If map entries that have not yet been reached are deleted during iteration,
the corresponding iteration values will not be produced. If map entries are
inserted during iteration, the behavior is implementation-dependent, but the
That change to Go intentionally broke some existing Go programs by taking away existing behavior that wouldn't be ideal to officially or unofficially support forever after Go 1.0 release.
In the same way, maybe we can determine if some existing features (accessing field by index) should be removed before Cadence 1.0 release.
EDIT: added Current Status section about CCF for more context.
Problem:
cadencepackage's composite value Go types, likeStruct,Resource, etc have aFieldsfield, an array of the fields' values- The order previously matched the order of how the fields are declared in the code
- CCF encoding now sorts the fields
- Developers assumed the order in the
Fieldsarray matches the order of how the fields are declared in the code, so they index directly into it using constants - However, this will break once field values are sorted
Considered:
- Restoring the ordering fixes the issue, but leaves the brittle code: Future changes to type in deployed contract might break server-side decoding based on hard-coded ordering
- Code should have performed a lookup by name in composite type' fields
StructType'sFieldsarray of name/type pairs
Proposed fix:
- Unexport
Fieldsin composite value types (Struct,Resource, etc.) - Introduce field value lookup function
FieldByName(name string) Valuefor composite value types - The helps developers write less brittle code (removes assumption about order of fields in deployed code)
Doesn't this lose information about ordering? Let's say it is an event signature from past.
@fxamacker I assume both values and type information will be re-ordered?
@fxamacker I assume both values and type information will be re-ordered?
@turbolent Yes. Both would be re-ordered, and maybe both should be unexported.
@bluesign when would the order be required to know? Like I mentioned in my last comment, code that relies on the order is brittle anyways and should probably be updated
@bluesign when would the order be required to know? Like I mentioned in my last comment, code that relies on the order is brittle anyways and should probably be updated
Yeah I agree on that part, but we had similar problem with event encoding order before. I think latest victim was @bjartek he can chime in.
But main problem is; if we make old code ( pre 1.0 ) compatible with new deterministic ccf, then there is a risk of breaking silently. If it breaks with new encoding with some error etc, it is ok I guess.
PS: I still have a feeling order of things carry a meaning ( in event signatures for example ) maybe I am thinking too much objective-c like in my head.
Pre candidate7 the order of event field names was done alphabetically and not in definition order. Luckily it obly affected topshot and i made a manual fix
Not sure if that last comment is relevant to this thread though. Ss long as the name of fields and content of fields are ordered correctly so they match I have no issue with it.
I should have been more clear in my comment, I think order still has importance but ignoring that my main issue here is, if we have data in 2 versions ( non-sorted D0 and sorted D1 ) and functions to access ( F0 and F1 )
- F1(D0) and F1(D1) will be ok, F0(D0) will be still ok ( as it is currently ), but the problem is F0(D1) will fail silently. ( that's why I gave example of old events in candidate7, they fail at some time silently too )
I think this is important thing to consider as Cadence is not backwards compatible, while accessing old data we need matching Cadence version.
But main problem is; if we make old code ( pre 1.0 ) compatible with new deterministic ccf, then there is a risk of breaking silently. If it breaks with new encoding with some error etc, it is ok I guess.
The plan isn't to make pre-1.0 compatible with new behaviour. Just like before, when interacting with the chain or its data, the appropriate Cadence version must be used (I know this isn't ideal, just pointing out the fact).
Doesn't this lose information about ordering? [...]
PS: I still have a feeling order of things carry a meaning ( in event signatures for example ) [...]
Events are just composites, so their field order is insignificant. We're probably not pointing that out sufficiently in documentation.
What probably adds to the assumption that event fields have some sort of order is that the syntax for event declarations is different from other composite declarations, and thus kind of suggests an order like in a parameter list.
Hyrum's law strikes again. This post from a couple days ago is very relevant: https://twitter.com/jimmykoppel/status/1744544779138978006
I think "events are composites" is a bit unfair statement especially with the last link you gave (which focuses on observability)
"Events are composites" is not observable to the user of Cadence, basically it is an implementation detail. Alternatively they could be implemented as non cadence type even. You cannot instantiate in normal cases without emit, or save them etc.
As I pointed before, from outside perspective, they are function signatures ( constructor more likely ) so order of things matter ( as we don't have named arguments )
This is not so important ( in this case ) of course.
@fxamacker is this purely related to CCF or also an problem with the introduction of the atree register inlining?
@fxamacker Can you please verify that #3290 is sufficient? Or do we also need to unexport field types?
@fxamacker Can you please verify that https://github.com/onflow/cadence/pull/3290 is sufficient? Or do we also need to unexport field types?
@turbolent I can probably take a look tomorrow at CCF after looking more closely into migration issue https://github.com/onflow/cadence/issues/3288 while it is still fresh.
BTW, I'm not sure if everyone already knows about CCF codec support for different "modes" at runtime. A single program can support different CCF-based formats, versions, etc. by using modes. A codec's mode is a set of encoding or decoding options made immutable during startup for concurrent use later.
// Programs can create and choose different CCF encoding modes (and decoding modes) at runtime.
// CCF modes are safe for concurrent use (e.g. create modes at startup and reuse from multiple goroutines).
if fooVersion == 0 {
// Keep original sort order of fields provided by Cadence.
b, err := ccfFoofUnsortedMode.Marshal(v)
} else if foolVersion == 1 {
// Sorts fields as specified in CCF Deterministic Encoding Requirements.
b, err := ccfFooSortedMode.Marshal(v)
} // else if etc.
CCF codec modes work like CBOR codec modes described at fxamacker/cbor Quick Start .
@fxamacker Can you please verify that https://github.com/onflow/cadence/pull/3290 is sufficient? Or do we also need to unexport field types?
@turbolent I think we should also unexport field types for consistency with field values.
We can unexport field type and provide a new function FieldTypes() that returns map[string]Type. This way, there isn't any implied ordering for either field definitions or field values. It also gives us flexibility to change implementation details without breaking API.
@fxamacker Sounds good, I was thinking the same. I'll work on that tomorrow
@turbolent BTW, in PR #3290 it seems FieldsMappedByName() includes attachment as a field if present, while SearchFieldByName() only looks up field type definition. Is it OK for these two functions to differ in this way?
Sorry about reposting this comment :pray: I accidentally deleted it from merged PR 3290 and moved it here.
@fxamacker Sounds good! I had started trying to change this, but it looks like the CCF codec relies on being able to set the type slice. Maybe we can have a short sync (not urgent, sometime next or the following week), to see how we could refactor this?