fix(gnovm): allow multiple NaN keys in map, alternative
This PR fixes #3867. It's an alternative to #3932.
- The method TypedValue.PrimitiveBytes is modified to PrimitiveMapKey and made more efficient using appends; this rename is done to make explicit that there are caveats used for map key computation.
- Floating point values which are
NaNare instead encoded as a 7-byte increasing counter; given thatComputeMapKeyis used to calculate the key forvmap, which is an unexported field which is not decoded from an existing map, it is existing to have this counter that is re-initialized at each startup (similarly to how we use uintptr's to compare pointers in maps, which would be non-determinstic but is safe in this context). - This guarantees that two map keys with NaN's may only be equal after cycling through $2^{56}$ values, which I don't see as likely.
- Negative 0 is now also normalized to
0, matching go's behaviour. - Map keys are now re-assigned when using
GetPointerForKeyon an existing key, in order to match Go's behaviour. - Like 3932, it doesn't add further complexity to map handling in all other cases.
- Unlike 3932, it detects NaN using the
softfloatpackage, rather than using hardware FP instructions, ensuring cross-architecture determinism.
Context (from this comment):
I made this as an alternative to [marc's] PR.
A few reasons:
- This still uses randomness, which is going to be non-deterministic across nodes and will create problems in the long run
- Also given that the set of possible nan values for a float 32 is ~8 million, which is very small.
Some nodes will hit the random limit condition sooner than the others. I don't think it's safe to have this.
I thought a second about how we can implement this; I think it can be done without having to change MapKey; just changing when we assign to
vmap; this way, there is only an additional function call + if for map assignments and for map deletions, while iteration and access doesn't require it.IMO, my implementation is simpler and more straightforward.
There is one problem where I think you can have good ideas / insights: https://github.com/gnolang/gno/pull/4114#issuecomment-2797529537 - it looks like for negative zero, behaviour is different between ours vs go's impl. Can you take a look?
🛠 PR Checks Summary
All Automated Checks passed. ✅
Manual Checks (for Reviewers):
- [ ] IGNORE the bot requirements for this PR (force green CI check)
Read More
🤖 This bot helps streamline PR reviews by verifying automated checks and providing guidance for contributors and reviewers.
✅ Automated Checks (for Contributors):
🟢 Maintainers must be able to edit this pull request (more info)
☑️ Contributor Actions:
- Fix any issues flagged by automated checks.
- Follow the Contributor Checklist to ensure your PR is ready for review.
- Add new tests, or document why they are unnecessary.
- Provide clear examples/screenshots, if necessary.
- Update documentation, if required.
- Ensure no breaking changes, or include
BREAKING CHANGEnotes. - Link related issues/PRs, where applicable.
☑️ Reviewer Actions:
- Complete manual checks for the PR, including the guidelines and additional checks if applicable.
📚 Resources:
Debug
Automated Checks
Maintainers must be able to edit this pull request (more info)
If
🟢 Condition met └── 🟢 And ├── 🟢 The base branch matches this pattern: ^master$ └── 🟢 The pull request was created from a fork (head branch repo: thehowl/gno)Then
🟢 Requirement satisfied └── 🟢 Maintainer can modify this pull requestManual Checks
**IGNORE** the bot requirements for this PR (force green CI check)
If
🟢 Condition met └── 🟢 On every pull requestCan be checked by
- Any user with comment edit permission
Codecov Report
:white_check_mark: All modified and coverable lines are covered by tests.
:loudspeaker: Thoughts on this report? Let us know!
There is a possible problem: this behaves differently in Go and Gno...
// You can edit this code!
// Click here and start typing.
package main
import (
"fmt"
"math"
)
func main() {
n0 := math.Float64frombits(1 << 63)
fmt.Println(n0)
m := map[float64]string{}
m[0] = "hey"
m[n0] = "world"
fmt.Println(m)
}
// Output:
// -0
// map[0:world]
In Go, the key that is used is actually the last: https://go.dev/play/p/K--afH1wh0L -- so the result is map[-0:world].
ALSO realised this doesn't work for ie. a struct with a float... we probably still need some modification to ComputeMapKey
This is now ready for review - I also fixed the other mentioned issues relating to maps.
Since you can't delete a NaN key I presume, couldn't we solve this with minimal changes? There's already a List, so the insertion order is what would make it deterministic?
like
diff --git a/gnovm/pkg/gnolang/values.go b/gnovm/pkg/gnolang/values.go
index 04c0ce916..a4fc3b529 100644
--- a/gnovm/pkg/gnolang/values.go
+++ b/gnovm/pkg/gnolang/values.go
@@ -723,6 +723,7 @@ func (mv *MapValue) GetLength() int {
// GetPointerForKey is only used for assignment, so the key
// is not returned as part of the pointer, and TV is not filled.
func (mv *MapValue) GetPointerForKey(alloc *Allocator, store Store, key *TypedValue) PointerValue {
+ // If NaN, instead of computing map key, just append to List.
kmk := key.ComputeMapKey(store, false)
if mli, ok := mv.vmap[kmk]; ok {
return PointerValue{
@@ -743,6 +744,7 @@ func (mv *MapValue) GetPointerForKey(alloc *Allocator, store Store, key *TypedVa
// Like GetPointerForKey, but does not create a slot if key
// doesn't exist.
func (mv *MapValue) GetValueForKey(store Store, key *TypedValue) (val TypedValue, ok bool) {
+ // If key is NaN, return default
kmk := key.ComputeMapKey(store, false)
if mli, exists := mv.vmap[kmk]; exists {
fillValueTV(store, &mli.Value)
@@ -752,6 +754,7 @@ func (mv *MapValue) GetValueForKey(store Store, key *TypedValue) (val TypedValue
}
func (mv *MapValue) DeleteForKey(store Store, key *TypedValue) {
+ // if key is NaN, do nothing.
kmk := key.ComputeMapKey(store, false)
if mli, ok := mv.vmap[kmk]; ok {
mv.List.Remove(mli)
I replied to this review on Signal a while back, cross-posting here:
Text-based version, for searchability
The idea behind the PR was to follow in the footsteps of what Marc did, which I think largely makes sense
And that is not to change the code path, ie. adding conditionals or slowing down code anyway, for the 99.9999% of cases that don't have NaNs
Also keep in mind any solution has to work not just for map[float64]V, but also for map[struct{k float64}]V
So you need a "deep" nan check for those cases, which is potentially expensive for large structs. You can make this cheaper by returning a second param in ComputeMapKey, but again, this makes code more complex to account for the 1/1000000th case where there is a nan
because then you need to add a check for the second param each time
// If NaN, instead of computing map key, just append to List.
Thanks, that makes sense. Having ComputeMapKey return a second bool isn't bad, see below: The benefit of this is that it's clear that anything with NaN doesn't have a map key. It's slightly slower than otherwise, but map operations are slow anyways so it's negligible. And your optimization of passing in bz can still be utilized, but would rather have it on top of the following...
(I like your solution, but really, the slowdown here is neglibible compared to map operations already! so why not go for the more perfect solution without mental overhead? imagine somebody having to understand the nanCounter vs seeing ComputeMapKey return false.)
diff --git a/gnovm/pkg/gnolang/realm.go b/gnovm/pkg/gnolang/realm.go
index 4f1626a66..71429c726 100644
--- a/gnovm/pkg/gnolang/realm.go
+++ b/gnovm/pkg/gnolang/realm.go
@@ -1434,8 +1434,10 @@ func fillTypesOfValue(store Store, val Value) Value {
for cur := cv.List.Head; cur != nil; cur = cur.Next {
fillTypesTV(store, &cur.Key)
fillTypesTV(store, &cur.Value)
-
- cv.vmap[cur.Key.ComputeMapKey(store, false)] = cur
+ mkey, nan := cur.Key.ComputeMapKey(store, false)
+ if !nan {
+ cv.vmap[mkey] = cur
+ }
}
return cv
case TypeValue:
diff --git a/gnovm/pkg/gnolang/values.go b/gnovm/pkg/gnolang/values.go
index a4fc3b529..73be21ceb 100644
--- a/gnovm/pkg/gnolang/values.go
+++ b/gnovm/pkg/gnolang/values.go
@@ -724,7 +724,8 @@ func (mv *MapValue) GetLength() int {
// is not returned as part of the pointer, and TV is not filled.
func (mv *MapValue) GetPointerForKey(alloc *Allocator, store Store, key *TypedValue) PointerValue {
// If NaN, instead of computing map key, just append to List.
- kmk := key.ComputeMapKey(store, false)
+ kmk, nan := key.ComputeMapKey(store, false)
+ if !nan {
if mli, ok := mv.vmap[kmk]; ok {
return PointerValue{
TV: fillValueTV(store, &mli.Value),
@@ -732,6 +733,7 @@ func (mv *MapValue) GetPointerForKey(alloc *Allocator, store Store, key *TypedVa
Index: PointerIndexMap,
}
}
+ }
mli := mv.List.Append(alloc, *key)
mv.vmap[kmk] = mli
return PointerValue{
@@ -745,7 +747,7 @@ func (mv *MapValue) GetPointerForKey(alloc *Allocator, store Store, key *TypedVa
// doesn't exist.
func (mv *MapValue) GetValueForKey(store Store, key *TypedValue) (val TypedValue, ok bool) {
// If key is NaN, return default
- kmk := key.ComputeMapKey(store, false)
+ kmk, nan := key.ComputeMapKey(store, false)
if mli, exists := mv.vmap[kmk]; exists {
fillValueTV(store, &mli.Value)
val, ok = mli.Value, true
@@ -755,7 +757,7 @@ func (mv *MapValue) GetValueForKey(store Store, key *TypedValue) (val TypedValue
func (mv *MapValue) DeleteForKey(store Store, key *TypedValue) {
// if key is NaN, do nothing.
- kmk := key.ComputeMapKey(store, false)
+ kmk, nan := key.ComputeMapKey(store, false)
if mli, ok := mv.vmap[kmk]; ok {
mv.List.Remove(mli)
delete(mv.vmap, kmk)
@@ -990,6 +992,19 @@ func (tv *TypedValue) IsNilInterface() bool {
return false
}
+func (tv *TypedValue) IsNaN() bool {
+ switch tv.T.Kind() {
+ case Float32Kind:
+ _, _, _, _, nan := softfloat.Funpack32(tv.GetFloat32())
+ return nan
+ case Float64Kind:
+ _, _, _, _, nan := softfloat.Funpack64(tv.GetFloat64())
+ return nan
+ default:
+ return false
+ }
+}
+
func (tv *TypedValue) HasKind(k Kind) bool {
if tv.T == nil {
return false
@@ -1520,7 +1535,7 @@ func (tv *TypedValue) AssertNonNegative(msg string) {
}
}
-func (tv *TypedValue) ComputeMapKey(store Store, omitType bool) MapKey {
+func (tv *TypedValue) ComputeMapKey(store Store, omitType bool) (mkey MapKey, nan bool){
// Special case when nil: has no separator.
if tv.T == nil {
if debug {
@@ -1528,7 +1543,7 @@ func (tv *TypedValue) ComputeMapKey(store Store, omitType bool) MapKey {
panic("should not happen")
}
}
- return MapKey(nilStr)
+ return MapKey(nilStr), false
}
// General case.
bz := make([]byte, 0, 64)
@@ -1538,6 +1553,9 @@ func (tv *TypedValue) ComputeMapKey(store Store, omitType bool) MapKey {
}
switch bt := baseOf(tv.T).(type) {
case PrimitiveType:
+ if tv.IsNaN() {
+ return MapKey{}, true
+ }
pbz := tv.PrimitiveBytes()
bz = append(bz, pbz...)
case *PointerType:
@@ -1554,7 +1572,11 @@ func (tv *TypedValue) ComputeMapKey(store Store, omitType bool) MapKey {
omitTypes := bt.Elem().Kind() != InterfaceKind
for i := range al {
ev := fillValueTV(store, &av.List[i])
- bz = append(bz, ev.ComputeMapKey(store, omitTypes)...)
+ mkey, nan := ev.ComputeMapKey(store, omitTypes)
+ if nan {
+ return MapKey{}, true
+ }
+ bz = append(bz, mkey...)
if i != al-1 {
bz = append(bz, ',')
}
@@ -1572,7 +1594,11 @@ func (tv *TypedValue) ComputeMapKey(store Store, omitType bool) MapKey {
for i := range sl {
fv := fillValueTV(store, &sv.Fields[i])
omitTypes := bt.Fields[i].Type.Kind() != InterfaceKind
- bz = append(bz, fv.ComputeMapKey(store, omitTypes)...)
+ mkey, nan := fv.ComputeMapKey(store, omitTypes)
+ if nan {
+ return MapKey{}, true
+ }
+ bz = append(bz, mkey...)
if i != sl-1 {
bz = append(bz, ',')
}
@@ -1594,7 +1620,7 @@ func (tv *TypedValue) ComputeMapKey(store Store, omitType bool) MapKey {
"unexpected map key type %s",
tv.T.String()))
}
- return MapKey(bz)
+ return MapKey(bz), false
}
// ----------------------------------------
@ltzmaxwell I updated some of the code you added recently (#4385) in light of this comment, let me know if it sounds good to you.
@jaekwon, this now returns a second parameter as you mentioned; please take a look.
I addressed maxwell's review, and updated the OP to match the changed method.
I suggest we merge #4327 first, and then go ahead and merge this - @ltzmaxwell, in the meantime, if you could have a second look and tell me if the changes are sensible ;)