gno icon indicating copy to clipboard operation
gno copied to clipboard

fix(gnovm): allow multiple NaN keys in map, alternative

Open thehowl opened this issue 8 months ago • 7 comments

This PR fixes #3867. It's an alternative to #3932.

  • The method TypedValue.PrimitiveBytes is modified to PrimitiveMapKey and made more efficient using appends; this rename is done to make explicit that there are caveats used for map key computation.
  • Floating point values which are NaN are instead encoded as a 7-byte increasing counter; given that ComputeMapKey is used to calculate the key for vmap, which is an unexported field which is not decoded from an existing map, it is existing to have this counter that is re-initialized at each startup (similarly to how we use uintptr's to compare pointers in maps, which would be non-determinstic but is safe in this context).
  • This guarantees that two map keys with NaN's may only be equal after cycling through $2^{56}$ values, which I don't see as likely.
  • Negative 0 is now also normalized to 0, matching go's behaviour.
  • Map keys are now re-assigned when using GetPointerForKey on an existing key, in order to match Go's behaviour.
  • Like 3932, it doesn't add further complexity to map handling in all other cases.
  • Unlike 3932, it detects NaN using the softfloat package, rather than using hardware FP instructions, ensuring cross-architecture determinism.

Context (from this comment):

I made this as an alternative to [marc's] PR.

A few reasons:

  • This still uses randomness, which is going to be non-deterministic across nodes and will create problems in the long run
  • Also given that the set of possible nan values for a float 32 is ~8 million, which is very small.

Some nodes will hit the random limit condition sooner than the others. I don't think it's safe to have this.

I thought a second about how we can implement this; I think it can be done without having to change MapKey; just changing when we assign to vmap; this way, there is only an additional function call + if for map assignments and for map deletions, while iteration and access doesn't require it.

IMO, my implementation is simpler and more straightforward.

There is one problem where I think you can have good ideas / insights: https://github.com/gnolang/gno/pull/4114#issuecomment-2797529537 - it looks like for negative zero, behaviour is different between ours vs go's impl. Can you take a look?

thehowl avatar Apr 11 '25 16:04 thehowl

🛠 PR Checks Summary

All Automated Checks passed. ✅

Manual Checks (for Reviewers):
  • [ ] IGNORE the bot requirements for this PR (force green CI check)
Read More

🤖 This bot helps streamline PR reviews by verifying automated checks and providing guidance for contributors and reviewers.

✅ Automated Checks (for Contributors):

🟢 Maintainers must be able to edit this pull request (more info)

☑️ Contributor Actions:
  1. Fix any issues flagged by automated checks.
  2. Follow the Contributor Checklist to ensure your PR is ready for review.
    • Add new tests, or document why they are unnecessary.
    • Provide clear examples/screenshots, if necessary.
    • Update documentation, if required.
    • Ensure no breaking changes, or include BREAKING CHANGE notes.
    • Link related issues/PRs, where applicable.
☑️ Reviewer Actions:
  1. Complete manual checks for the PR, including the guidelines and additional checks if applicable.
📚 Resources:
Debug
Automated Checks
Maintainers must be able to edit this pull request (more info)

If

🟢 Condition met
└── 🟢 And
    ├── 🟢 The base branch matches this pattern: ^master$
    └── 🟢 The pull request was created from a fork (head branch repo: thehowl/gno)

Then

🟢 Requirement satisfied
└── 🟢 Maintainer can modify this pull request

Manual Checks
**IGNORE** the bot requirements for this PR (force green CI check)

If

🟢 Condition met
└── 🟢 On every pull request

Can be checked by

  • Any user with comment edit permission

Gno2D2 avatar Apr 11 '25 16:04 Gno2D2

Codecov Report

:white_check_mark: All modified and coverable lines are covered by tests.

:loudspeaker: Thoughts on this report? Let us know!

codecov[bot] avatar Apr 11 '25 16:04 codecov[bot]

There is a possible problem: this behaves differently in Go and Gno...

// You can edit this code!
// Click here and start typing.
package main

import (
	"fmt"
	"math"
)

func main() {
	n0 := math.Float64frombits(1 << 63)
	fmt.Println(n0)
	m := map[float64]string{}
	m[0] = "hey"
	m[n0] = "world"
	fmt.Println(m)
}

// Output:
// -0
// map[0:world]

In Go, the key that is used is actually the last: https://go.dev/play/p/K--afH1wh0L -- so the result is map[-0:world].

thehowl avatar Apr 11 '25 17:04 thehowl

ALSO realised this doesn't work for ie. a struct with a float... we probably still need some modification to ComputeMapKey

thehowl avatar Apr 11 '25 17:04 thehowl

This is now ready for review - I also fixed the other mentioned issues relating to maps.

thehowl avatar Apr 15 '25 11:04 thehowl

Since you can't delete a NaN key I presume, couldn't we solve this with minimal changes? There's already a List, so the insertion order is what would make it deterministic?

like

diff --git a/gnovm/pkg/gnolang/values.go b/gnovm/pkg/gnolang/values.go
index 04c0ce916..a4fc3b529 100644
--- a/gnovm/pkg/gnolang/values.go
+++ b/gnovm/pkg/gnolang/values.go
@@ -723,6 +723,7 @@ func (mv *MapValue) GetLength() int {
 // GetPointerForKey is only used for assignment, so the key
 // is not returned as part of the pointer, and TV is not filled.
 func (mv *MapValue) GetPointerForKey(alloc *Allocator, store Store, key *TypedValue) PointerValue {
+       // If NaN, instead of computing map key, just append to List.
        kmk := key.ComputeMapKey(store, false)
        if mli, ok := mv.vmap[kmk]; ok {
                return PointerValue{
@@ -743,6 +744,7 @@ func (mv *MapValue) GetPointerForKey(alloc *Allocator, store Store, key *TypedVa
 // Like GetPointerForKey, but does not create a slot if key
 // doesn't exist.
 func (mv *MapValue) GetValueForKey(store Store, key *TypedValue) (val TypedValue, ok bool) {
+       // If key is NaN, return default
        kmk := key.ComputeMapKey(store, false)
        if mli, exists := mv.vmap[kmk]; exists {
                fillValueTV(store, &mli.Value)
@@ -752,6 +754,7 @@ func (mv *MapValue) GetValueForKey(store Store, key *TypedValue) (val TypedValue
 }
 
 func (mv *MapValue) DeleteForKey(store Store, key *TypedValue) {
+       // if key is NaN, do nothing.
        kmk := key.ComputeMapKey(store, false)
        if mli, ok := mv.vmap[kmk]; ok {
                mv.List.Remove(mli)

jaekwon avatar Jun 16 '25 13:06 jaekwon

I replied to this review on Signal a while back, cross-posting here:

image

Text-based version, for searchability

The idea behind the PR was to follow in the footsteps of what Marc did, which I think largely makes sense

And that is not to change the code path, ie. adding conditionals or slowing down code anyway, for the 99.9999% of cases that don't have NaNs

Also keep in mind any solution has to work not just for map[float64]V, but also for map[struct{k float64}]V

So you need a "deep" nan check for those cases, which is potentially expensive for large structs. You can make this cheaper by returning a second param in ComputeMapKey, but again, this makes code more complex to account for the 1/1000000th case where there is a nan

because then you need to add a check for the second param each time

thehowl avatar Jun 24 '25 18:06 thehowl

// If NaN, instead of computing map key, just append to List.

Thanks, that makes sense. Having ComputeMapKey return a second bool isn't bad, see below: The benefit of this is that it's clear that anything with NaN doesn't have a map key. It's slightly slower than otherwise, but map operations are slow anyways so it's negligible. And your optimization of passing in bz can still be utilized, but would rather have it on top of the following...

(I like your solution, but really, the slowdown here is neglibible compared to map operations already! so why not go for the more perfect solution without mental overhead? imagine somebody having to understand the nanCounter vs seeing ComputeMapKey return false.)

diff --git a/gnovm/pkg/gnolang/realm.go b/gnovm/pkg/gnolang/realm.go
index 4f1626a66..71429c726 100644
--- a/gnovm/pkg/gnolang/realm.go
+++ b/gnovm/pkg/gnolang/realm.go
@@ -1434,8 +1434,10 @@ func fillTypesOfValue(store Store, val Value) Value {
 		for cur := cv.List.Head; cur != nil; cur = cur.Next {
 			fillTypesTV(store, &cur.Key)
 			fillTypesTV(store, &cur.Value)
-
-			cv.vmap[cur.Key.ComputeMapKey(store, false)] = cur
+			mkey, nan := cur.Key.ComputeMapKey(store, false)
+			if !nan {
+				cv.vmap[mkey] = cur
+			}
 		}
 		return cv
 	case TypeValue:
diff --git a/gnovm/pkg/gnolang/values.go b/gnovm/pkg/gnolang/values.go
index a4fc3b529..73be21ceb 100644
--- a/gnovm/pkg/gnolang/values.go
+++ b/gnovm/pkg/gnolang/values.go
@@ -724,7 +724,8 @@ func (mv *MapValue) GetLength() int {
 // is not returned as part of the pointer, and TV is not filled.
 func (mv *MapValue) GetPointerForKey(alloc *Allocator, store Store, key *TypedValue) PointerValue {
 	// If NaN, instead of computing map key, just append to List.
-	kmk := key.ComputeMapKey(store, false)
+	kmk, nan := key.ComputeMapKey(store, false)
+	if !nan {
 	if mli, ok := mv.vmap[kmk]; ok {
 		return PointerValue{
 			TV:    fillValueTV(store, &mli.Value),
@@ -732,6 +733,7 @@ func (mv *MapValue) GetPointerForKey(alloc *Allocator, store Store, key *TypedVa
 			Index: PointerIndexMap,
 		}
 	}
+	}
 	mli := mv.List.Append(alloc, *key)
 	mv.vmap[kmk] = mli
 	return PointerValue{
@@ -745,7 +747,7 @@ func (mv *MapValue) GetPointerForKey(alloc *Allocator, store Store, key *TypedVa
 // doesn't exist.
 func (mv *MapValue) GetValueForKey(store Store, key *TypedValue) (val TypedValue, ok bool) {
 	// If key is NaN, return default
-	kmk := key.ComputeMapKey(store, false)
+	kmk, nan := key.ComputeMapKey(store, false)
 	if mli, exists := mv.vmap[kmk]; exists {
 		fillValueTV(store, &mli.Value)
 		val, ok = mli.Value, true
@@ -755,7 +757,7 @@ func (mv *MapValue) GetValueForKey(store Store, key *TypedValue) (val TypedValue
 
 func (mv *MapValue) DeleteForKey(store Store, key *TypedValue) {
 	// if key is NaN, do nothing.
-	kmk := key.ComputeMapKey(store, false)
+	kmk, nan := key.ComputeMapKey(store, false)
 	if mli, ok := mv.vmap[kmk]; ok {
 		mv.List.Remove(mli)
 		delete(mv.vmap, kmk)
@@ -990,6 +992,19 @@ func (tv *TypedValue) IsNilInterface() bool {
 	return false
 }
 
+func (tv *TypedValue) IsNaN() bool {
+	switch tv.T.Kind() {
+	case Float32Kind:
+		_, _, _, _, nan := softfloat.Funpack32(tv.GetFloat32())
+		return nan
+	case Float64Kind:
+		_, _, _, _, nan := softfloat.Funpack64(tv.GetFloat64())
+		return nan
+	default:
+		return false
+	}
+}
+
 func (tv *TypedValue) HasKind(k Kind) bool {
 	if tv.T == nil {
 		return false
@@ -1520,7 +1535,7 @@ func (tv *TypedValue) AssertNonNegative(msg string) {
 	}
 }
 
-func (tv *TypedValue) ComputeMapKey(store Store, omitType bool) MapKey {
+func (tv *TypedValue) ComputeMapKey(store Store, omitType bool) (mkey MapKey, nan bool){
 	// Special case when nil: has no separator.
 	if tv.T == nil {
 		if debug {
@@ -1528,7 +1543,7 @@ func (tv *TypedValue) ComputeMapKey(store Store, omitType bool) MapKey {
 				panic("should not happen")
 			}
 		}
-		return MapKey(nilStr)
+		return MapKey(nilStr), false
 	}
 	// General case.
 	bz := make([]byte, 0, 64)
@@ -1538,6 +1553,9 @@ func (tv *TypedValue) ComputeMapKey(store Store, omitType bool) MapKey {
 	}
 	switch bt := baseOf(tv.T).(type) {
 	case PrimitiveType:
+		if tv.IsNaN() {
+			return MapKey{}, true
+		}
 		pbz := tv.PrimitiveBytes()
 		bz = append(bz, pbz...)
 	case *PointerType:
@@ -1554,7 +1572,11 @@ func (tv *TypedValue) ComputeMapKey(store Store, omitType bool) MapKey {
 			omitTypes := bt.Elem().Kind() != InterfaceKind
 			for i := range al {
 				ev := fillValueTV(store, &av.List[i])
-				bz = append(bz, ev.ComputeMapKey(store, omitTypes)...)
+				mkey, nan := ev.ComputeMapKey(store, omitTypes)
+				if nan {
+					return MapKey{}, true
+				}
+				bz = append(bz, mkey...)
 				if i != al-1 {
 					bz = append(bz, ',')
 				}
@@ -1572,7 +1594,11 @@ func (tv *TypedValue) ComputeMapKey(store Store, omitType bool) MapKey {
 		for i := range sl {
 			fv := fillValueTV(store, &sv.Fields[i])
 			omitTypes := bt.Fields[i].Type.Kind() != InterfaceKind
-			bz = append(bz, fv.ComputeMapKey(store, omitTypes)...)
+			mkey, nan := fv.ComputeMapKey(store, omitTypes)
+			if nan {
+				return MapKey{}, true
+			}
+			bz = append(bz, mkey...)
 			if i != sl-1 {
 				bz = append(bz, ',')
 			}
@@ -1594,7 +1620,7 @@ func (tv *TypedValue) ComputeMapKey(store Store, omitType bool) MapKey {
 			"unexpected map key type %s",
 			tv.T.String()))
 	}
-	return MapKey(bz)
+	return MapKey(bz), false
 }
 
 // ----------------------------------------

jaekwon avatar Jul 07 '25 00:07 jaekwon

@ltzmaxwell I updated some of the code you added recently (#4385) in light of this comment, let me know if it sounds good to you.

@jaekwon, this now returns a second parameter as you mentioned; please take a look.

thehowl avatar Jul 22 '25 09:07 thehowl

I addressed maxwell's review, and updated the OP to match the changed method.

I suggest we merge #4327 first, and then go ahead and merge this - @ltzmaxwell, in the meantime, if you could have a second look and tell me if the changes are sensible ;)

thehowl avatar Aug 11 '25 10:08 thehowl