gnark icon indicating copy to clipboard operation
gnark copied to clipboard

Fix logic mismatch in `newR1C` for better Groth16 optimization

Open tomasandroil opened this issue 8 months ago • 1 comments

Previously, the comment suggested that the goal was to minimize the number of variables in the B matrix, but the implementation did the opposite — swapping L and R when L was longer. This PR corrects the condition to match the intended optimization logic, aligning with Groth16 performance best practices.

Fixes # (no issue number yet)


Type of change

  • [x] Bug fix (non-breaking change which fixes an issue)
  • [ ] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • [ ] This change requires a documentation update

How has this been tested?

  • [x] Manual inspection of Groth16 output matrices
  • [ ] Unit test (TBD - logic is straightforward but may benefit from regression test)

tomasandroil avatar Apr 22 '25 12:04 tomasandroil

@ThomasPiellard can you check that one? I recall empirically, the code as is (before this PR) actually produce better perf

gbotrel avatar Apr 22 '25 13:04 gbotrel

Quick heads-up on PR #1482 (“fix(update): Correct logic mismatch in newR1C for better Groth16 optimization”):

  • Align the swap condition in newR1C with the intended goal of minimizing B-matrix variables
  • Previously swapped L and R when L was longer; now only swaps when that actually reduces variable count
  • Matches Groth16 performance best practices

PTAL @ThomasPiellard—let me know if empirical tests suggest otherwise! ⚙️

tomasandroil avatar Aug 05 '25 19:08 tomasandroil

@gbotrel, @ThomasPiellard, @tomasandroil

I implemented following benchmark (in groth recursion package):

func BenchmarkBN254Commitment(b *testing.B) {
	assert := test.NewAssert(b)
	innerCcs, innerVK, innerWitness, innerProof := getInnerCommitment(assert, ecc.BN254.ScalarField(), ecc.BN254.ScalarField())
	assert.Equal(len(innerCcs.GetCommitments().CommitmentIndexes()), 1)

	// outer proof
	circuitVk, err := ValueOfVerifyingKey[sw_bn254.G1Affine, sw_bn254.G2Affine, sw_bn254.GTEl](innerVK)
	assert.NoError(err)
	circuitWitness, err := ValueOfWitness[sw_bn254.ScalarField](innerWitness)
	assert.NoError(err)
	circuitProof, err := ValueOfProof[sw_bn254.G1Affine, sw_bn254.G2Affine](innerProof)
	assert.NoError(err)

	outerCircuit := &OuterCircuit[sw_bn254.ScalarField, sw_bn254.G1Affine, sw_bn254.G2Affine, sw_bn254.GTEl]{
		Proof:        PlaceholderProof[sw_bn254.G1Affine, sw_bn254.G2Affine](innerCcs),
		InnerWitness: PlaceholderWitness[sw_bn254.ScalarField](innerCcs),
		VerifyingKey: PlaceholderVerifyingKey[sw_bn254.G1Affine, sw_bn254.G2Affine, sw_bn254.GTEl](innerCcs),
	}
	outerAssignment := &OuterCircuit[sw_bn254.ScalarField, sw_bn254.G1Affine, sw_bn254.G2Affine, sw_bn254.GTEl]{
		InnerWitness: circuitWitness,
		Proof:        circuitProof,
		VerifyingKey: circuitVk,
	}

	r1cs, err := frontend.Compile(ecc.BN254.ScalarField(), r1cs.NewBuilder, outerCircuit)
	assert.NoError(err)
	witness, err := frontend.NewWitness(outerAssignment, ecc.BN254.ScalarField())
	assert.NoError(err)
	pk, err := groth16.DummySetup(r1cs)
	assert.NoError(err)
	for b.Loop() {
		_, err := groth16.Prove(r1cs, pk, witness, GetNativeProverOptions(ecc.BN254.ScalarField(), ecc.BN254.ScalarField()))
		assert.NoError(err)
	}
}

Ran the benchmarks:

go test -benchmem -run=^$ -bench ^BenchmarkBN254Commitment$ github.com/consensys/gnark/std/recursion/groth16 -count=2 -benchtime=3x

Then the results are:

  • old
goos: linux
goarch: amd64
pkg: github.com/consensys/gnark/std/recursion/groth16
cpu: AMD Ryzen 9 7940HS w/ Radeon 780M Graphics     
BenchmarkBN254Commitment-16    	       3	4439002251 ns/op	2403417824 B/op	10357157 allocs/op
BenchmarkBN254Commitment-16    	       3	4393044997 ns/op	2397711322 B/op	10354902 allocs/op
PASS
ok  	github.com/consensys/gnark/std/recursion/groth16	111.366s
  • new
goos: linux
goarch: amd64
pkg: github.com/consensys/gnark/std/recursion/groth16
cpu: AMD Ryzen 9 7940HS w/ Radeon 780M Graphics     
BenchmarkBN254Commitment-16    	       3	4069276560 ns/op	2141616477 B/op	10352891 allocs/op
BenchmarkBN254Commitment-16    	       3	4073187323 ns/op	2147236154 B/op	10356136 allocs/op
PASS
ok  	github.com/consensys/gnark/std/recursion/groth16	110.955s

And diff (no confidence as low number of runs, but we see the picture):

goos: linux
goarch: amd64
pkg: github.com/consensys/gnark/std/recursion/groth16
cpu: AMD Ryzen 9 7940HS w/ Radeon 780M Graphics     
                   │   old.txt   │            new.txt             │
                   │   sec/op    │   sec/op     vs base           │
BN254Commitment-16   4.416 ± ∞ ¹   4.071 ± ∞ ¹  ~ (p=0.333 n=2) ²
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

                   │    old.txt    │             new.txt              │
                   │     B/op      │     B/op       vs base           │
BN254Commitment-16   2.236Gi ± ∞ ¹   1.997Gi ± ∞ ¹  ~ (p=0.333 n=2) ²
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

                   │   old.txt    │             new.txt             │
                   │  allocs/op   │  allocs/op    vs base           │
BN254Commitment-16   10.36M ± ∞ ¹   10.35M ± ∞ ¹  ~ (p=0.667 n=2) ²
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

TLDR: I think it is good to merge. I'll only push a few cosmetic changes (documentation alignment).

ivokub avatar Sep 17 '25 11:09 ivokub