go icon indicating copy to clipboard operation
go copied to clipboard

crypto/internal/nistec: incorrect number of loops in p256SelectAffine of s390x and ppe64le asm

Open emmansun opened this issue 1 year ago • 5 comments

Go version

Go1.19+

Output of go env in your module/workspace:

NA

What did you do?

according the precomputed p256AffineTable and the definition of function p256SelectAffine:

// p256AffineTable is a table of the first 32 multiples of a point. Points are
// stored at an index offset of -1 like in p256Table, and [0]P is not stored.
type p256AffineTable [32]p256AffinePoint

// p256Precomputed is a series of precomputed multiples of G, the canonical
// generator. The first p256AffineTable contains multiples of G. The second one
// multiples of [2⁶]G, the third one of [2¹²]G, and so on, where each successive
// table is the previous table doubled six times. Six is the width of the
// sliding window used in p256ScalarMult, and having each table already
// pre-doubled lets us avoid the doublings between windows entirely. This table
// MUST NOT be modified, as it aliases into p256PrecomputedEmbed below.
var p256Precomputed *[43]p256AffineTable

// p256SelectAffine sets res to the point at index idx in the table.
// idx must be in [0, 31]. It executes in constant time.
//
//go:noescape
func p256SelectAffine(res *p256AffinePoint, table *p256AffineTable, idx int)

The correct number of loops in p256SelectAffine should be 32 if takes one point at a time.

What did you see happen?

  1. https://github.com/golang/go/blob/36b45bca66d86020f0b4daf1f15b02921a8dad43/src/crypto/internal/nistec/p256_asm_s390x.s#L511
  2. https://github.com/golang/go/blob/36b45bca66d86020f0b4daf1f15b02921a8dad43/src/crypto/internal/nistec/p256_asm_ppc64le.s#L444
  3. https://github.com/golang/go/blob/36b45bca66d86020f0b4daf1f15b02921a8dad43/src/crypto/internal/nistec/p256_asm_ppc64le.s#L294

This will impact ScalarBaseMult performance.

What did you expect to see?

  1. https://github.com/golang/go/blob/36b45bca66d86020f0b4daf1f15b02921a8dad43/src/crypto/internal/nistec/p256_asm_s390x.s#L511 should be 33.
  2. https://github.com/golang/go/blob/36b45bca66d86020f0b4daf1f15b02921a8dad43/src/crypto/internal/nistec/p256_asm_ppc64le.s#L444 should be 32.
  3. https://github.com/golang/go/blob/36b45bca66d86020f0b4daf1f15b02921a8dad43/src/crypto/internal/nistec/p256_asm_ppc64le.s#L294 should be 16.

BTW, for s390x asm implementations:

  1. Try to use VLM/VSTM;
  2. https://github.com/golang/go/blob/36b45bca66d86020f0b4daf1f15b02921a8dad43/src/crypto/internal/nistec/p256_asm_s390x.s#L579 Can use VLEF directly since Go 1.19+.

emmansun avatar Aug 27 '24 08:08 emmansun

cc @laboger . BTW, it seems p256NegCond of ppc64le will operate in variable-time.

emmansun avatar Aug 27 '24 09:08 emmansun

ccing @pmur who is the current maintainer of Go on ppc64le as Lynn recently retired.

archanaravindar avatar Aug 27 '24 12:08 archanaravindar

The ppc64 code does look quietly wrong. It was derived from the s390 implementation. I would have expected sporadic segfaults as this would imply its reading 2KB beyond the limits of table, but never using such values.

pmur avatar Aug 27 '24 16:08 pmur

cc @golang/s390x @golang/ppc64 @golang/security

prattmic avatar Aug 27 '24 19:08 prattmic

Change https://go.dev/cl/608816 mentions this issue: crypto/internal/nistec: fix p256Select and p256SelectAffine on PPC64

gopherbot avatar Aug 27 '24 20:08 gopherbot

cc @laboger . BTW, it seems p256NegCond of ppc64le will operate in variable-time.

https://github.com/golang/go/issues/71383

emmansun avatar Jan 23 '25 03:01 emmansun

Thank you @emmansun, as you rightly pointed out, p256NegCond in ppc64le was using conditional branching (BC instruction) which can indeed lead to variable execution time, for various reasons which could be potentially exploited in timing attacks as @rolandshoemaker pointed out here #71383. Thanks once again @rolandshoemaker for the bug fix (https://go-review.googlesource.com/c/go/+/643735), which is a constant-time implementation as It eliminates conditional branching, uses vector select (VSEL), which achieves constant-time behaviour, eliminating timing variations that can leak information about the operations being performed. BTW @laboger has retired and as of now I am part of looking through golang/ppc64. Thanks @emmansun, once again for persistently looking through these.

jkrishmys avatar Jan 23 '25 08:01 jkrishmys