srtp icon indicating copy to clipboard operation
srtp copied to clipboard

fix: struct padding for 8 bytes alignment

Open Antonito opened this issue 2 years ago • 4 comments

This PR fixes the memory layout of some structs to reduce their size (aligned for 8 bytes, which should be most platforms).

  • srtpSSRCState goes from 40 bytes to 32 bytes
  • ReadStreamSRTCP from 56 to 48
  • ReadStreamSRTP from 56 to 48

Those are small improvements, but should help in large scale scenarios.

I noticed this is an optimization that could be made across most pion repositories, if that's something worth doing @Sean-Der ?


If someone wants to go through this process, here's how I did it:

  • Set the Go version as 1.18 in go.mod
  • Add this file memory_layout_helpers.go to the package:
package srtp

import (
	"fmt"
	"reflect"
	"runtime"
)

// BenchmarkAllocation allocates `amount` elements of type T, and display the memory usage.
//
// For ease of optimization, each member of the element T is printed, with its offset, padding & alignment.
func BenchmarkAllocation[T any](amount int) {
	typ := reflect.TypeOf(ZeroValue[T]())

	fmt.Printf("Struct is %d bytes long\n", typ.Size())

	n := typ.NumField()
	totalMinimumSize := uintptr(0)

	for i := 0; i < n; i++ {
		field := typ.Field(i)
		fmt.Printf("%-40s at offset %-8d | size = %-3d | align = %d\n",
			field.Name, field.Offset, field.Type.Size(),
			field.Type.Align())

		totalMinimumSize += field.Type.Size()
	}

	if typ.Size() != totalMinimumSize {
		fmt.Printf("with no padding, struct could be %d bytes long (%d bytes gain)\n", totalMinimumSize, typ.Size()-totalMinimumSize)
	}

	var allStats []T
	for i := 0; i < amount; i++ {
		allStats = append(allStats, ZeroValue[T]())
	}

	PrintMemoryUsage()
}

func PrintMemoryUsage() {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)

	fmt.Printf("Alloc = %v MiB", bytesToMegabytes(m.Alloc))
	fmt.Printf("\tTotalAlloc = %v MiB", bytesToMegabytes(m.TotalAlloc))
	fmt.Printf("\tSys = %v MiB", bytesToMegabytes(m.Sys))
	fmt.Printf("\tNumGC = %v\n", m.NumGC)
}

func bytesToMegabytes(b uint64) uint64 {
	return b / 1024 / 1024
}

func ZeroValue[T any]() (value T) {
	return
}
  • Add memory_layout_test.go to the package:
// It is important to be in package `srtp` and not `srtp_test` to
// test internal structs.
package srtp

import (
	"testing"
)

func Test_StructMemoryLayout(t *testing.T) {
	BenchmarkAllocation[ReadStreamSRTP](1_000_000)
}
  • Run the test!

Antonito avatar Jun 14 '22 16:06 Antonito

@Antonito - can you quantify the improvement here in typical usage? (speed, memory usage, etc.)

My concern with this kind of thing is the possibility that some dependency may rely on the internal structure or ordering of structs. For example maybe there's a project out there using Pion that sends the structs directly over the wire without any serialisation. This PR would break that if different versions of Pion are used at each end of the wire, and it is quite hard to determine if there are any such usages out there. So, these changes are not risk-free, therefore, knowing the scale of the benefit is important in making a go/no-go decision.

Edit to add: @Sean-Der's philosophy, I think, is not to "fix" questionable behaviour in a point release, if there is a chance someone might be relying on it (maybe without even knowing it). I recently opened a PR on pion/logging because its default (output to stdout instead of stderr) is inconsistent with Go's default logging, and could cause 'log leakage' into a binary data stream on stdout. But we don't know if someone out there is relying on pion/logging's default going to stdout, so it's difficult to know the impact of a fix. Therefore, it's pushed to v4.0.0.

adriancable avatar Jun 14 '22 16:06 adriancable

Codecov Report

Patch and project coverage have no change.

Comparison is base (59555e4) 76.50% compared to head (2ab7b27) 76.50%.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #209   +/-   ##
=======================================
  Coverage   76.50%   76.50%           
=======================================
  Files          17       17           
  Lines        1243     1243           
=======================================
  Hits          951      951           
  Misses        200      200           
  Partials       92       92           
Flag Coverage Δ
go 76.50% <ø> (ø)
wasm 76.02% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
context.go 88.61% <ø> (ø)
stream_srtcp.go 67.46% <ø> (ø)
stream_srtp.go 74.07% <ø> (ø)

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

codecov[bot] avatar Jun 14 '22 16:06 codecov[bot]

Thanks for the detailed answer @adriancable !

I definitely understand your concerns, and to be quite frank had the same ones. Here's what I concluded (feel free to push back if I'm mistaken / missed something!):

  • srtpSSRCState is an internal state and holds a pointer (replaydetector.ReplayDetector), it can't be sent as-is over-the-wire.
  • ReadStreamSRTCP and ReadStreamSRTP are public, however they both contain pointers (*SessionSRTCP, chan bool) and mutexes, so they can't been sent without any serialization over-the-wire as well.

Regarding improvements, this should simply reduce slightly memory usage on 64 bits platforms, and should not impact performances . It would mostly benefit SFUs with a large number of simultaneous connections (100k+).

If you want to push back that kind of changes to v4.0.0, it's also fine by me!

Antonito avatar Jun 14 '22 16:06 Antonito

We're saving 8 bytes of memory per stream, maybe 24. If you have 100K+ connections, I would hope that you would not be running down to the last 800K - 2.4MB of RAM on the SFU!

Yes - every little helps - a little! And so I don't want this change to get forgotten, even if we don't merge it now. I think I need some second opinions on the right timing for this before pushing the button. @Sean-Der

adriancable avatar Jun 14 '22 23:06 adriancable

I would argue that internal structure layout is not part of the public API. Which APIs would be used to access the raw memory of a struct? I assume this must be unsafe? If it is, I would not respect that..

The only valid case, I could imagine is encoding/gob. However, according to the docs, ordering of fields does not matter for encoding/gob.

stv0g avatar May 05 '23 14:05 stv0g

@stv0g agree! Just missed this PR originally

we should merge this and enable to struct align lint INO

Sean-Der avatar May 05 '23 15:05 Sean-Der