opentelemetry-collector icon indicating copy to clipboard operation
opentelemetry-collector copied to clipboard

[PoC] Show how configoptional would work

Open mx-psi opened this issue 7 months ago • 13 comments

Description

Adds configoptional. Tests it on confighttp and otlpreceiver modules. Thanks @jade-guiton-dd for the feedback and help with designing this, and thanks to @yurishkuro for the initial version (see #10260)

Sanity check

Sanity check on the drawbacks we set out to fix on the RFC:

It may not be immediately obvious to a new user that a pointer type indicates a field is optional.

:heavy_check_mark: Yes, this is now explicit

The distinction between values that are conventionally pointers (e.g. gRPC configs) and optional values is lost.

:heavy_check_mark: Yes, we can differentiate those now

Setting a default value for a pointer field when decoding will set the field on the resulting config struct, and additional logic must be done to unset the default if the user has not specified a value.

:heavy_check_mark: Yes, you can see the OTLP receiver unmarshal method.

The potential for null pointer exceptions is created.

:x: I don't think there is much of an advantage here

Config structs are generally intended to be immutable and may be passed around a lot, which makes the mutability property of pointer fields an undesirable property.

:question: The only way I exposed is a Get() and GetOrInsertDefault() method, this is because it is inconvenient to call methods with pointer receivers if we don't. We could have a Value() method, that would prevent mutability but I haven't found a use case for this.

Cons of doing this

  1. You have a to put a lot of Gets to access optional fields. That's more annoying than Go automatically de-referencing things for you.
  2. GetOrInsertDefault is a long name and I don't like it.
  3. You can still call the configoptional.NewFactory locally inside a function and shoot yourself in the foot (or at least make your life inconvenient).

FAQ

Why not Default(t T) instead of Default(f Factory[T])?

To prevent people from passing values with pointers and accidentally sharing those pointers.

Why not Default(f func() T) instead of Default(f Factory[T])?

So that the function pointer is the same across instances and assert.Equal is happy. For example, so you can do:

f := otlpreceiver.NewFactory()

assert.Equal(t, f.CreateDefaultConfig(), f.CreateDefaultConfig())`

(or even with different factories!)

Okay, but why not Default(f *func() T) instead of Default(f Factory[T])?

This is invalid:


func foo() {}

var bar = &foo  // COMPILE TIME ERROR

This is valid


func foo() {}

func getPointer(f func()) *func() { return &f }

var bar = getPointer(foo)

Why store *DefaultFunc[T] instead of DefaultFunc[T]?

So that you can compare optional values easily. Per reflect.DeepEqual's doc,

Func values are deeply equal if both are nil; otherwise they are not deeply equal.

so we need to store the pointer instead to compare by address.

Why do you have both defaultFn and hasValue?

So that var none Optional[T] is equivalent to none := None[T].

Why expose the factory via GetOrInsertDefault()?

On internal/e2e you can see cases where you want to get the default outside of the component module. This allows you to get it without exposing extra API.

Why Get() *T instead of Value() T?

Apart from avoiding allocations, so that you can easily call methods with pointer receivers.

I don't like the names

Let's bikeshed after we decide if this is a good idea in the first place.

Link to tracking issue

Informs #12981

mx-psi avatar May 12 '25 15:05 mx-psi

--- FAIL: TestUnmarshalConfigOnlyHTTPEmptyMap (0.00s)
    /home/pablo.baeyens/Source/otel/opentelemetry-collector/receiver/otlpreceiver/config_test.go:82: 
        	Error Trace:	/home/pablo.baeyens/Source/otel/opentelemetry-collector/receiver/otlpreceiver/config_test.go:82
        	Error:      	Not equal: 
        	            	expected: configoptional.Optional[go.opentelemetry.io/collector/config/configgrpc.ServerConfig]{hasValue:false, value:configgrpc.ServerConfig{NetAddr:confignet.AddrConfig{Endpoint:"", Transport:"", DialerConfig:confignet.DialerConfig{Timeout:0, _:struct {}{}}, _:struct {}{}}, TLSSetting:(*configtls.ServerConfig)(nil), MaxRecvMsgSizeMiB:0, MaxConcurrentStreams:0x0, ReadBufferSize:0, WriteBufferSize:0, Keepalive:(*configgrpc.KeepaliveServerConfig)(nil), Auth:(*configauth.Config)(nil), IncludeMetadata:false, Middlewares:[]configmiddleware.Config(nil), _:struct {}{}}, defaultFn:(configoptional.DefaultFunc[go.opentelemetry.io/collector/config/configgrpc.ServerConfig])(0xdf5f40)}
        	            	actual  : configoptional.Optional[go.opentelemetry.io/collector/config/configgrpc.ServerConfig]{hasValue:false, value:configgrpc.ServerConfig{NetAddr:confignet.AddrConfig{Endpoint:"", Transport:"", DialerConfig:confignet.DialerConfig{Timeout:0, _:struct {}{}}, _:struct {}{}}, TLSSetting:(*configtls.ServerConfig)(nil), MaxRecvMsgSizeMiB:0, MaxConcurrentStreams:0x0, ReadBufferSize:0, WriteBufferSize:0, Keepalive:(*configgrpc.KeepaliveServerConfig)(nil), Auth:(*configauth.Config)(nil), IncludeMetadata:false, Middlewares:[]configmiddleware.Config(nil), _:struct {}{}}, defaultFn:(configoptional.DefaultFunc[go.opentelemetry.io/collector/config/configgrpc.ServerConfig])(0xdf5f40)}
        	            	
        	            	Diff:
        	Test:       	TestUnmarshalConfigOnlyHTTPEmptyMap
FAIL
FAIL	go.opentelemetry.io/collector/receiver/otlpreceiver	0.015s
FAIL

mx-psi avatar May 12 '25 15:05 mx-psi

Codecov Report

Attention: Patch coverage is 84.09091% with 14 lines in your changes missing coverage. Please review.

Project coverage is 91.50%. Comparing base (e9f3dec) to head (25b6c9e). Report is 111 commits behind head on main.

Files with missing lines Patch % Lines
config/configoptional/optional.go 61.76% 11 Missing and 2 partials :warning:
receiver/otlpreceiver/otlp.go 90.90% 0 Missing and 1 partial :warning:

:x: Your patch status has failed because the patch coverage (84.09%) is below the target coverage (95.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #13018      +/-   ##
==========================================
- Coverage   91.53%   91.50%   -0.04%     
==========================================
  Files         504      505       +1     
  Lines       28154    28186      +32     
==========================================
+ Hits        25772    25791      +19     
- Misses       1873     1884      +11     
- Partials      509      511       +2     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov[bot] avatar May 13 '25 13:05 codecov[bot]

cc @open-telemetry/collector-approvers @yurishkuro @mahadzaryab1 I am undecided on this.

Questions for y'all:

  1. Do you like this overall? (you can :+1: / :-1: this message, if :-1: it would be helpful to comment why as well)
  2. If we figure out good names, should we go ahead with this?
  3. Any suggestions to further simplify the API are welcome (see the FAQ for why the API works this way)

mx-psi avatar May 13 '25 16:05 mx-psi

This looks good to me. I do not mind all the Get() calls given how much simpler/safer this appears (i.e., the pointer removals).

jmacd avatar May 13 '25 16:05 jmacd

Really appreciate the FAQ, that answered a lot of questions I had when reviewing the code.

Overall I still like this, it feels more explicit than using pointers at what feels like a fairly minimal cost. The two biggest pieces of boilerplate to me feel acceptable:

  1. Instead of repeatedly calling Get, the result could be stored in a variable, which I think would slightly improve readability even compared to the pointer version.
  2. Instantiating the default value factories feels a little heavy, but I think it's the simplest solution given what you've outlined in the FAQ, and I like the way you created them in the OTLP receiver by decomposing the config.

I also like how this cleans up the Unmarshal method on the OTLP receiver's config: users who want to implement the style of configuration used in the protocols section no longer need to do manually manipulate the *confmap.Conf object, but can just use a type that will handle that functionality for them.

I'm not too worried about the first two cons you outlined. I'm not sure how to address the third one, but it doesn't feel like a major negative to me. The biggest negative to me is the introduction of new APIs that, while optional, component authors will likely need to learn if they use our config structs as inspiration for their own.

Overall, I don't feel strongly that we need this, but it seems to have more benefits than detriments.

evan-bradley avatar May 13 '25 22:05 evan-bradley

I think the fact that it eliminates type-unsafe lookups in conf.Map by property name is a very significant win.

I'd prefer a shorter pkg name, ideally same config.Optional, but at least xconfig.Optional.

Factory seems overkill. Not sure how it works even, e.g. calling this twice:

func NewFactory[T any](defaultFn DefaultFunc[T]) Factory[T] {
	return Factory[T]{defaultFn: &defaultFn}
}

with some function would create a local variable with function pointer and then cause that variable to escape to heap in order to take its address. The result is that two factories would not compare equal, meaning the factory needs to be a static var. If external var is the only way then isn't there a smaller API to achieve that?

var makeAbc DefaultFunc[Abc] = func() Abc { ... }

func DefaultConfig() Config {
  return Config {
    ABC xoptional.Default(&makeAbc)
  }
}

yurishkuro avatar May 14 '25 03:05 yurishkuro

GetOrInsertDefault

if this is only used for some form of e2e tests (surprising that it's needed) then perhaps it can be a static function named accordingly, not a part of the struct's API.

yurishkuro avatar May 14 '25 03:05 yurishkuro

@yurishkuro Regarding the Factories: my idea was that requiring the additional step of calling NewFactory would encourage users to read the docs for the function, so they know that it should be called only once, and the result should be placed in a static variable. The fact that the type is opaque may also encourage the idea that it shouldn't be recreated constantly. (The goal being that Optional values from two calls to DefaultConfig are comparable in tests.)

Without an explicit function call in between, I'm worried that users will only look at the type signature of Default and do this:

func createDefaultThing() Thing { ... }
func DefaultConfig() Config {
  var thingFactory := createDefaultThing
  return Config {
    ThingField Default(&thingFactory)
  }
}

which I believe would allocate a new *func() Thing on every call to DefaultConfig, breaking the comparability of Config.

I didn't know you could take a pointer to a var f func(), despite not being able to take a pointer to a func f()... It seems somewhat cleaner and more efficient (after a quick test with -gcflags '-S' it looks like no heap allocation is performed). How about a hybrid approach like the following?

/// optional.go
type Factory[T any] struct {
	fn func() T // no longer a pointer
}
// Make sure to call this once and store the result in a static! Do it for the tests
func NewFactory[T any](defaultFn func() T) Factory[T] {
	return Factory[T]{fn: defaultFn}
}
type Optional[T any] struct {
	// [...]
	defaultFactory *Factory[T]
}
func Default[T any](factory *Factory[T]) Optional[T] { // takes a pointer to a Factory now
	return Optional[T]{defaultFactory: factory}
}
func (o *Optional[T]) GetOrInsertDefault() *T {
	if !o.HasValue() && o.defaultFactory != nil {
		o.value = o.defaultFactory.fn()
		o.defaultFactory = nil
	}
	// [...]
}

/// user code
func createDefaultThing() Thing { ... }
var thingFactory = NewFactory(createDefaultThing)
func DefaultConfig() Config {
  return Config {
    ThingField Default(&thingFactory)
  }
}

Based on my tests it seems as efficient as your proposal, but I think the required function call may make it easier to enforce the intended use.

jade-guiton-dd avatar May 14 '25 10:05 jade-guiton-dd

GetOrInsertDefault

if this is only used for some form of e2e tests (surprising that it's needed) then perhaps it can be a static function named accordingly, not a part of the struct's API.

@yurishkuro It is also used in the OTLP receiver tests (14 times). Components that wrap the OTLP receiver may also need to use it. I think it's worth having in the struct's API.

mx-psi avatar May 14 '25 11:05 mx-psi

@jade-guiton-dd accepting a function pointer is already a forcing function to read the docs since it is an unusual design and you cannot just take an address of a function, you need to store it in a variable first.

yurishkuro avatar May 14 '25 13:05 yurishkuro

@mx-psi if it's almost always used in tests I'd keep it out of the struct. You get the same functionality, but don't pollute the main api with testing functions.

yurishkuro avatar May 14 '25 13:05 yurishkuro

This PR was marked stale due to lack of activity. It will be closed in 14 days.

github-actions[bot] avatar May 29 '25 03:05 github-actions[bot]

This PR was marked stale due to lack of activity. It will be closed in 14 days.

github-actions[bot] avatar Jun 14 '25 03:06 github-actions[bot]

Closed as inactive. Feel free to reopen if this PR is still being worked on.

github-actions[bot] avatar Jun 29 '25 03:06 github-actions[bot]