etcd icon indicating copy to clipboard operation
etcd copied to clipboard

embed: add `GRPCAdditionalServerOptions` config

Open rleungx opened this issue 2 years ago • 27 comments

This PR introduces GRPCAdditionalServerOptions which allow changing the internal gRPC settings. Sometimes, we may register our own gRPC service into etcd and change the max-request-bytes might affect the internal etcd logic.

rleungx avatar May 23 '22 10:05 rleungx

Codecov Report

Merging #14066 (5d6a29d) into main (a1405e9) will decrease coverage by 0.33%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main   #14066      +/-   ##
==========================================
- Coverage   75.60%   75.27%   -0.34%     
==========================================
  Files         457      457              
  Lines       37084    37117      +33     
==========================================
- Hits        28038    27939      -99     
- Misses       7312     7414     +102     
- Partials     1734     1764      +30     
Flag Coverage Δ
all 75.27% <100.00%> (-0.34%) :arrow_down:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
server/embed/config.go 73.40% <ø> (ø)
server/embed/etcd.go 75.38% <100.00%> (+0.04%) :arrow_up:
server/etcdserver/api/membership/store.go 50.00% <0.00%> (-10.00%) :arrow_down:
server/storage/mvcc/watchable_store.go 84.42% <0.00%> (-8.34%) :arrow_down:
client/pkg/v3/fileutil/purge.go 66.03% <0.00%> (-7.55%) :arrow_down:
api/v3rpc/rpctypes/error.go 84.61% <0.00%> (-5.87%) :arrow_down:
client/v3/experimental/recipes/key.go 75.34% <0.00%> (-5.48%) :arrow_down:
server/lease/lease.go 94.87% <0.00%> (-5.13%) :arrow_down:
raft/rafttest/node.go 95.00% <0.00%> (-5.00%) :arrow_down:
client/v3/leasing/cache.go 87.77% <0.00%> (-3.89%) :arrow_down:
... and 19 more

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

codecov-commenter avatar May 24 '22 04:05 codecov-commenter

Comments:

  1. What do you mean by Sometimes, we may register our own gRPC service into etcd? Could you explain this?
  2. Why you do not expose the two items? i.e. config.go#L132

ahrtr avatar May 25 '22 06:05 ahrtr

Comments:

  1. What do you mean by Sometimes, we may register our own gRPC service into etcd? Could you explain this?
  2. Why you do not expose the two items? i.e. config.go#L132
  1. See https://github.com/etcd-io/etcd/blob/bfb9aa42059d84ea60a4cddd75d7f3e94b27935c/server/embed/config.go#L302-L309
  2. I can also expose them if you want.

rleungx avatar May 25 '22 07:05 rleungx

@ptabor One item in my to do list is to sort out all the thresholds in etcd, including this one. Will discuss this in a separate session once my current task of refactoring lease is done.

ahrtr avatar Jun 13 '22 11:06 ahrtr

@ptabor One item in my to do list is to sort out all the thresholds in etcd, including this one. Will discuss this in a separate session once my current task of refactoring lease is done.

@ahrtr As this sound as a snowflake use-case (grpc shared between etcd and other service), I think we should not be in the business of looking at the limits... and allow programmers (only in embed server) to pass additional grpc settings, that etcd passes-through.

ptabor avatar Jun 14 '22 20:06 ptabor

@ahrtr As this sound as a snowflake use-case (grpc shared between etcd and other service), I think we should not be in the business of looking at the limits... and allow programmers (only in embed server) to pass additional grpc settings, that etcd passes-through.

Overall adding a generic grpcAdditionalServerOptions grpc.ServerOption[] looks good to me. But it doesn't resolve the concern that it may have impact on the build-in services provided by etcd itself.

The good side is flexibility, because users can set any additional options they want. The down side is it may have impact on the gRPC service provided by etcd itself.

Another perspective to think about this is that we should take all services, including etcd build-in services and users registered services, under control. Once users see any issue for the embedded etcd, they will also raise issue in etcd community. Based on the principle of equal responsibility and obligation, If we need to support such case, then we should can add more restriction on the services registered by users, instead of just providing flexibility without any restriction. So users shouldn't set any additional options, instead they should follow all the existing options from this perspective.

ahrtr avatar Jun 15 '22 01:06 ahrtr

The new feature of ServiceRegister was introduced in pull/7215.

Again, the good side of the PR is users can reuse the same port as etcd. But the bad side is it introduces additional unnecessary complexity to etcd. Personally I don't like the PR ( I mean 7215), and the better choice should be resolving it out of the box (such as providing a proxy/gateway in front of embedded etcd and users' gRPC service) instead of hacking etcd itself.

ahrtr avatar Jun 15 '22 02:06 ahrtr

@ahrtr As this sound as a snowflake use-case (grpc shared between etcd and other service), I think we should not be in the business of looking at the limits... and allow programmers (only in embed server) to pass additional grpc settings, that etcd passes-through.

Overall adding a generic grpcAdditionalServerOptions grpc.ServerOption[] looks good to me. But it doesn't resolve the concern that it may have impact on the build-in services provided by etcd itself.

The good side is flexibility, because users can set any additional options they want. The down side is it may have impact on the gRPC service provided by etcd itself.

Another perspective to think about this is that we should take all services, including etcd build-in services and users registered services, under control. Once users see any issue for the embedded etcd, they will also raise issue in etcd community. Based on the principle of equal responsibility and obligation, If we need to support such case, then we should can add more restriction on the services registered by users, instead of just providing flexibility without any restriction. So users shouldn't set any additional options, instead they should follow all the existing options from this perspective.

IMO, no matter if it's a requirement of user-defined services, we still need to provide the ability to change the default gRPC settings.

rleungx avatar Jun 15 '22 04:06 rleungx

Besides, supporting the custom interceptor for embed etcd is another need. See https://github.com/etcd-io/etcd/issues/13468

rleungx avatar Jun 15 '22 10:06 rleungx

Note that the users registered service and the etcd build-in gRPC services share the same gRPC channel. Any options will have impact on both of them.

Please just use the existing flag --max-request-bytes to change the MaxRecvSize. Do you need to change the MaxSendSize in your case?

ahrtr avatar Jun 17 '22 22:06 ahrtr

Hi @ahrtr https://github.com/etcd-io/etcd/pull/7215 not allow users to define gRPC's setting, I think this PR is to improve it.

We have some requirements with the embed etcd use-case, not only max-request-bytes :

  • register own service in etcd, and we want to protect with those services, but etcd's service may influence it (some client directly use etcd service), so we need QoS features. But same as the issue comment says, it suggests this can be simply implemented as a gRPC server middleware, and keep etcd as minimal as possible, but we cannot use custom gRPC server middleware
  • Fixed parameters bring some inconvenience because it is not extensible, we need to add supported parameters to etcd first, and then wait for it release to the stable version, and then update the dependencies. This time period is too long.

IMO, embed etcd users need to be responsible for changing etcd parameters, this does not affect normal etcd users(standalone process). So hopefully this PR will be approved, it is more convenient for embed etcd users. ptal @ahrtr @ptabor, Thanks.

nolouch avatar Jun 20 '22 13:06 nolouch

Hi @ahrtr #7215 not allow users to define gRPC's setting, I think this PR is to improve it.

We have some requirements with the embed etcd use-case, not only max-request-bytes :

* register own service in etcd, and we want to protect with those services, but etcd's service may influence it (some client directly use etcd service), so we need `QoS features`. But same as the [issue comment](https://github.com/etcd-io/etcd/pull/12290#issuecomment-697060195) says, it suggests this can be simply implemented as a gRPC server middleware, and keep etcd as minimal as possible, but we cannot use custom gRPC server middleware

* Fixed parameters bring some inconvenience because it is not extensible, we need to add supported parameters to etcd first, and then wait for it release to the stable version, and then update the dependencies. This time period is too long.

IMO, embed etcd users need to be responsible for changing etcd parameters, this does not affect normal etcd users(standalone process). So hopefully this PR will be approved, it is more convenient for embed etcd users. ptal @ahrtr @ptabor, Thanks.

I am still not convinced that we really need to add grpcAdditionalServerOptions grpc.ServerOption[]. I am not saying we shouldn't add it, instead I am just trying to say that we shouldn't add any new feature without a deep understanding the use cases beforehand. So please elaborate your use cases firstly. Why do you need it, and how will use it after adding it? Specifically, what gRPC options are you going to add via the grpcAdditionalServerOptions?

Note that there are several QoS related PRs, but none of them is merged so far. If your discussion is based on the QoS feature, then it isn't valid at the moment.

More flexibility also means more chance to make mistake. If there is no any strong incentive or reasonable benefit, then we shouldn't add it.

Please also make sure you well understood my previous two comments, issuecomment-1155881593 and issuecomment-1155894742

ahrtr avatar Jun 22 '22 11:06 ahrtr

@ahrtr:

  1. I'm not insisting on merging, this... just steering towards such variant rather than explosion of custom options.

  2. A use-case is linked from: https://github.com/etcd-io/etcd/pull/14066#discussion_r887444999, and seems direct consequence of allowing embed server to have custom customer's services being used. I think usage of embedded service put's a high bar on customer to have proper testing for their integrated solution. We should document this requirement more explicitly.

  3. I was hoping (but seems not possible currently) that we could perform validation of different constants/limit post-grpc server start, It's run it and obtain the effective config. Unfortunately grpc does not exposes the resolved configuration of running server publicly:

https://github.com/grpc/grpc-go/blob/28de4866ce7440b675662abbdd5c43b476bd4dae/server.go#L124

ptabor avatar Jun 22 '22 21:06 ptabor

Thanks @ptabor for the feedback.

I read the 14066#discussion_r887444999 before. etcd has already supported users registering custom service. If users need to configure the max request size, we already have MaxRequestBytes (MaxRecvMsgSize). If users need MaxSendMsgSize as well, we can add it (the default value is math.MaxInt32, do we really need to expose it?). Apart from that, are there any other options users might need?

FYI. there is a on-going PR 14081 for the MaxConcurrentStreams.

There are 24 options in total which are supported by gRPC 1.47 for now, see server.go#L143-L168 . Currently etcd exposes about half of them. Some options/API (such as InTapHandle, ConnectionTimeout, HeaderTableSize and NumStreamWorkers) are still experimental, which probably I don't think are proper to expose to etcd users. For other options (such as RPCCompressor, RPCDecompressor, InitialWindowSize, InitialConnWindowSize, WriteBufferSize and ReadBufferSize) which etcd hasn't exposed yet, I am not sure whether we should expose them. Personally I don't think it's good to just expose them all without deep understanding and full testing ourselves (the community) beforehand, and we do not receive real request on these so far.

  1. I'm not insisting on merging, this... just steering towards such variant rather than explosion of custom options.

The number of options will not explode. Please see my comment above.

  1. I think usage of embedded service put's a high bar on customer to have proper testing for their integrated solution. We should document this requirement more explicitly.

This might be a valid point.

ahrtr avatar Jun 23 '22 01:06 ahrtr

@ahrtr @ptabor Thanks for your reply. https://github.com/etcd-io/etcd/pull/14066#issuecomment-1163834967 make sense to me. But for now, we indeed have two requirements:

  1. support changing the MaxRecvMsgSize which can be replaced by changing max-request-bytes. It doesn't well conform to semantics but it works. I'm ok with the current usage.
  2. support adding the custom gRPC interceptor for user-defined gRPC services so we can add our logic to intercept the request and do something, e.g. the QoS feature or forwarding requests. But etcd doesn't support it at the moment.

rleungx avatar Jun 23 '22 04:06 rleungx

@ahrtr

Once users see any issue for the embedded etcd, they will also raise issue in etcd community. Based on the principle of equal responsibility and obligation, If we need to support such case, then we should can add more restrictions on the services registered by users, instead of just providing flexibility without any restriction.

I don't think it's good to just expose them all without deep understanding and full testing ourselves (the community) beforehand

I have some questions:

  • Community can issue that and add the option if approve, as I said before, users need to wait a long time. on the other hand, how do restriction users not use the inappropriate option value that may affect etcd?
  • As you said, currently etcd exposes about half of 24 options, so about 12 options. Do all users really deep understand those 12 options and full testing by yourselves with all use cases?

I think there still have the same problems. so if the embed user wants to change the default parameters, they should understand it, and be responsible for it. that means users should test it and make sure it is suitable for themselves scenarios. As @ptabor said, we can document it. And I don't think it introduces unnecessary complexity to etcd, etcd still keep minimal, and it's more flexibility, let user can extend etcd (such as grpc services and middlewares), and also it only affects embed users.

nolouch avatar Jun 23 '22 06:06 nolouch

BTW, If you really concert about it, I think etcd embed model should be more like a library, it can be as a service be registered to a custom grpc server rather than let itself as a server.

nolouch avatar Jun 23 '22 07:06 nolouch

I am OK to revisit this PR, but it needs at least three maintainers' approval.

ahrtr avatar Aug 08 '22 21:08 ahrtr

The GRPCAdditionalServerOptions may define duplicate items as the existing conf ones. It seems that GRPCAdditionalServerOptions overwrites the existing items, and it's basically OK. But GRPCAdditionalServerOptions only takes effect in the gRPC communication, but some existing items not only takes effect in the gRPC communication, but also handled by etcd, such as v3_server.go#L687.

So we need to get all these sorted out.

ahrtr avatar Aug 15 '22 00:08 ahrtr

The GRPCAdditionalServerOptions may define duplicate items as the existing conf ones. It seems that GRPCAdditionalServerOptions overwrites the existing items, and it's basically OK. But GRPCAdditionalServerOptions only takes effect in the gRPC communication, but some existing items not only takes effect in the gRPC communication, but also handled by etcd, such as v3_server.go#L687.

So we need to get all these sorted out.

It seems only MaxRequestBytes will affect the etcd itself.

rleungx avatar Aug 16 '22 03:08 rleungx

I think it makes more sense to make the existing configuration items overwrite the items defined in GRPCAdditionalServerOptions. For example, if users define both --max-request-bytes and []grpc.ServerOption{grpc.MaxRecvMsgSize(...)}, then the --max-request-bytes overwrites the latter.

ahrtr avatar Aug 16 '22 04:08 ahrtr

I think it makes more sense to make the existing configuration items overwrite the items defined in GRPCAdditionalServerOptions. For example, if users define both --max-request-bytes and []grpc.ServerOption{grpc.MaxRecvMsgSize(...)}, then the --max-request-bytes overwrites the latter.

How about adding some comments to emphasize that GRPCAdditionalServerOptions may overwrite the existing gRPC configs? Or is there any way to tell if the config is set by users without comparing it with the default value?

rleungx avatar Aug 24 '22 05:08 rleungx

The existing individual configurations should take precedence over the generic configurations grpcAdditionalServerOptions, and we should guarantee this from code level.

You just need to make sure the items in grpcAdditionalServerOptions are in the head of the option list, because the latter will overwrite the former if two items have the same key. Please see grpc.go#L72 and server.go#L563-L565

ahrtr avatar Sep 05 '22 07:09 ahrtr

The existing individual configurations should take precedence over the generic configurations grpcAdditionalServerOptions, and we should guarantee this from code level.

You just need to make sure the items in grpcAdditionalServerOptions are in the head of the option list, because the latter will overwrite the former if two items have the same key. Please see grpc.go#L72 and server.go#L563-L565

Actually, we have another need to add some new interceptors. If using this way, the options of interceptors will be overwritten I think.

rleungx avatar Sep 06 '22 03:09 rleungx

Previously I was thinking the existing individual configurations should take precedence over the generic configurations grpcAdditionalServerOptions, but it seems not a big deal, since the generic configuration will only be used by the users of embedded etcd.

Please consider to add a verification on the configuration. If there are duplicated items, then raise an error and fail the process.

ahrtr avatar Sep 13 '22 08:09 ahrtr

Please consider to add a verification on the configuration. If there are duplicated items, then raise an error and fail the process.

Do you mean the duplicated items inside grpcAdditionalServerOptions or between grpcAdditionalServerOptions and other gRPC-related configurations?

rleungx avatar Sep 14 '22 09:09 rleungx

Please consider to add a verification on the configuration. If there are duplicated items, then raise an error and fail the process.

Do you mean the duplicated items inside grpcAdditionalServerOptions or between grpcAdditionalServerOptions and other gRPC-related configurations?

It might be a little subtle. For example, MaxRequestBytes is used by both gRPC and etcdserver, and it's default value is 1.5MB. If you add grpc.MaxRecvMsgSize into grpcAdditionalServerOptions, it only affects the gRPC communication, and probably your customized interceptor, but not etcdserver.

If the grpc.MaxRecvMsgSize in grpcAdditionalServerOptions has a smaller value than MaxRequestBytes, then it means the max request what the etcdserver can receive is actually smaller than expected. If it has a bigger value, then a request may be accepted by gRPC, but eventually rejected by etcdserver. The latter case might be OK. Note that based on your current implementation, the generic configurations has higher priority than the existing individual configuration items.

Since grpcAdditionalServerOptions will only be used for etcd as embedded case, so the embed user should take the responsibility to provide proper configurations to make sure everything (including etcdserver and customized interceptors) work as expected.

Overall looks good to me. cc @serathius @spzala @ptabor

ahrtr avatar Sep 15 '22 22:09 ahrtr

Please rebase this PR, and squash the commits.

ahrtr avatar Sep 25 '22 20:09 ahrtr

The go version has already been bumped to 1.19.1 in pipeline (github workflow), so please rebase this PR, and it should can resolve the Release workflow failure.

ahrtr avatar Oct 02 '22 23:10 ahrtr

@ptabor PTAL

rleungx avatar Oct 24 '22 03:10 rleungx