cosmos-sdk icon indicating copy to clipboard operation
cosmos-sdk copied to clipboard

docs: update ADR 038 proposal

Open egaxhaj opened this issue 3 years ago • 18 comments

For #10096

This PR introduces updates to ADR-038 for the plugin-based streaming services. These updates reflect the implementation approach taken in #13472


Author Checklist

All items are required. Please add a note to the item if the item is not applicable and please add links to any relevant follow up issues.

I have...

  • [ ] included the correct type prefix in the PR title
  • [ ] added ! to the type prefix if API or client breaking change
  • [ ] targeted the correct branch (see PR Targeting)
  • [ ] provided a link to the relevant issue or specification
  • [ ] followed the guidelines for building modules
  • [ ] included the necessary unit and integration tests
  • [ ] added a changelog entry to CHANGELOG.md
  • [ ] included comments for documenting Go code
  • [ ] updated the relevant documentation or specification
  • [ ] reviewed "Files changed" and left comments if necessary
  • [ ] confirmed all CI checks have passed

Reviewers Checklist

All items are required. Please add a note if the item is not applicable and please add your handle next to the items reviewed if you only reviewed selected items.

I have...

  • [ ] confirmed the correct type prefix in the PR title
  • [ ] confirmed ! in the type prefix if API or client breaking change
  • [ ] confirmed all author checklist items have been addressed
  • [ ] reviewed state machine logic
  • [ ] reviewed API design and naming
  • [ ] reviewed documentation is accurate
  • [ ] reviewed tests and test coverage
  • [ ] manually tested (if applicable)

egaxhaj avatar Oct 06 '22 23:10 egaxhaj

Actually, let me ask since I didn't think of this prior... @egaxhaj is this meant to replace or improve the existing design?

If it's meant to replace, that changes my opinion entirely. It's just not clear from the diff/PR, if your changes replace the design? If so, I actually think it's pretty clean 👍

alexanderbez avatar Oct 11 '22 17:10 alexanderbez

Overall question, I might be missing something; what does the usage of hashicorp/go-plugin provide us with that simple streaming grpc implementation does not?

kocubinski avatar Oct 12 '22 02:10 kocubinski

Hi, maybe the issue #13457 should be addressed in the adr as well, basically exact when and what to listen for the kv writes.

yihuang avatar Oct 12 '22 15:10 yihuang

Overall question, I might be missing something; what does the usage of hashicorp/go-plugin provide us with that simple streaming grpc implementation does not?

Agree here. We already use gRPC extensively in the SDK. Why can't we just create a streaming gRPC version of the existing StreamingService and avoid a lot of these changes?

aaronc avatar Oct 12 '22 15:10 aaronc

Overall question, I might be missing something; what does the usage of hashicorp/go-plugin provide us with that simple streaming grpc implementation does not?

Agree here. We already use gRPC extensively in the SDK. Why can't we just create a streaming gRPC version of the existing StreamingService and avoid a lot of these changes?

Fundamentally a go-plugin is indeed just a gRPC (or net/rpc) service you communicate with. The reason to use the package instead of doing it yourself is to take advantage of the value-add features it gives you beyond core RPC mechanics. The README gives an overview but I think the most relevant points are

Complex arguments and return values are supported. This library provides APIs for handling complex arguments and return values such as interfaces, io.Reader/Writer, etc. We do this by giving you a library (MuxBroker) for creating new connections between the client/server to serve additional interfaces or transfer raw data.

and

Plugins are very easy to install: just put the binary in a location where the host will find it (depends on the host but this library also provides helpers), and the plugin host handles the rest.

which are I guess pretty important if you want a nice user experience, quite subtle to build correctly (not something the SDK should attempt to do in-house, certainly), and IMO more than make up for the cost of admission, so to speak, of the dependency.

peterbourgon avatar Oct 12 '22 18:10 peterbourgon

Overall question, I might be missing something; what does the usage of hashicorp/go-plugin provide us with that simple streaming grpc implementation does not?

Agree here. We already use gRPC extensively in the SDK. Why can't we just create a streaming gRPC version of the existing StreamingService and avoid a lot of these changes?

After catching up with @egaxhaj on today's call it's clear that gRPC streaming is not viable since there is a need to synchronously handle streaming faults and stop the node. Another option though: create a streaming listener which synchronously makes an unary RPC call to a configurable gRPC endpoint, would this be so bad?

kocubinski avatar Oct 20 '22 17:10 kocubinski

ref: #13473 (comment) I think we need to specify how plugins's lifecycle managed:

I'll make sure to address this in the ADR. Also, regarding handshakes, take a look at the tutorial for Go plugins and non-Go plugins, if you have not done so already. It is also useful to watch this youtube video on how the plugin system works (including plugin lifecycles). It is 50 min long but worth the watch.

  • can plugin reattach at runtime?

You can reload plugins by calling NewStreamingPlugin but there is no tooling build into this proposal that will reattach plugins. This is only useful in fire-and-forget when StopNodeOnErr=false.

  • when node stopped, do we stop plugin automatically.

I'll need to double check this but yes.

  • when plugin quit at runtime, should we quit node as well

Managed by StopNodeOnErr. I'll put more wording on this in the ADR.

egaxhaj avatar Oct 21 '22 18:10 egaxhaj

@egaxhaj, @yihuang created a PR which implements an improved file listener, that streams block metadata (BeginBlock and EndBlock requests/responses) and txs.

Is this something that would suite your needs? If so, do we still need the modifications you're proposing in this PR? @yihuang What do you think?

alexanderbez avatar Oct 24 '22 14:10 alexanderbez

@egaxhaj, @yihuang created a PR which implements an improved file listener, that streams block metadata (BeginBlock and EndBlock requests/responses) and txs.

Is this something that would suite your needs? If so, do we still need the modifications you're proposing in this PR? @yihuang What do you think?

Yes we need the modifications proposed in this PR. The file listener alone does not suit our needs.

#13516 is addressing a bug when listening to state change writes which we'll need to also address in this PR.

egaxhaj avatar Oct 24 '22 15:10 egaxhaj

@egaxhaj, @yihuang created a PR which implements an improved file listener, that streams block metadata (BeginBlock and EndBlock requests/responses) and txs. Is this something that would suite your needs? If so, do we still need the modifications you're proposing in this PR? @yihuang What do you think?

Yes we need the modifications proposed in this PR. The file listener alone does not suit our needs.

#13516 is addressing a bug when listening to state change writes which we'll need to also address in this PR.

~~Is this modifications needed at the listening level or at the plugin level?~~

What is missing from the file system?

trying to understand your needs here since it seems reading the file system provides the best consistency guarantees

tac0turtle avatar Oct 24 '22 15:10 tac0turtle

What is missing from the file system?

trying to understand your needs here since it seems reading the file system provides the best consistency guarantees

I think this proposal is mainly about the plugin system, where one can dynamic load some out-of-process plugins. But yeah, it seems reading from the file streamer output fulfill the same purpose, at least for the "async" use cases, effectively using the file system as a message queue (using inotify to get realtime events). The only thing is one can't "stopNodeOnErr" with this approach, for those use cases, one can create the in-process StreamingService instances.

yihuang avatar Oct 25 '22 01:10 yihuang

What is missing from the file system? trying to understand your needs here since it seems reading the file system provides the best consistency guarantees

I think this proposal is mainly about the plugin system, where one can dynamic load some out-of-process plugins. But yeah, it seems reading from the file streamer output fulfill the same purpose, at least for the "async" use cases, effectively using the file system as a message queue (using inotify to get realtime events). The only thing is one can't "stopNodeOnErr" with this approach, for those use cases, one can create the in-process StreamingService instances.

something like this: https://github.com/cosmos/cosmos-sdk/issues/13652

yihuang avatar Oct 26 '22 03:10 yihuang

@yihuang @peterbourgon @alexanderbez @kocubinski I pushed a new update to the ADR based on peer review and the proposed changes from #13516.

egaxhaj avatar Nov 01 '22 15:11 egaxhaj

quick update... working on addressing the latest round of questions.

egaxhaj avatar Nov 16 '22 18:11 egaxhaj

@yihuang

  • I have reservations about the plugin system though, it seems pretty complicated, and I think there's alternative that can be implemented outside of sdk: #13652

What are you seeing as complicated? The plugin system works over gRPC. the SDK uses gRPC so nothing new here. The plugin system goes one step further and makes it easer for you to implement plugins in Go. Take a look at the plugin-go examples. There are three steps involved.

  1. implement the service interface (ABCIListeneer). See file.go
  2. Compile the Go plugin
go build -o streaming/plugins/abci/v1/examples/plugin-go/file streaming/plugins/abci/v1/examples/plugin-go/file.go
  1. export it so it can be loaded
export COSMOS_SDK_ABCI_V1=.../streaming/plugins/abci/v1/examples/plugin-go/file

COSMOS_SDK is a prefix here and ABCI_V1 is the plugin name streaming.abci.plugin = "abci_v1"

For non Go plugins you implement the gRPC server. See plugin-python examples.

checkout egaxhaj/adr-038-grpc2 from provenance-io/cosmos-sdk and swap between plugin Impls

make test-sim-nondeterminism-streaming

# Go - file (writes to ~/)
export COSMOS_SDK_ABCI_V1=<path to sdk>/cosmos-sdk/streaming/plugins/abci/v1/examples/plugin-go/file

# python - file (writes to ~/)
export COSMOS_SDK_ABCI_V1=python3 <path to sdk>/cosmos-sdk/streaming/plugins/abci/v1/examples/plugin-python/file.py

# python - Kafka
export COSMOS_SDK_ABCI_V1=python3 <path to sdk>/cosmos-sdk/streaming/plugins/abci/v1/examples/plugin-python/kafka.py

Dependencies: python3 -m pip install confluent-kafka grpcio-tools grpcio-health-checking


  • The issues about multiple service registration seems not addressed yet: #13473 (comment)

Addressed.


  • How about I move the multistore and ABCIListener API changes into #13516, so we change the implementation together with the ADR, and leave this PR solely about the plugin system. Don't know how that affect the back-portability of #13516 though.

Let's keep them separate. Once #13516 is merged I can update the ADR as needed.

egaxhaj avatar Nov 18 '22 21:11 egaxhaj

The issues about multiple service registration seems not addressed yet: https://github.com/cosmos/cosmos-sdk/pull/13473#discussion_r1003455285 Addressed.

I'm confused, I see you updated the code to register multiple plugins, what I mean is the current StreamingService interface, it's ABCIListener now, and we can't register multiple ones of those in the new proposal, right? out-of-process plugins don't work for our case as I mentioned here, we still need the registration of custom ABCIListener(StreamingService).

yihuang avatar Nov 18 '22 23:11 yihuang

The issues about multiple service registration seems not addressed yet: #13473 (comment) Addressed.

I'm confused, I see you updated the code to register multiple plugins, what I mean is the current StreamingService interface, it's ABCIListener now, and we can't register multiple ones of those in the new proposal, right? out-of-process plugins don't work for our case as I mentioned here, we still need the registration of custom ABCIListener(StreamingService).

StreamingService went away because there's no longer a need for the additional methods it was providing. versiondb streaming service is not using them either. It only cares about ListenBeginBlock and ListenCommit

Change versiondb StreamingService to implement ABCIListener.

var baseapp.ABCIListener = &StreamingService{}

https://github.com/crypto-org-chain/cronos/blob/936cc905f810a4381584da68418dad33099f2279/versiondb/streaming_service.go#L15

Registration is under your control. Continue to do what you're already doing (don't use the plugin system registration loop in this proposal).

# app.go

...

streamers := cast.ToString(appOpts.Get("streaming.abci.plugin"))
if strings.Contains(streamers, "file") {
  ...
}

if strings.Contains(streamers, "versions") {
  service := versiondb.NewStreamingService(versionDB, exposeStoreKeys)
  bApp.SetStreamingService(service)
  ...
}

*I need to put back SetStreamingService()

[streaming]

[streaming.abci]

plugin="file,versions"
keys=[]

# in your case you ignore it.
stop-node-on-err=false

After looking at versiondb, I understand now why you asked to not run a goroutine when stop-node-on-err=false (ref: https://github.com/cosmos/cosmos-sdk/pull/13473#discussion_r1022204663). We can cover that use case by adding streaming.abci.async=true|false. When async=false and stop-node-on-err=false it will run without a goroutine.

edit: I'll make the updates to the ADR and post today.

edit: @yihuang I've updated the ADR to support your in-process service. You'll need to make the modifications I mentioned above but nothing major.

Registration post merge:

# app.go

pluginKey := fmt.Sprintf("%s.%s.%s", baseapp.StreamingTomlKey, baseapp.StreamingABCITomlKey, baseapp.StreamingABCIPluginTomlKey)
streamers := cast.ToString(appOpts.Get(pluginKey))
if strings.Contains(streamers, "versiondb") {
    ...
    service := versiondb.NewStreamingService(versionDB)
    bApp.SetStreamingService(service)
    bApp.cms.AddListeners(exposeStoreKeys)
    ...
}

egaxhaj avatar Nov 21 '22 18:11 egaxhaj

@alexanderbez @kocubinski @tac0turtle - I made updates that continues to support @yihuang in-process use case while still moving forward with the plugin system. Can you take another look at the ADR and if all looks good merge it?

egaxhaj avatar Nov 21 '22 22:11 egaxhaj

@tac0turtle This has three approval. Can we merge it and move onto reviewing the the implementation #14207?

egaxhaj avatar Dec 12 '22 21:12 egaxhaj

@yihuang are you OK with merging this PR? I noticed you're still requesting changes/review.

alexanderbez avatar Dec 12 '22 21:12 alexanderbez

@yihuang any objections to moving forward and merging this PR?

egaxhaj avatar Dec 16 '22 19:12 egaxhaj