skywalking-swck icon indicating copy to clipboard operation
skywalking-swck copied to clipboard

[Feature] Add BanyanDB CRD & Controllers

Open SzyWilliam opened this issue 2 years ago • 8 comments

Original Issue

https://github.com/apache/skywalking/issues/9396

What changes brought in this PR

In this PR, I add CRD for BanyanDB.

  1. Use StatefulSet as BanyanDB's underlying controller, so that we can benefit from its stable network identifiers or storage, plus future extensions.
  2. Use Service as BanyanDB's networking identifier for internal access & communication.
  3. Use LocalPV as BanyanDB's default storage choice.

SzyWilliam avatar Jul 29 '22 03:07 SzyWilliam

I can't see an e2e to verify this. SWCK should be verified whether it really could deploy.

wu-sheng avatar Jul 29 '22 04:07 wu-sheng

@wu-sheng @lujiajing1126 This is a draft pr for the osapp project. The student @SzyWilliam should still be coding, but I'm not familiar with some features of Banyandb, so could you please confirm whether the Spec field can meet all the current banyandb requirements?

dashanji avatar Jul 29 '22 04:07 dashanji

Then this PR should be labeled aa draft or use issue or discussion panel to talk. PR could trigger mail notifications for reviewers.

wu-sheng avatar Jul 29 '22 05:07 wu-sheng

BTW, I think you ping a wrong people.

wu-sheng avatar Jul 29 '22 05:07 wu-sheng

@wu-sheng @lujiajing1126 This is a draft pr for the osapp project. The student @SzyWilliam should still be coding, but I'm not familiar with some features of Banyandb, so could you please confirm whether the Spec field can meet all the current banyandb requirements?

As I understand, since the current version of BanyanDB only has a single node, Deployment can meet our requirement.

The operator for VictoriaMetrics and its VMSingle CRD can be a good reference for us.

lujiajing1126 avatar Jul 29 '22 12:07 lujiajing1126

As I understand, since the current version of BanyanDB only has a single node, Deployment can meet our requirement. @lujiajing1126 Thanks for your reply! If we consider future extensions, a StatefulSet will be great for a BanyanDB cluster since it provides bindings on pv & network identifier. Why don't we use StatefulSet now?

SzyWilliam avatar Jul 29 '22 12:07 SzyWilliam

@lujiajing1126 Thanks for your reply! If we consider future extensions, a StatefulSet will be great for a BanyanDB cluster since it provides bindings on pv & network identifier. Why don't we use StatefulSet now?

Actually, I'm neutral about the choice. I would just say Deployment is definitely much simpler and thus has lower cost. The cluster version is on our roadmap but with very few details. It is not possible for us now to cover all what we need in the future.

lujiajing1126 avatar Jul 29 '22 12:07 lujiajing1126

Deployment is better here. The cluster mode will know each node's roles, we don't leverage statefulset to get stable and ordered network endpoints and pv.

hanahmily avatar Jul 29 '22 13:07 hanahmily

@dashanji @hanahmily @lujiajing1126 Hi, I've completed a minimal working BanyanDB CRD&Controller for SWCK. It passes e2e tests. Please take a look at this, thanks! There are still some advanced features to be added, like TLS config for banyandb. I plan to add these features in future PRs.

SzyWilliam avatar Sep 07 '22 06:09 SzyWilliam

What is the storage volume mount policy?

@hanahmily When we use this on the demo deployment, we need to make sure storage volume could mounted back when pod reboot(this happens time to time on our demo GCP.

wu-sheng avatar Sep 07 '22 06:09 wu-sheng

What is the storage volume mount policy?

@hanahmily When we use this on the demo deployment, we need to make sure storage volume could mounted back when pod reboot(this happens time to time on our demo GCP.

At present, we provide the PVC API for users. If users create a Persistent Volume, the PVC will find the PV and the data will be stored there.

dashanji avatar Sep 10 '22 06:09 dashanji

@dashanji Thanks for the reviews. @hanahmily I am trying to write e2e tests for swck banyandb. Could you please tell me how I can query banyandb pod using client tools like cli?

SzyWilliam avatar Sep 13 '22 05:09 SzyWilliam

@dashanji Thanks for the reviews. @hanahmily I am trying to write e2e tests for swck banyandb. Could you please tell me how I can query banyandb pod using client tools like cli?

bydbctl is WIP. you could query data from OAP instead of the banyandb server as https://github.com/apache/skywalking-banyandb/tree/main/test/e2e-v2 .

hanahmily avatar Sep 13 '22 05:09 hanahmily

@SzyWilliam LGTM, thanks for your patience and hard work! Hope you will continue to improve it in the next PRs.

dashanji avatar Sep 19 '22 15:09 dashanji

@dashanji Thanks for your reviews! I've completed the full banyandb crd & controller implementation with corresponding e2e tests. @wu-sheng @hanahmily @lujiajing1126 Please take a look at it~🫰

SzyWilliam avatar Sep 20 '22 07:09 SzyWilliam

@hanahmily Thanks for the reviews! I've exposed HTTP port and added an ingress resource to route HTTP requests, PTAL. I have a question: according to banyandb github page, we do not support http server for now. Does banyandb support HTTP server now? If so, how I can query data using HTTP?

SzyWilliam avatar Sep 21 '22 05:09 SzyWilliam

we do not support http server for now. Does banyandb support HTTP server now?

We supported the HTTP server in the main branch. It will be released in v0.2.0.

If so, how I can query data using HTTP?

bydbctl will take it. The primitive commands will be released in v0.2.0 with the HTTP server.

And you could access the HTTP endpoint directly. The API specifications are at https://github.com/apache/skywalking-banyandb/blob/main/api/proto/openapi

hanahmily avatar Sep 21 '22 06:09 hanahmily

@hanahmily Thanks for the reply and providing framework on Service. I'll use separate gRPC Service and HTTP service. I plan to use bydbctl to verify banyandb data storage in E2E Tests after 0.2.0 release, so I'll just add a todo for now.

SzyWilliam avatar Sep 21 '22 06:09 SzyWilliam

I've made changes on code:

  1. separate grpc & http into two service resources, expose 17912/17913 respectively.
  2. remove unused ports 2121 6060. @hanahmily Please take a look at it again, thanks!

SzyWilliam avatar Sep 22 '22 09:09 SzyWilliam

It is great to see this PR is getting ready after months of work. Congrats.

wu-sheng avatar Sep 23 '22 01:09 wu-sheng

@lujiajing1126 We need your confirmation here. Please recheck.

wu-sheng avatar Sep 23 '22 04:09 wu-sheng

@lujiajing1126 Thanks for the careful review! I've made changes on code, please take a look again.

SzyWilliam avatar Sep 24 '22 01:09 SzyWilliam