hive-driver

Are there any benchmarks from load tests done on this driver?

Open: ntallapa12 opened this issue 4 years ago • 1 comment

I have been using this driver for a couple of weeks now and am really happy with its overall stability and performance. I am curious to know whether any load tests are done as part of your test suites.

A few things I am looking at:

  1. How long can a single hiveClient live? Is it OK to create one, leave it open for several days, and keep opening and closing sessions and operations whenever we want?

  2. How many clients can we open in parallel?

  3. How many sessions can we open in parallel on a single hiveClient?

  4. Will we need to open another client because of the above?

  5. How many operations can be done on a single session? Can operations run in parallel, or do they have to run serially, in order?

I know this also depends on the HS2 server capacity, but I am looking for benchmarks on some standard Docker instance. I would appreciate your input on this.

Thank you.

ntallapa12 · Mar 30 '20 03:03

Hi @ntallapa12, I actually have not done any load testing, so it is difficult to answer all your questions, but here is what I can say:

  1. Each instance of HiveClient is a TCP connection (in the case of binary transport), so as long as the server is up, the connection will stay alive.

  2. It depends on how many TCP connections can be open at one time; I am not sure whether there is a restriction. You can probably find something in the documentation.

  3. I think there is a HiveServer2 config option for sessions, but I cannot remember which one. Each time you open a session, the driver sends a request to the server, which returns a sessionHandle that is then used to run commands (https://github.com/lenchv/hive-driver/blob/master/lib/HiveSession.ts). In other words, you can create as many sessions as the server allows.

  4. If you have a high load, you should probably keep a few connections alive, but I have not run into this myself. As you can imagine, it depends on the capacity of the local network between the web server and the Hive server.

  5. I think you can run them in parallel: there is a flag { runAsync: true } that tells the server to execute an operation asynchronously. Operations work like sessions: when you run a command, the driver sends a request that returns an operationHandle, which is used to retrieve the result from the server.
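To make point 5 a bit more concrete, here is a minimal sketch of running several statements on one session in parallel while capping how many are in flight at once. It has not been load tested. `SessionLike`, `OperationLike`, `runInParallel`, and `FakeSession` are hypothetical names made up for this example; the two interfaces are simplified stand-ins, not the driver's real type definitions, and only the `{ runAsync: true }` flag comes from the discussion above.

```typescript
// Hypothetical, simplified stand-ins for a session and an operation.
// These are assumptions for illustration, not hive-driver's actual types.
interface OperationLike {
  fetchResult(): Promise<string[]>;
}

interface SessionLike {
  executeStatement(
    statement: string,
    options: { runAsync: boolean },
  ): Promise<OperationLike>;
}

// Run all statements on one session, with at most `limit` in flight at a time.
// Results are returned in the same order as the input statements.
async function runInParallel(
  session: SessionLike,
  statements: string[],
  limit: number,
): Promise<string[][]> {
  const results: string[][] = new Array(statements.length);
  let next = 0;

  // Each worker repeatedly claims the next unclaimed statement index.
  // JavaScript runs this code on a single thread, so `next++` cannot race.
  async function worker(): Promise<void> {
    while (next < statements.length) {
      const index = next++;
      const operation = await session.executeStatement(statements[index], {
        runAsync: true,
      });
      results[index] = await operation.fetchResult();
    }
  }

  const workerCount = Math.max(1, Math.min(limit, statements.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// In-memory fake used only to exercise the helper; it records peak concurrency.
class FakeSession implements SessionLike {
  inFlight = 0;
  peak = 0;

  async executeStatement(
    statement: string,
    _options: { runAsync: boolean },
  ): Promise<OperationLike> {
    this.inFlight++;
    this.peak = Math.max(this.peak, this.inFlight);
    return {
      fetchResult: async () => {
        await Promise.resolve(); // yield once to simulate asynchronous work
        this.inFlight--;
        return [`result of: ${statement}`];
      },
    };
  }
}
```

With a real session in place of `FakeSession` (assuming its methods can be adapted to this simplified shape), the `limit` argument is the knob to tune against whatever the server can handle; a reasonable value would have to come from measurements on your own HS2 instance.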

Thank you for such a good question. I will probably put it in the backlog, do more detailed research later, and write the corresponding doc.

lenchv · Mar 30 '20 12:03