accumulo
DISCUSS add resource groups to mini accumulo
Posting this draft PR solely for discussion and feedback; it does not actually implement anything. The elasticity branch makes heavy use of servers organized into resource groups. MiniAccumulo's public API does not support resource groups, and there are comments about this in the elasticity branch here. The changes in this PR are an attempt to solve the following problems.
- Mini Accumulo public API does not support resource groups.
- Accumulo uses Mini Accumulo to test itself, and this internal testing code does not consistently support resource groups or make it easy to start and stop servers in resource groups.
The changes in the PR stub out external and internal API changes that attempt to address the problems above. However, there is no implementation of these stubbed-out APIs. The following is a guide to the changes so far.
- The changes to the external API were made in o.a.a.minicluster.*
- An example of code that uses the new public API is in MiniAccumuloResourceGroupsTest
- The changes to the internal API used for testing were made in ClusterControl
- Some examples of using the new internal API can be seen in a few ITs
One thing I do not like about these changes is that there is currently a runtime exception if server processes are placed outside of the default resource group. This seems a bit tricky, and I am not sure how to improve it.
One thing I do like about these changes is that the internal API may make it easier to start and stop processes in resource groups in a consistent way in ITs, so the way a process is started for a tserver RG is the same as for a compactor RG. I also like the model of specifying a new expected state and having processes started and/or stopped based on how the current state and the new state compare.
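To make the expected-state idea concrete, here is a minimal sketch of how a difference between the current and new state could be turned into start/stop counts per resource group. The class and method names (`ResourceGroupReconciler`, `plan`) are hypothetical and are not taken from this PR or from Accumulo; the sketch only computes a plan and does not start or stop anything.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: these names do not come from the PR or from Accumulo.
final class ResourceGroupReconciler {

  private ResourceGroupReconciler() {}

  /**
   * Computes, per resource group, how many processes to start (positive value)
   * or stop (negative value) to move from the current counts to the desired
   * counts. Groups that need no change are omitted from the result.
   */
  static Map<String,Integer> plan(Map<String,Integer> current, Map<String,Integer> desired) {
    Map<String,Integer> delta = new TreeMap<>();

    // Groups named in the desired state: start or stop the difference.
    for (Map.Entry<String,Integer> e : desired.entrySet()) {
      int diff = e.getValue() - current.getOrDefault(e.getKey(), 0);
      if (diff != 0) {
        delta.put(e.getKey(), diff);
      }
    }

    // Groups running now but absent from the desired state: stop all of them.
    for (Map.Entry<String,Integer> e : current.entrySet()) {
      if (!desired.containsKey(e.getKey()) && e.getValue() > 0) {
        delta.put(e.getKey(), -e.getValue());
      }
    }
    return delta;
  }
}
```

Because the plan is keyed only by resource group name and a count, the same logic would apply to tserver groups and compactor groups alike, which is the consistency described above.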
One problem area for this PR may be that tserver resource groups do not exist in main, only in elasticity. This is a similar problem to #4220.
Hoping that changes addressing the problems mentioned will eventually support testing multiple managers in a consistent way. This assumes that elasticity eventually supports multiple active managers.
Another goal of these changes is to deprecate Mini Accumulo APIs in 3.X so they can be removed in 4.0.0.
@keith-turner - I assume you have seen ClusterServerConfiguration in the elasticity branch? I didn't add it to the public API when I created it, but you could add an interface for it in elasticity, then backport it to main.
Yeah, I am familiar with that, and this conceptually builds on it, but moves from a mutable config model to an immutable config model. I had not looked at that specific class recently. After looking at it, I changed some of the type and method names in this PR, which solved a problem I was having with being able to specify config that did not make sense. Overall, I am still not sure about some of the names; initially my main focus was exploring this concept of immutable config, so I have not thought much about the names and other aspects.
Hoping that the model of having an immutable server config with getters and setters in ClusterControl will allow us to simplify how tests start and stop servers. Also hoping that we can remove most of the methods in ClusterControl related to starting and stopping servers (I added TODOs in this PR about removing them). Currently in elasticity, the code gets a reference to the mutable cluster server config, changes it, and then calls methods on ClusterControl to start/stop servers. Hoping this streamlines that a bit for our ITs.
In 3.1.0 we could update only the MiniAccumulo public API, without changing any internal test code. Later, in elasticity, we could optionally update ClusterControl to use the new immutable server config public API type for internal testing purposes.
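A rough illustration of the immutable-config-with-getter-and-setter shape, for discussion only. `ServerCounts` and `SketchControl` are hypothetical stand-ins and are not the types stubbed out in this PR; the setter here only records the new state, where a real implementation would start and stop processes to match it.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical immutable server config; not the type from this PR.
final class ServerCounts {
  private final Map<String,Integer> tservers;

  private ServerCounts(Map<String,Integer> tservers) {
    // Defensive copy so later changes by the caller cannot leak in.
    this.tservers = Collections.unmodifiableMap(new TreeMap<>(tservers));
  }

  static ServerCounts empty() {
    return new ServerCounts(new TreeMap<>());
  }

  // Returns a new instance; this instance is never modified.
  ServerCounts withTservers(String group, int count) {
    if (count < 0) {
      throw new IllegalArgumentException("count must be >= 0: " + count);
    }
    Map<String,Integer> copy = new TreeMap<>(tservers);
    copy.put(group, count);
    return new ServerCounts(copy);
  }

  int tservers(String group) {
    return tservers.getOrDefault(group, 0);
  }
}

// Hypothetical stand-in for a control object exposing a getter and a setter.
final class SketchControl {
  private ServerCounts current = ServerCounts.empty();

  ServerCounts getServerCounts() {
    return current;
  }

  void setServerCounts(ServerCounts desired) {
    // Assumption: a real implementation would start/stop processes here
    // until the running state matches 'desired'. This sketch only records it.
    current = desired;
  }
}
```

A test would then read the current config, derive a new one, and hand it back, for example `control.setServerCounts(control.getServerCounts().withTservers("rg1", 2))`, instead of mutating a shared config and then calling separate start/stop methods.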