SimpleLoadBalancer logging too many outstanding migrations
The SimpleLoadBalancer logs the "outstanding migrations" warning far too often. I don't know why this is happening, but any time there is an outstanding migration, it spams the Manager log with repeated messages.
:17,605 [balancer.SimpleLoadBalancer] WARN : Not balancing due to 8 outstanding migrations.
2022-02-17T12:44:17,605 [balancer.SimpleLoadBalancer] DEBUG: Sample up to 10 outstanding migrations: 30;d;c, 30
It appears that the balancer is getting called for each tablet, which would explain the many repetitive messages. This was supposed to be fixed by the new class ThrottledBalancerProblemReporter added in #1891, but it doesn't look like the reporting is working properly.
@brianloss any interest in taking a look at this issue?
I saw this TODO in the code. https://issues.apache.org/jira/browse/ACCUMULO-2938
@milleruntime - I wonder if the issue is that the Problem key is being removed from the WeakHashMap in ThrottledBalancerProblemReporter.problemReportTimes by the garbage collector, so ThrottledBalancerProblemReporter.reportProblem is always going to report the problem. Changing the map type, as a test, should confirm whether or not this is the case.
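To make the hypothesis above concrete, here is a minimal sketch of a time-throttled reporter whose backing map can be swapped. It is not the actual ThrottledBalancerProblemReporter code: the class name `ThrottledReporterSketch`, the 60-second interval, and the `countReports` helper are all assumptions made for illustration. Garbage collection is not deterministic, so the sketch does not try to force an eviction. It uses a fresh key on every call as a stand-in for an entry that has disappeared from the map, which shows that a lookup miss makes every call report.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only; names and interval are assumptions, not Accumulo code.
class ThrottledReporterSketch {
  // Hypothetical minimum time between two reports of the same problem.
  static final long MIN_INTERVAL_MS = 60_000;

  // Maps a problem key to the time it was last reported.
  private final Map<Object, Long> lastReportTimes;

  ThrottledReporterSketch(Map<Object, Long> lastReportTimes) {
    this.lastReportTimes = lastReportTimes;
  }

  // Runs the report only if this key has no entry or its entry is old enough.
  void reportProblem(Object problemKey, Runnable report) {
    long now = System.currentTimeMillis();
    Long last = lastReportTimes.get(problemKey);
    if (last == null || now - last > MIN_INTERVAL_MS) {
      report.run();
      lastReportTimes.put(problemKey, now);
    }
  }

  // Calls reportProblem `calls` times and returns how many reports actually ran.
  // With freshKeyEachCall = true, every lookup misses, which mimics an entry
  // that was dropped from the map between calls.
  static int countReports(Map<Object, Long> map, int calls, boolean freshKeyEachCall) {
    ThrottledReporterSketch reporter = new ThrottledReporterSketch(map);
    Object sharedKey = new Object(); // strongly referenced for the whole loop
    AtomicInteger reports = new AtomicInteger();
    for (int i = 0; i < calls; i++) {
      Object key = freshKeyEachCall ? new Object() : sharedKey;
      reporter.reportProblem(key, reports::incrementAndGet);
    }
    return reports.get();
  }

  public static void main(String[] args) {
    System.out.println("HashMap, same key:      " + countReports(new HashMap<>(), 100, false));
    System.out.println("WeakHashMap, same key:  " + countReports(new WeakHashMap<>(), 100, false));
    System.out.println("HashMap, fresh keys:    " + countReports(new HashMap<>(), 100, true));
  }
}
```

If the real reporter behaves like the fresh-key case, then either the entry is being collected, as suggested above, or the key object passed in is not the same (or an equal) instance between calls. Swapping WeakHashMap for a plain HashMap as a test would separate those two possibilities.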
I saw this TODO in the code. https://issues.apache.org/jira/browse/ACCUMULO-2938
Is it still applicable? If not, can just delete the TODO. Or at least, seems like a separate issue.
I think we could delete the TODO and close the JIRA issue. It is a little too generic and we haven't seen any leakage.
How does encryption handle keys - are they encrypted? If they are not, then anything that prints a key has the potential to reveal something that is sensitive. Not sure there is a way around it, but logs and commands that can reveal keys need to be treated as sensitive.
An example would be if someone (probably against best practices) used SSNs as a key: anything that printed that key would need to treat the output as containing personally identifiable information.
I think the key to this issue was that I had 2 tservers configured using Uno and the balancer went crazy trying to balance between the 2 with limited resources.
I've definitely seen this before. I wonder if it's possible for the balancer to never stabilize.
I was thinking that as well.