SimpleLoadBalancer logging too many outstanding migrations

Open milleruntime opened this issue 3 years ago • 10 comments

The SimpleLoadBalancer logs too many messages about outstanding migrations. I don't know why this is happening, but any time there is an outstanding migration, it floods the Manager log with repeated messages.

:17,605 [balancer.SimpleLoadBalancer] WARN : Not balancing due to 8 outstanding migrations.
2022-02-17T12:44:17,605 [balancer.SimpleLoadBalancer] DEBUG: Sample up to 10 outstanding migrations: 30;d;c, 30

milleruntime avatar Feb 17 '22 17:02 milleruntime

It appears that the balancer is getting called for each tablet, which would explain the many repetitive messages. This was supposed to be fixed by the new class ThrottledBalancerProblemReporter added in #1891, but it doesn't look like the throttled reporting is working properly.

milleruntime avatar Feb 18 '22 12:02 milleruntime

@brianloss any interest in taking a look at this issue?

milleruntime avatar Feb 18 '22 12:02 milleruntime

I saw this TODO in the code. https://issues.apache.org/jira/browse/ACCUMULO-2938

milleruntime avatar Feb 18 '22 16:02 milleruntime

@milleruntime - I wonder if the issue is that the Problem key is being removed from the WeakHashMap in ThrottledBalancerProblemReporter.problemReportTimes by the garbage collector, so ThrottledBalancerProblemReporter.reportProblem is always going to report the problem. Changing the map type, as a test, should confirm whether or not this is the case.
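To illustrate the hypothesis above, here is a minimal sketch of a throttled reporter whose backing map can be swapped between `HashMap` and `WeakHashMap`. The class `ThrottleSketch` and its methods are hypothetical names made up for this example; this is not Accumulo's actual ThrottledBalancerProblemReporter code. The point is only that if nothing else holds a strong reference to the key, a `WeakHashMap` entry may be cleared by the garbage collector, and the next report then looks like a first report and is logged again.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical stand-in for a throttled problem reporter (not Accumulo's class).
final class ThrottleSketch {
  private final Map<Object, Long> lastReported;
  private final long minIntervalMillis;

  ThrottleSketch(Map<Object, Long> lastReported, long minIntervalMillis) {
    this.lastReported = lastReported;
    this.minIntervalMillis = minIntervalMillis;
  }

  // Returns true if the caller should log now, false if the message is suppressed.
  boolean shouldReport(Object problemKey, long nowMillis) {
    Long last = lastReported.get(problemKey);
    if (last != null && nowMillis - last < minIntervalMillis) {
      return false; // reported recently, stay quiet
    }
    lastReported.put(problemKey, nowMillis);
    return true;
  }

  int trackedKeys() {
    return lastReported.size();
  }

  public static void main(String[] args) {
    ThrottleSketch strong = new ThrottleSketch(new HashMap<>(), 60_000L);
    ThrottleSketch weak = new ThrottleSketch(new WeakHashMap<>(), 60_000L);

    // new String(...) avoids an interned literal, which would stay strongly reachable.
    strong.shouldReport(new String("outstanding-migrations"), 0L);
    weak.shouldReport(new String("outstanding-migrations"), 0L);

    // System.gc() is only a hint; whether the weak entry is cleared is up to the JVM.
    System.gc();

    System.out.println("HashMap entries:     " + strong.trackedKeys());
    System.out.println("WeakHashMap entries: " + weak.trackedKeys());
  }
}
```

If the `WeakHashMap` count drops to 0 after the GC hint while the `HashMap` count stays at 1, that matches the behavior suspected here: the throttle forgets the last report time and every call reports again. Swapping the map type in a test build, as suggested, checks the same thing against the real class.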

dlmarion avatar Mar 25 '22 13:03 dlmarion

> I saw this TODO in the code. https://issues.apache.org/jira/browse/ACCUMULO-2938

Is it still applicable? If not, can just delete the TODO. Or at least, seems like a separate issue.

ctubbsii avatar Mar 29 '22 16:03 ctubbsii

> > I saw this TODO in the code. https://issues.apache.org/jira/browse/ACCUMULO-2938
>
> Is it still applicable? If not, can just delete the TODO. Or at least, seems like a separate issue.

I think we could delete the TODO and close the JIRA issue. It is a little too generic and we haven't seen any leakage.

milleruntime avatar Mar 29 '22 16:03 milleruntime

How does encryption handle keys - are they encrypted? If they are not, then anything that prints a key has the potential to reveal something that is sensitive. Not sure there is a way around it, but logs and commands that can reveal keys need to be treated as sensitive.

An example would be if someone (probably against best practices) used SSNs as a key: anything that printed that key would need to treat the output as containing personally identifiable information.

EdColeman avatar Mar 29 '22 18:03 EdColeman

I think the key to this issue was that I had 2 tservers configured using Uno and the balancer went crazy trying to balance between the 2 with limited resources.

milleruntime avatar May 04 '22 15:05 milleruntime

I've definitely seen this before. I wonder if it's possible for the balancer to never stabilize.

ctubbsii avatar May 09 '22 19:05 ctubbsii

> I've definitely seen this before. I wonder if it's possible for the balancer to never stabilize.

I was thinking that as well.

milleruntime avatar May 10 '22 10:05 milleruntime