Accumulo Monitor cannot handle the truth
A system with 200 tservers and an incorrect table.context set will start spamming the monitor with so many messages that we cannot keep the monitor up and running; it runs out of heap very quickly, even with 16G of memory. This is with a 2.1 Accumulo instance. With the 1.x monitor we had no issues handling this volume of messages.
A jmap -histo taken just before it falls over shows character arrays making up the majority of the heap, followed by ArrayTrie$Node arrays and byte arrays. The top 5 entries were:
instances : bytes : classname
466527 : 15277159512 : [C (java.base@11.0.15)
155314 : 638615624 : [Lorg.eclipse.jetty.util.ArrayTrie$Node;
988606 : 119096256 : [B (java.base@11.0.15)
1692806 : 54169792 : java.util.concurrent.locks.ReentrantLock$NonfairSync (java.base@11.0.15)
1848284 : 44358816 : java.lang.String (java.base@11.0.15)
I have to say I was surprised to see so many of the threads using glassfish in the jstack. That might be a somewhat expensive servlet infrastructure to use in the monitor.
Java 11.0.15, CentOS 7.5
Start up a system with 150+ tservers, a monitor with 16G of RAM, and lots of data across many tables. Set the table context to something bogus and watch the monitor quickly run out of memory. I am sure we could create a smaller system to simulate this.
I believe glassfish is in the stack for Jersey. As the reference implementation for Jersey, the glassfish implementation seemed to be the best supported and easiest to use. That was added to support the REST/AJAX based site that provides a more responsive user experience when running with many servers (no longer requiring full page reloads, and better retaining client-side state, like table sorting). It's definitely far better than the manual servlets we used to maintain before. I don't think we'll want to go backwards and get rid of that, but there might be some trade-offs we're making here where we can afford to improve, or maybe there's an implementation with a smaller footprint.
If I understand you correctly, this pertains to the optional feature that forwards warning/error messages to the monitor? For some of that, our hands were tied in the implementation, because we had to switch to an implementation that sends messages via the REST endpoint in order to avoid severe CVEs in log4j's socket receiver.
I'm not sure exactly how you have logging configured, but by default, our config ships using log4j2's async loggers. There may be some configuration to tweak their send behavior to throttle them, so they don't overwhelm the monitor: https://logging.apache.org/log4j/2.x/manual/async.html
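For instance (a rough, untested sketch; the class name below is just illustrative and I haven't tried this against the monitor appender specifically), the async loggers let you plug in a custom AsyncQueueFullPolicy so that events are shed instead of queued when the buffer backs up. The built-in Discard policy (log4j2.asyncQueueFullPolicy=Discard together with log4j2.discardThreshold) may already be enough for your case:

```java
import org.apache.logging.log4j.Level;
import org.apache.logging.log4j.core.async.AsyncQueueFullPolicy;
import org.apache.logging.log4j.core.async.EventRoute;

/**
 * Sketch only: when the async logger's ring buffer backs up (for example,
 * because the appender forwarding to the monitor can't keep pace), shed
 * everything below ERROR instead of blocking application threads.
 * A custom policy like this would be installed with
 * -Dlog4j2.asyncQueueFullPolicy=<fully qualified class name>.
 */
public class DropWhenBackedUpPolicy implements AsyncQueueFullPolicy {

  @Override
  public EventRoute getRoute(long backgroundThreadId, Level level) {
    if (level.isMoreSpecificThan(Level.ERROR)) {
      // Keep ERROR and above, but log them in the calling thread rather
      // than waiting for space in the full queue.
      return EventRoute.SYNCHRONOUS;
    }
    // Drop everything else while the queue is full.
    return EventRoute.DISCARD;
  }
}
```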
For example, you have the option to configure log4j2 filters to deduplicate repeated messages or to control bursts (you can also write a custom filter for your specific needs). See https://logging.apache.org/log4j/2.x/manual/filters.html
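As a rough sketch of the custom-filter route (the class name and five-second window are arbitrary, and this isn't something we ship), a filter along these lines would collapse identical messages so only the first occurrence in a window reaches whatever appender it's attached to:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.logging.log4j.core.Filter;
import org.apache.logging.log4j.core.LogEvent;
import org.apache.logging.log4j.core.config.Node;
import org.apache.logging.log4j.core.config.plugins.Plugin;
import org.apache.logging.log4j.core.config.plugins.PluginFactory;
import org.apache.logging.log4j.core.filter.AbstractFilter;

/**
 * Illustrative-only filter: suppress a message if identical text was already
 * logged within the last few seconds, so that a repeated error (e.g. a bad
 * table.context reported for every tablet) collapses to one event per window.
 * The plugin must be discoverable by log4j2's plugin system to be usable
 * from a configuration file.
 */
@Plugin(name = "DedupFilter", category = Node.CATEGORY, elementType = Filter.ELEMENT_TYPE)
public class DedupFilter extends AbstractFilter {

  private static final long WINDOW_MS = 5_000L;

  // A real filter would bound or expire this map; kept simple for the sketch.
  private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

  private DedupFilter() {
    super(Result.NEUTRAL, Result.DENY);
  }

  @Override
  public Result filter(LogEvent event) {
    String key = event.getMessage().getFormattedMessage();
    long now = System.currentTimeMillis();
    boolean[] allow = new boolean[1];
    lastSeen.compute(key, (k, prev) -> {
      if (prev == null || now - prev > WINDOW_MS) {
        allow[0] = true;
        return now;   // first occurrence in a new window: let it through
      }
      return prev;    // still inside the window: keep the old timestamp
    });
    return allow[0] ? Result.NEUTRAL : Result.DENY;
  }

  @PluginFactory
  public static DedupFilter createFilter() {
    return new DedupFilter();
  }
}
```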
You can also choose to completely turn off the monitor appender, and rely on some other, more robust log collection/analysis solution that is suitable for a large system. That's probably what I would recommend anyway, because the monitor will only keep a small window of recent warnings/errors, and on a large system, important things can easily scroll out and be missed.
Given that a lot of this is based on the environment, and is tweakable by users, I don't think I would characterize this as a bug (it has the flavor of being susceptible to a denial-of-service attack, rather than an explicit bug). There seem to be many options on the log-sending side, but I'm not sure what we can do on the receiving side, inside the monitor, to avoid being overwhelmed. If there's something specific we could do there, I'd be interested, but I'm not sure what options are available to us on the server side. We definitely can't go back to the socket appender from log4j 1.2, since that's no longer available to us. We could run a separate Thrift service on the monitor, but that comes with its own problems/trade-offs, and I'm not sure it would ultimately fix the problem, since that too could be overwhelmed.
So we had another situation with #3909 that could be considered partially a result of this issue. In that case it was not only the monitor that could not handle the truth, but the tservers as well. The monitor fell over, and its node became inaccessible because of the number of IRQ requests hitting it from the tservers. Many tservers fell over from running out of memory, and many others had various threads die (out of memory) but remained running. In the end, the entire system had to be brought down hard.
So, probably the best thing that can be done for this ticket is not to modify the monitor, but rather to focus on the number of messages being sent by the tservers. Even if there is a loop that is spamming messages (on the order of one per millisecond in that case), we should not be trying to send every single message to the monitor. There needs to be some deduplication of messages done on the tserver side if possible, and perhaps the size of the queue of messages being sent to the logger needs to be restricted. I don't know if we can do this given that we are using standard logging infrastructure such as log4j, but it should be investigated.
This kind of thing can definitely be done... I've already linked to the log4j docs on filters above, which are capable of customizing the logging in the way you're interested in. If an existing filter isn't suitable, the APIs are very simple, and it should be easy to create a custom filter that keeps logs from flooding your systems based on your specific requirements.
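To give a feel for how small that API is (again, just an illustrative sketch; the class name and the 50-per-second cap are arbitrary), a burst-style cap on whatever appender forwards to the monitor could look roughly like the following. The built-in BurstFilter already provides similar behavior with more options, so check it first:

```java
import java.util.concurrent.atomic.AtomicLong;

import org.apache.logging.log4j.core.Filter;
import org.apache.logging.log4j.core.LogEvent;
import org.apache.logging.log4j.core.config.Node;
import org.apache.logging.log4j.core.config.plugins.Plugin;
import org.apache.logging.log4j.core.config.plugins.PluginFactory;
import org.apache.logging.log4j.core.filter.AbstractFilter;

/**
 * Illustrative rate cap: allow at most MAX_PER_SECOND events per second
 * through the appender it is attached to and drop the rest, so a tight
 * error loop on a tserver can't flood the monitor. The windowing here is
 * approximate; it's only meant to show the shape of the API.
 */
@Plugin(name = "MonitorRateCapFilter", category = Node.CATEGORY, elementType = Filter.ELEMENT_TYPE)
public class MonitorRateCapFilter extends AbstractFilter {

  private static final int MAX_PER_SECOND = 50;

  private final AtomicLong windowStart = new AtomicLong(System.currentTimeMillis());
  private final AtomicLong countInWindow = new AtomicLong();

  private MonitorRateCapFilter() {
    super(Result.NEUTRAL, Result.DENY);
  }

  @Override
  public Result filter(LogEvent event) {
    long now = System.currentTimeMillis();
    long start = windowStart.get();
    if (now - start >= 1000 && windowStart.compareAndSet(start, now)) {
      countInWindow.set(0);  // roll over to a new one-second window
    }
    // Under the cap: pass the event along; over it: drop until the window rolls.
    return countInWindow.incrementAndGet() <= MAX_PER_SECOND ? Result.NEUTRAL : Result.DENY;
  }

  @PluginFactory
  public static MonitorRateCapFilter createFilter() {
    return new MonitorRateCapFilter();
  }
}
```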