linstor-server
Error reports rotation question
Hi, I'm not sure if this is expected behavior, so I'd rather report it.
Due to a containerd bug, some of our nodes were flooded with unterminated processes:
fork failed: Resource temporarily unavailable
linstor-satellite also generated many similar error reports in its logs:
# du -hs /var/log/linstor-satellite/
3.5G /var/log/linstor-satellite/
One of them:
ERROR REPORT 60926587-31C54-509538
============================================================
Application: LINBIT® LINSTOR
Module: Satellite
Version: 1.12.2
Build ID: 72244c7d40ba34808024a2c75da1d736dfd2e54e
Build time: 2021-05-04T12:53:07+00:00
Error time: 2021-07-03 06:20:52
Node: m5c6
Peer: 10.36.128.186:57816
============================================================
Reported error:
===============
Category: Exception
Class name: SSLException
Class canonical name: javax.net.ssl.SSLException
Generated at: Method 'createSSLException', Source file 'Alert.java', Line #133
Error message: closing inbound before receiving peer's close_notify
Error context:
I/O exception while attempting to receive data from the peer
Call backtrace:
Method Native Class:Line number
createSSLException N sun.security.ssl.Alert:133
createSSLException N sun.security.ssl.Alert:117
fatal N sun.security.ssl.TransportContext:336
fatal N sun.security.ssl.TransportContext:292
fatal N sun.security.ssl.TransportContext:283
closeInbound N sun.security.ssl.SSLEngineImpl:733
doHandshake N com.linbit.linstor.netcom.ssl.SslTcpConnectorHandshaker:118
read N com.linbit.linstor.netcom.ssl.SslTcpConnectorPeer:162
run N com.linbit.linstor.netcom.TcpConnectorService:543
run N java.lang.Thread:829
END OF ERROR REPORT.
Ah, my bad, this issue is not related to today's incident; these are just old error reports. I need to implement some log rotation in the kube-linstor project.
I'm not sure, though, how I can rotate the error-report database on the satellites:
# du -hs /var/log/linstor-satellite/error-report.mv.db
1.4G /var/log/linstor-satellite/error-report.mv.db
We currently still store the error reports both as text files and in a DB. If you also want to keep the DB as a backup, you can simply copy and compress it into an archive, and then use linstor error-reports delete with its various parameters to get rid of old error-report entries.
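The copy-and-compress step could look roughly like the sketch below. The function name `archive_error_db` and the archive directory are hypothetical, not part of LINSTOR, and the sketch assumes nothing is writing to the database file while it is copied (for example, the service is stopped), since otherwise the copy may be inconsistent.

```shell
#!/bin/sh
# Hypothetical helper (not part of LINSTOR): write a gzip-compressed,
# date-stamped copy of a file into an archive directory.
# Assumption: no process is writing to the file while it is copied.
archive_error_db() {
    src=$1        # file to back up, e.g. the error-report database
    dest_dir=$2   # archive directory (created if it does not exist)

    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1

    out="$dest_dir/$(basename "$src").$(date +%Y%m%d).gz"
    # gzip -c streams the compressed data to stdout and leaves the original untouched
    gzip -c "$src" > "$out" || return 1

    # Print the path of the archive that was written
    echo "$out"
}
```

It could be called as `archive_error_db /path/to/error-report.mv.db /var/backups/linstor` (both paths are placeholders), followed by `linstor error-reports delete` with whichever filters fit; `linstor error-reports delete --help` lists the available parameters.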
@rp- thank you for the information! Is there any opportunity to list and purge error reports on the satellites the same way as on controller?
There is no tool for this yet, but it should be possible with the H2 binary tools by simply executing SQL statements: https://www.h2database.com/html/download.html