influxdb
feat: Defer cleanup for log/index compactions, add debug log
I believe something is causing CurrentCompactionN() to stay greater than 0, which makes Partition.Wait() hang forever.
In profiles taken while this issue occurs, I consistently see a goroutine stuck in Partition.Wait():
-----------+-------------------------------------------------------
1 runtime.gopark
runtime.chanrecv
runtime.chanrecv1
github.com/influxdata/influxdb/tsdb/index/tsi1.(*Partition).Wait
github.com/influxdata/influxdb/tsdb/index/tsi1.(*Partition).Close
github.com/influxdata/influxdb/tsdb/index/tsi1.(*Index).close
github.com/influxdata/influxdb/tsdb/index/tsi1.(*Index).Close
github.com/influxdata/influxdb/tsdb.(*Shard).closeNoLock
github.com/influxdata/influxdb/tsdb.(*Shard).Close
github.com/influxdata/influxdb/tsdb.(*Store).DeleteShard
github.com/influxdata/influxdb/services/retention.(*Service).DeletionCheck.func3
github.com/influxdata/influxdb/services/retention.(*Service).DeletionCheck
github.com/influxdata/influxdb/services/retention.(*Service).run
github.com/influxdata/influxdb/services/retention.(*Service).Open.func1
-----------+-------------------------------------------------------
Deferring the compaction count cleanup inside the goroutines should help prevent the current compaction count from getting stuck above zero.
Modify currentCompactionN to be an atomic value (sync/atomic).
Adding a debug-level log within Partition.Wait() should aid in debugging.