pulsar icon indicating copy to clipboard operation
pulsar copied to clipboard

Broker OOM when the bookie is slowly to respond and AQ is smaller than WQ

Open zymap opened this issue 2 years ago • 9 comments

Describe the bug When there has a bookie is slowly to respond the add entry request, and the AQ is smaller than WQ, broker will OOM.

To Reproduce

  1. Change the bookkeeper server code to make the add request to delay some times to handle
  2. Build the bookkeeper code and replace the version in Pulsar
  3. Set the managedLedgerDefaultWriteQuorum bigger than managedLedgerDefaultAckQuorum in the broker.conf
  4. Setup the cluster
  5. Produce message with pulsar-perf
  6. See the broker OOM

Expected behavior Broker should not OOM

Screenshots If applicable, add screenshots to help explain your problem.

Desktop (please complete the following information):

  • macOS 12.1

Additional context Broker version: master BookKeeper version: branch-4.14 latest Ensemble: 2, AQ: 1, WQ:2

broker logs:

2022-03-25T12:29:35,600+0800 [pulsar-io-4-3] INFO  com.scurrilous.circe.checksum.Crc32cIntChecksum - SSE4.2 CRC32C provider initialized
2022-03-25T12:29:35,684+0800 [pulsar-io-4-1] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0x8611e44b, L:/192.168.3.2:50153 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,684+0800 [pulsar-io-4-2] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0x3e9bfafd, L:/192.168.3.2:50150 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,685+0800 [pulsar-io-4-3] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0x9cc7b134, L:/192.168.3.2:50157 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,684+0800 [pulsar-io-4-21] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,684+0800 [pulsar-io-4-9] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0x5d909697, L:/192.168.3.2:50162 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,685+0800 [pulsar-io-4-19] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0xb9b91a40, L:/192.168.3.2:50158 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,685+0800 [pulsar-io-4-13] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0xf9a8b922, L:/192.168.3.2:50165 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,684+0800 [pulsar-io-4-10] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0xcc74cde3, L:/192.168.3.2:50164 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,686+0800 [pulsar-io-4-24] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0x9e0b4117, L:/192.168.3.2:50156 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,684+0800 [pulsar-io-4-7] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0xfe1340cf, L:/192.168.3.2:50155 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,684+0800 [pulsar-io-4-15] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0x042ab445, L:/192.168.3.2:50161 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,686+0800 [pulsar-io-4-17] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0x835f7112, L:/192.168.3.2:50154 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,685+0800 [pulsar-io-4-16] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0x3f7eeb8a, L:/192.168.3.2:50147 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,687+0800 [pulsar-io-4-3] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x9cc7b134, L:/192.168.3.2:50157 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,687+0800 [pulsar-io-4-17] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x835f7112, L:/192.168.3.2:50154 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,687+0800 [pulsar-io-4-16] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x3f7eeb8a, L:/192.168.3.2:50147 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,687+0800 [pulsar-io-4-4] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0x2ee9ef18, L:/192.168.3.2:50159 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,687+0800 [pulsar-io-4-7] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xfe1340cf, L:/192.168.3.2:50155 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,685+0800 [pulsar-io-4-12] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0x04f86f55, L:/192.168.3.2:50169 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,687+0800 [pulsar-io-4-4] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x2ee9ef18, L:/192.168.3.2:50159 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,687+0800 [pulsar-io-4-15] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x042ab445, L:/192.168.3.2:50161 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,687+0800 [pulsar-io-4-10] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xcc74cde3, L:/192.168.3.2:50164 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,684+0800 [pulsar-io-4-6] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0xd81f675b, L:/192.168.3.2:50168 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,687+0800 [pulsar-io-4-11] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0xb1bfb710, L:/192.168.3.2:50148 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,688+0800 [pulsar-io-4-2] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x3e9bfafd, L:/192.168.3.2:50150 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,688+0800 [pulsar-io-4-6] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xd81f675b, L:/192.168.3.2:50168 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,687+0800 [pulsar-io-4-5] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0x6659642b, L:/192.168.3.2:50167 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,688+0800 [pulsar-io-4-9] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x5d909697, L:/192.168.3.2:50162 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,688+0800 [pulsar-io-4-23] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0x4640bd95, L:/192.168.3.2:50151 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,688+0800 [pulsar-io-4-23] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x4640bd95, L:/192.168.3.2:50151 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,687+0800 [pulsar-io-4-24] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x9e0b4117, L:/192.168.3.2:50156 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,689+0800 [pulsar-io-4-1] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x8611e44b, L:/192.168.3.2:50153 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,687+0800 [pulsar-io-4-18] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0x155b4478, L:/192.168.3.2:50166 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,689+0800 [pulsar-io-4-18] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x155b4478, L:/192.168.3.2:50166 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,689+0800 [pulsar-io-4-7] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0x8c527e39, L:/192.168.3.2:50175 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,685+0800 [pulsar-io-4-8] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0xeacf4e8e, L:/192.168.3.2:50170 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,689+0800 [pulsar-io-4-7] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x8c527e39, L:/192.168.3.2:50175 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,689+0800 [pulsar-io-4-6] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0x6af6cc28, L:/192.168.3.2:50176 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,690+0800 [pulsar-io-4-6] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x6af6cc28, L:/192.168.3.2:50176 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,690+0800 [pulsar-io-4-8] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xeacf4e8e, L:/192.168.3.2:50170 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,690+0800 [pulsar-io-4-8] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0xb5917759, L:/192.168.3.2:50172 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,687+0800 [pulsar-io-4-13] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xf9a8b922, L:/192.168.3.2:50165 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,687+0800 [pulsar-io-4-12] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x04f86f55, L:/192.168.3.2:50169 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,691+0800 [pulsar-io-4-10] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0x09c34093, L:/192.168.3.2:50173 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,686+0800 [pulsar-io-4-22] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0xde543d90, L:/192.168.3.2:50149 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,691+0800 [pulsar-io-4-10] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x09c34093, L:/192.168.3.2:50173 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,691+0800 [pulsar-io-4-22] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xde543d90, L:/192.168.3.2:50149 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,690+0800 [pulsar-io-4-8] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xb5917759, L:/192.168.3.2:50172 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,691+0800 [pulsar-io-4-12] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0xfc4f4711, L:/192.168.3.2:50177 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,692+0800 [pulsar-io-4-12] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xfc4f4711, L:/192.168.3.2:50177 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,689+0800 [pulsar-io-4-9] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0xbd3c79ca, L:/192.168.3.2:50174 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,688+0800 [pulsar-io-4-14] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0xc40c5fba, L:/192.168.3.2:50160 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,692+0800 [pulsar-io-4-14] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xc40c5fba, L:/192.168.3.2:50160 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,692+0800 [pulsar-io-4-9] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xbd3c79ca, L:/192.168.3.2:50174 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,688+0800 [pulsar-io-4-19] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xb9b91a40, L:/192.168.3.2:50158 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,688+0800 [pulsar-io-4-20] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0xfbb35d39, L:/192.168.3.2:50163 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,688+0800 [pulsar-io-4-11] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xb1bfb710, L:/192.168.3.2:50148 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,694+0800 [pulsar-io-4-20] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xfbb35d39, L:/192.168.3.2:50163 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,695+0800 [pulsar-io-4-11] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0x518a2724, L:/192.168.3.2:50171 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,695+0800 [pulsar-io-4-11] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x518a2724, L:/192.168.3.2:50171 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,697+0800 [pulsar-io-4-5] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x6659642b, L:/192.168.3.2:50167 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,697+0800 [pulsar-io-4-21] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,764+0800 [pulsar-io-4-5] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0x60410eac, L:/192.168.3.2:50178 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,764+0800 [pulsar-io-4-5] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x60410eac, L:/192.168.3.2:50178 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:46,587+0800 [pulsar-load-manager-1-1] INFO  org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - Writing local data to metadata store because maximum change 13.203266263008118% exceeded threshold 10%; time since last report written is 14.999 seconds
2022-03-25T12:29:46,593+0800 [main-EventThread] INFO  org.apache.pulsar.metadata.coordination.impl.ResourceLockImpl - Acquired resource lock on /loadbalance/brokers/192.168.3.2:8080
2022-03-25T12:30:01,587+0800 [pulsar-load-manager-1-1] INFO  org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - Writing local data to metadata store because maximum change 12.288329005241394% exceeded threshold 10%; time since last report written is 14.999 seconds
2022-03-25T12:30:01,604+0800 [main-EventThread] INFO  org.apache.pulsar.metadata.coordination.impl.ResourceLockImpl - Acquired resource lock on /loadbalance/brokers/192.168.3.2:8080
2022-03-25T12:30:06,531+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 583 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:06,538+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN  org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to find 1 bookies : excludeBookies [<Bookie:192.168.3.2:3182>, <Bookie:192.168.3.2:3181>], allBookies [<Bookie:192.168.3.2:3181>, <Bookie:192.168.3.2:3182>].
2022-03-25T12:30:06,539+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN  org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to choose a bookie: excluded [<Bookie:192.168.3.2:3182>, <Bookie:192.168.3.2:3181>], fallback to choose bookie randomly from the cluster.
2022-03-25T12:30:06,539+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN  org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to find 1 bookies : excludeBookies [<Bookie:192.168.3.2:3182>, <Bookie:192.168.3.2:3181>], allBookies [<Bookie:192.168.3.2:3182>, <Bookie:192.168.3.2:3181>].
2022-03-25T12:30:06,539+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN  org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to find 1 bookies : excludeBookies [<Bookie:192.168.3.2:3182>, <Bookie:192.168.3.2:3181>], allBookies [<Bookie:192.168.3.2:3181>, <Bookie:192.168.3.2:3182>].
2022-03-25T12:30:06,539+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN  org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to choose a bookie: excluded [<Bookie:192.168.3.2:3182>, <Bookie:192.168.3.2:3181>], fallback to choose bookie randomly from the cluster.
2022-03-25T12:30:06,539+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN  org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to find 1 bookies : excludeBookies [<Bookie:192.168.3.2:3182>, <Bookie:192.168.3.2:3181>], allBookies [<Bookie:192.168.3.2:3182>, <Bookie:192.168.3.2:3181>].
2022-03-25T12:30:06,539+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.MetadataUpdateLoop - UpdateLoop(ledgerId=0,loopId=38f2cbec) Exception updating
org.apache.bookkeeper.client.BKException$BKNotEnoughBookiesException: Not enough non-faulty bookies available
	at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectRandomInternal(RackawareEnsemblePlacementPolicyImpl.java:780) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectRandom(RackawareEnsemblePlacementPolicyImpl.java:698) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectFromNetworkLocation(RackawareEnsemblePlacementPolicyImpl.java:587) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.selectFromNetworkLocation(RackawareEnsemblePlacementPolicy.java:206) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectFromNetworkLocation(RackawareEnsemblePlacementPolicyImpl.java:547) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.selectFromNetworkLocation(RackawareEnsemblePlacementPolicy.java:227) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.replaceBookie(RackawareEnsemblePlacementPolicyImpl.java:475) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.replaceBookie(RackawareEnsemblePlacementPolicy.java:120) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.BookieWatcherImpl.replaceBookie(BookieWatcherImpl.java:334) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.EnsembleUtils.replaceBookiesInEnsemble(EnsembleUtils.java:71) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.LedgerHandle.lambda$ensembleChangeLoop$2(LedgerHandle.java:1935) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.MetadataUpdateLoop.writeLoop(MetadataUpdateLoop.java:134) [org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.MetadataUpdateLoop.run(MetadataUpdateLoop.java:123) [org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.LedgerHandle.ensembleChangeLoop(LedgerHandle.java:1954) [org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.LedgerHandle.handleBookieFailure(LedgerHandle.java:1903) [org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.LedgerHandle.maybeHandleDelayedWriteBookieFailure(LedgerHandle.java:1860) [org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.PendingAddOp.safeRun(PendingAddOp.java:271) [org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.74.Final.jar:4.1.74.Final]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
2022-03-25T12:30:06,547+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN  org.apache.bookkeeper.client.LedgerHandle - [EnsembleChange(ledger:0, change-id:0000000001)][attempt:1] Exception changing ensemble
org.apache.bookkeeper.client.BKException$BKNotEnoughBookiesException: Not enough non-faulty bookies available
	at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectRandomInternal(RackawareEnsemblePlacementPolicyImpl.java:780) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectRandom(RackawareEnsemblePlacementPolicyImpl.java:698) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectFromNetworkLocation(RackawareEnsemblePlacementPolicyImpl.java:587) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.selectFromNetworkLocation(RackawareEnsemblePlacementPolicy.java:206) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectFromNetworkLocation(RackawareEnsemblePlacementPolicyImpl.java:547) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.selectFromNetworkLocation(RackawareEnsemblePlacementPolicy.java:227) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.replaceBookie(RackawareEnsemblePlacementPolicyImpl.java:475) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.replaceBookie(RackawareEnsemblePlacementPolicy.java:120) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.BookieWatcherImpl.replaceBookie(BookieWatcherImpl.java:334) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.EnsembleUtils.replaceBookiesInEnsemble(EnsembleUtils.java:71) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.LedgerHandle.lambda$ensembleChangeLoop$2(LedgerHandle.java:1935) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.MetadataUpdateLoop.writeLoop(MetadataUpdateLoop.java:134) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.MetadataUpdateLoop.run(MetadataUpdateLoop.java:123) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.LedgerHandle.ensembleChangeLoop(LedgerHandle.java:1954) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.LedgerHandle.handleBookieFailure(LedgerHandle.java:1903) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.LedgerHandle.maybeHandleDelayedWriteBookieFailure(LedgerHandle.java:1860) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.client.PendingAddOp.safeRun(PendingAddOp.java:271) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) ~[org.apache.bookkeeper-bookkeeper-common-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.74.Final.jar:4.1.74.Final]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
2022-03-25T12:30:06,548+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.LedgerHandle - Closing ledger 0 due to NotEnoughBookiesException: Not enough non-faulty bookies available
2022-03-25T12:30:06,550+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20459
2022-03-25T12:30:06,551+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20460
2022-03-25T12:30:06,551+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20461
2022-03-25T12:30:06,551+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20462
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20463
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20464
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20465
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20466
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20467
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20468
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20469
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20470
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20471
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20472
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20473
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20474
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-6-0] INFO  org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [public/default/persistent/example] Creating a new ledger
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20475
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20476
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20477
2022-03-25T12:30:06,567+0800 [BookKeeperClientWorker-OrderedExecutor-6-0] INFO  org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [public/default/persistent/example] Creating ledger, metadata: {component=[109, 97, 110, 97, 103, 101, 100, 45, 108, 101, 100, 103, 101, 114], pulsar/managed-ledger=[112, 117, 98, 108, 105, 99, 47, 100, 101, 102, 97, 117, 108, 116, 47, 112, 101, 114, 115, 105, 115, 116, 101, 110, 116, 47, 101, 120, 97, 109, 112, 108, 101], application=[112, 117, 108, 115, 97, 114]} - metadata ops timeout : 60 seconds
2022-03-25T12:30:06,568+0800 [BookKeeperClientWorker-OrderedExecutor-6-0] WARN  org.apache.bookkeeper.client.BookieWatcherImpl - New ensemble: [192.168.3.2:3182, 192.168.3.2:3181] is not adhering to Placement Policy. quarantinedBookies: []
2022-03-25T12:30:06,578+0800 [main-EventThread] INFO  org.apache.bookkeeper.client.LedgerCreateOp - Ensemble: [192.168.3.2:3182, 192.168.3.2:3181] for ledger: 1
2022-03-25T12:30:06,578+0800 [main-EventThread] INFO  org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [public/default/persistent/example] Created new ledger 1
2022-03-25T12:30:07,531+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 687 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:08,529+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 690 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:09,530+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 666 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:10,530+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 663 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:11,529+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 701 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:11,586+0800 [pulsar-load-manager-1-1] INFO  org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - Writing local data to metadata store because maximum change 12.5% exceeded threshold 10%; time since last report written is 9.998 seconds
2022-03-25T12:30:11,593+0800 [main-EventThread] INFO  org.apache.pulsar.metadata.coordination.impl.ResourceLockImpl - Acquired resource lock on /loadbalance/brokers/192.168.3.2:8080
2022-03-25T12:30:12,529+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 604 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:13,528+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 643 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:14,529+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 673 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:15,529+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 638 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:16,530+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 633 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:16,586+0800 [pulsar-load-manager-1-1] INFO  org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - Writing local data to metadata store because maximum change 12.5% exceeded threshold 10%; time since last report written is 5.0 seconds
2022-03-25T12:30:16,634+0800 [main-EventThread] INFO  org.apache.pulsar.metadata.coordination.impl.ResourceLockImpl - Acquired resource lock on /loadbalance/brokers/192.168.3.2:8080
2022-03-25T12:30:17,529+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 632 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:18,530+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 636 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:19,530+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 638 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:20,529+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 635 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:21,529+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 638 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:22,529+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 635 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:23,528+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 636 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:23,547+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] WARN  org.apache.bookkeeper.client.BookieWatcherImpl - Bookie 192.168.3.2:3181 has been quarantined because of read/write errors.
2022-03-25T12:30:24,528+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 635 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:25,528+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 636 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:26,429+0800 [pulsar-load-manager-1-1] INFO  org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - Only 1 broker available: no load shedding will be performed
2022-03-25T12:30:26,528+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 633 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:26,587+0800 [pulsar-load-manager-1-1] INFO  org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - Writing local data to metadata store because maximum change Infinity% exceeded threshold 10%; time since last report written is 10.001 seconds
2022-03-25T12:30:26,756+0800 [main-EventThread] INFO  org.apache.pulsar.metadata.coordination.impl.ResourceLockImpl - Acquired resource lock on /loadbalance/brokers/192.168.3.2:8080
2022-03-25T12:30:26,760+0800 [pulsar-modular-load-manager-28-1] INFO  org.apache.pulsar.broker.loadbalance.BundleSplitStrategy - The count of topics on the bundle public/default/0xa0000000_0xb0000000 is less than 2,skip split!
2022-03-25T12:30:27,528+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 605 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:28,528+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 655 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:29,087+0800 [pulsar-io-4-3] ERROR org.apache.pulsar.PulsarBrokerStarter - -- Shutting down - Received OOM exception: failed to allocate 16777216 byte(s) of direct memory (used: 268435456, max: 268435456)
io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 268435456, max: 268435456)
	at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:802) ~[io.netty-netty-common-4.1.74.Final.jar:4.1.74.Final]
	at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:731) ~[io.netty-netty-common-4.1.74.Final.jar:4.1.74.Final]
	at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:648) ~[io.netty-netty-buffer-4.1.74.Final.jar:4.1.74.Final]
	at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:623) ~[io.netty-netty-buffer-4.1.74.Final.jar:4.1.74.Final]
	at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:202) ~[io.netty-netty-buffer-4.1.74.Final.jar:4.1.74.Final]
	at io.netty.buffer.PoolArena.tcacheAllocateNormal(PoolArena.java:186) ~[io.netty-netty-buffer-4.1.74.Final.jar:4.1.74.Final]
	at io.netty.buffer.PoolArena.allocate(PoolArena.java:136) ~[io.netty-netty-buffer-4.1.74.Final.jar:4.1.74.Final]
	at io.netty.buffer.PoolArena.allocate(PoolArena.java:126) ~[io.netty-netty-buffer-4.1.74.Final.jar:4.1.74.Final]
	at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:394) ~[io.netty-netty-buffer-4.1.74.Final.jar:4.1.74.Final]
	at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:188) ~[io.netty-netty-buffer-4.1.74.Final.jar:4.1.74.Final]
	at org.apache.bookkeeper.common.allocator.impl.ByteBufAllocatorImpl.newDirectBuffer(ByteBufAllocatorImpl.java:163) [org.apache.bookkeeper-bookkeeper-common-allocator-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at org.apache.bookkeeper.common.allocator.impl.ByteBufAllocatorImpl.newDirectBuffer(ByteBufAllocatorImpl.java:157) [org.apache.bookkeeper-bookkeeper-common-allocator-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
	at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:188) [io.netty-netty-buffer-4.1.74.Final.jar:4.1.74.Final]
	at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:179) [io.netty-netty-buffer-4.1.74.Final.jar:4.1.74.Final]
	at io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:140) [io.netty-netty-buffer-4.1.74.Final.jar:4.1.74.Final]
	at io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:120) [io.netty-netty-transport-4.1.74.Final.jar:4.1.74.Final]
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:150) [io.netty-netty-transport-4.1.74.Final.jar:4.1.74.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:722) [io.netty-netty-transport-4.1.74.Final.jar:4.1.74.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:658) [io.netty-netty-transport-4.1.74.Final.jar:4.1.74.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:584) [io.netty-netty-transport-4.1.74.Final.jar:4.1.74.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496) [io.netty-netty-transport-4.1.74.Final.jar:4.1.74.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) [io.netty-netty-common-4.1.74.Final.jar:4.1.74.Final]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [io.netty-netty-common-4.1.74.Final.jar:4.1.74.Final]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.74.Final.jar:4.1.74.Final]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
2022-03-25T12:30:29,111+0800 [pulsar-io-4-3] INFO  org.apache.pulsar.broker.PulsarService - Invoking Pulsar service immediate shutdown
2022-03-25T12:30:29,118+0800 [metadata-store-9-1] INFO  org.apache.pulsar.metadata.coordination.impl.ResourceLockImpl - Lock on resource /namespace/pulsar/test/192.168.3.2:8080/0x00000000_0xffffffff was invalidated
2022-03-25T12:30:29,119+0800 [metadata-store-9-1] INFO  org.apache.pulsar.metadata.coordination.impl.ResourceLockImpl - Lock on resource /namespace/public/default/0xa0000000_0xb0000000 was invalidated
2022-03-25T12:30:29,120+0800 [metadata-store-9-1] WARN  org.apache.pulsar.metadata.coordination.impl.LeaderElectionImpl - Leadership released for /loadbalance/leader
2022-03-25T12:30:29,121+0800 [metadata-store-9-1] INFO  org.apache.pulsar.metadata.coordination.impl.ResourceLockImpl - Lock on resource /loadbalance/brokers/192.168.3.2:8080 was invalidated
2022-03-25T12:30:29,122+0800 [metadata-store-9-1] INFO  org.apache.pulsar.metadata.coordination.impl.ResourceLockImpl - Lock on resource /namespace/pulsar/192.168.3.2:8080/0x00000000_0xffffffff was invalidated
2022-03-25T12:30:29,226+0800 [main-EventThread] INFO  org.apache.pulsar.metadata.impl.ZKSessionWatcher - Got ZK session watch event: WatchedEvent state:Closed type:None path:null

log file: oom.log

zymap avatar Mar 25 '22 06:03 zymap

https://github.com/apache/bookkeeper/pull/3139 Here has a PR proposal to resolve this issue.

zymap avatar Mar 25 '22 06:03 zymap

The issue had no activity for 30 days, mark with Stale label.

github-actions[bot] avatar Apr 25 '22 02:04 github-actions[bot]

@zymap This seems to be related to #10439 .

lhotari avatar May 11 '22 04:05 lhotari

Yes

zymap avatar May 11 '22 07:05 zymap

The issue had no activity for 30 days, mark with Stale label.

github-actions[bot] avatar Jun 11 '22 02:06 github-actions[bot]

@zymap hi! We also encountered the same problem. The broker has an OutOfDirectMemoryError, but after I set the waitTimeoutOnBackpressureMs=50 parameter, I have seen that the backpressure has taken effect, but the broker has an OutOfMemoryError: Java heap space. What other parameters can control the memory usage?

Dump analysis found that the top5 classes occupying memory space are:

io.netty.channel.ChannelOutboundBuffer$Entry io.netty.channel.ChannelOutboundBuffer io.netty.channel.epoll.EpollSocketChannel$EpollSocketChannelUnsafe io.netty.buffer.PooledDuplicatedByteBuf io.netty.buffer.PooledUnsafeDirectByteBuf

keyboardbobo avatar Aug 05 '22 08:08 keyboardbobo

@keyboardbobo That also needs some bookie side configurations. Such as maxAddsInProgressLimit: "10", maxReadsInProgressLimit: "10", waitTimeoutOnResponseBackpressureMs: "5". By default, the pulsar uses a v2 request to the bookie, there haven't the request limits on the server side. We apply this through this PR https://github.com/apache/bookkeeper/pull/3324.

We are releasing the bookie 4.15.1 which includes that.

zymap avatar Aug 08 '22 01:08 zymap

@zymap We only have three brokers and three bookies in the test environment. After the back pressure is turned on, a lot of BKNotEnoughBookiesException: Not enough non-faulty bookies available and more Spent 8 ms waiting for 3 writable channels appear from time to time (I think this is can be used as debug log), is there any way to avoid it?

keyboardbobo avatar Aug 08 '22 12:08 keyboardbobo

If you configured the write quorum as 3, I'd suggest you increase the bookie node to 4 or more. After configuring the back pressure, it should be normal to have that exception. Because once a bookie is slow, the bookie will not available for writing and it needs to find another bookie to write. If you only have 3 bookies. I'd suggest you to configure the WQ and AQ as 2 to avoid this.

zymap avatar Aug 09 '22 04:08 zymap

@zymap Is there a resolution for this issue? Would it be possible to add documentation to Pulsar for describing the way to tune Pulsar and Bookkeeper properly?

lhotari avatar Sep 27 '22 05:09 lhotari

If a bookie is slow, if enough bookies are available, will the broker automatically write to another bookie? It sounds like the workaround is to have (total bookies + N) > WQ where N depends on the load.

devinbost avatar Apr 06 '23 17:04 devinbost