pulsar
pulsar copied to clipboard
Broker OOM when the bookie is slowly to respond and AQ is smaller than WQ
Describe the bug When there has a bookie is slowly to respond the add entry request, and the AQ is smaller than WQ, broker will OOM.
To Reproduce
- Change the bookkeeper server code to make the add request to delay some times to handle
- Build the bookkeeper code and replace the version in Pulsar
- Set the managedLedgerDefaultWriteQuorum bigger than managedLedgerDefaultAckQuorum in the broker.conf
- Setup the cluster
- Produce message with pulsar-perf
- See the broker OOM
Expected behavior Broker should not OOM
Screenshots If applicable, add screenshots to help explain your problem.
Desktop (please complete the following information):
- macOS 12.1
Additional context Broker version: master BookKeeper version: branch-4.14 latest Ensemble: 2, AQ: 1, WQ:2
broker logs:
2022-03-25T12:29:35,600+0800 [pulsar-io-4-3] INFO com.scurrilous.circe.checksum.Crc32cIntChecksum - SSE4.2 CRC32C provider initialized
2022-03-25T12:29:35,684+0800 [pulsar-io-4-1] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0x8611e44b, L:/192.168.3.2:50153 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,684+0800 [pulsar-io-4-2] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0x3e9bfafd, L:/192.168.3.2:50150 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,685+0800 [pulsar-io-4-3] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0x9cc7b134, L:/192.168.3.2:50157 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,684+0800 [pulsar-io-4-21] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,684+0800 [pulsar-io-4-9] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0x5d909697, L:/192.168.3.2:50162 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,685+0800 [pulsar-io-4-19] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0xb9b91a40, L:/192.168.3.2:50158 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,685+0800 [pulsar-io-4-13] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0xf9a8b922, L:/192.168.3.2:50165 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,684+0800 [pulsar-io-4-10] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0xcc74cde3, L:/192.168.3.2:50164 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,686+0800 [pulsar-io-4-24] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0x9e0b4117, L:/192.168.3.2:50156 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,684+0800 [pulsar-io-4-7] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0xfe1340cf, L:/192.168.3.2:50155 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,684+0800 [pulsar-io-4-15] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0x042ab445, L:/192.168.3.2:50161 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,686+0800 [pulsar-io-4-17] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0x835f7112, L:/192.168.3.2:50154 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,685+0800 [pulsar-io-4-16] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0x3f7eeb8a, L:/192.168.3.2:50147 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,687+0800 [pulsar-io-4-3] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x9cc7b134, L:/192.168.3.2:50157 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,687+0800 [pulsar-io-4-17] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x835f7112, L:/192.168.3.2:50154 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,687+0800 [pulsar-io-4-16] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x3f7eeb8a, L:/192.168.3.2:50147 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,687+0800 [pulsar-io-4-4] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0x2ee9ef18, L:/192.168.3.2:50159 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,687+0800 [pulsar-io-4-7] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xfe1340cf, L:/192.168.3.2:50155 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,685+0800 [pulsar-io-4-12] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0x04f86f55, L:/192.168.3.2:50169 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,687+0800 [pulsar-io-4-4] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x2ee9ef18, L:/192.168.3.2:50159 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,687+0800 [pulsar-io-4-15] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x042ab445, L:/192.168.3.2:50161 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,687+0800 [pulsar-io-4-10] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xcc74cde3, L:/192.168.3.2:50164 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,684+0800 [pulsar-io-4-6] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0xd81f675b, L:/192.168.3.2:50168 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,687+0800 [pulsar-io-4-11] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0xb1bfb710, L:/192.168.3.2:50148 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,688+0800 [pulsar-io-4-2] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x3e9bfafd, L:/192.168.3.2:50150 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,688+0800 [pulsar-io-4-6] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xd81f675b, L:/192.168.3.2:50168 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,687+0800 [pulsar-io-4-5] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0x6659642b, L:/192.168.3.2:50167 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,688+0800 [pulsar-io-4-9] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x5d909697, L:/192.168.3.2:50162 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,688+0800 [pulsar-io-4-23] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0x4640bd95, L:/192.168.3.2:50151 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,688+0800 [pulsar-io-4-23] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x4640bd95, L:/192.168.3.2:50151 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,687+0800 [pulsar-io-4-24] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x9e0b4117, L:/192.168.3.2:50156 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,689+0800 [pulsar-io-4-1] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x8611e44b, L:/192.168.3.2:50153 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,687+0800 [pulsar-io-4-18] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0x155b4478, L:/192.168.3.2:50166 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,689+0800 [pulsar-io-4-18] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x155b4478, L:/192.168.3.2:50166 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,689+0800 [pulsar-io-4-7] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0x8c527e39, L:/192.168.3.2:50175 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,685+0800 [pulsar-io-4-8] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0xeacf4e8e, L:/192.168.3.2:50170 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,689+0800 [pulsar-io-4-7] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x8c527e39, L:/192.168.3.2:50175 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,689+0800 [pulsar-io-4-6] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0x6af6cc28, L:/192.168.3.2:50176 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,690+0800 [pulsar-io-4-6] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x6af6cc28, L:/192.168.3.2:50176 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,690+0800 [pulsar-io-4-8] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xeacf4e8e, L:/192.168.3.2:50170 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,690+0800 [pulsar-io-4-8] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0xb5917759, L:/192.168.3.2:50172 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,687+0800 [pulsar-io-4-13] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xf9a8b922, L:/192.168.3.2:50165 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,687+0800 [pulsar-io-4-12] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x04f86f55, L:/192.168.3.2:50169 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,691+0800 [pulsar-io-4-10] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0x09c34093, L:/192.168.3.2:50173 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,686+0800 [pulsar-io-4-22] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0xde543d90, L:/192.168.3.2:50149 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,691+0800 [pulsar-io-4-10] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x09c34093, L:/192.168.3.2:50173 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,691+0800 [pulsar-io-4-22] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xde543d90, L:/192.168.3.2:50149 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,690+0800 [pulsar-io-4-8] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xb5917759, L:/192.168.3.2:50172 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,691+0800 [pulsar-io-4-12] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0xfc4f4711, L:/192.168.3.2:50177 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,692+0800 [pulsar-io-4-12] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xfc4f4711, L:/192.168.3.2:50177 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,689+0800 [pulsar-io-4-9] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0xbd3c79ca, L:/192.168.3.2:50174 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,688+0800 [pulsar-io-4-14] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0xc40c5fba, L:/192.168.3.2:50160 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,692+0800 [pulsar-io-4-14] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xc40c5fba, L:/192.168.3.2:50160 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,692+0800 [pulsar-io-4-9] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xbd3c79ca, L:/192.168.3.2:50174 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,688+0800 [pulsar-io-4-19] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xb9b91a40, L:/192.168.3.2:50158 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,688+0800 [pulsar-io-4-20] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3182 [id: 0xfbb35d39, L:/192.168.3.2:50163 - R:/192.168.3.2:3182]
2022-03-25T12:29:35,688+0800 [pulsar-io-4-11] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xb1bfb710, L:/192.168.3.2:50148 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,694+0800 [pulsar-io-4-20] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xfbb35d39, L:/192.168.3.2:50163 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,695+0800 [pulsar-io-4-11] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0x518a2724, L:/192.168.3.2:50171 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,695+0800 [pulsar-io-4-11] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x518a2724, L:/192.168.3.2:50171 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,697+0800 [pulsar-io-4-5] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x6659642b, L:/192.168.3.2:50167 - R:/192.168.3.2:3182] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,697+0800 [pulsar-io-4-21] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:35,764+0800 [pulsar-io-4-5] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Successfully connected to bookie: 192.168.3.2:3181 [id: 0x60410eac, L:/192.168.3.2:50178 - R:/192.168.3.2:3181]
2022-03-25T12:29:35,764+0800 [pulsar-io-4-5] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - connection [id: 0x60410eac, L:/192.168.3.2:50178 - R:/192.168.3.2:3181] authenticated as BookKeeperPrincipal{ANONYMOUS}
2022-03-25T12:29:46,587+0800 [pulsar-load-manager-1-1] INFO org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - Writing local data to metadata store because maximum change 13.203266263008118% exceeded threshold 10%; time since last report written is 14.999 seconds
2022-03-25T12:29:46,593+0800 [main-EventThread] INFO org.apache.pulsar.metadata.coordination.impl.ResourceLockImpl - Acquired resource lock on /loadbalance/brokers/192.168.3.2:8080
2022-03-25T12:30:01,587+0800 [pulsar-load-manager-1-1] INFO org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - Writing local data to metadata store because maximum change 12.288329005241394% exceeded threshold 10%; time since last report written is 14.999 seconds
2022-03-25T12:30:01,604+0800 [main-EventThread] INFO org.apache.pulsar.metadata.coordination.impl.ResourceLockImpl - Acquired resource lock on /loadbalance/brokers/192.168.3.2:8080
2022-03-25T12:30:06,531+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 583 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:06,538+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to find 1 bookies : excludeBookies [<Bookie:192.168.3.2:3182>, <Bookie:192.168.3.2:3181>], allBookies [<Bookie:192.168.3.2:3181>, <Bookie:192.168.3.2:3182>].
2022-03-25T12:30:06,539+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to choose a bookie: excluded [<Bookie:192.168.3.2:3182>, <Bookie:192.168.3.2:3181>], fallback to choose bookie randomly from the cluster.
2022-03-25T12:30:06,539+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to find 1 bookies : excludeBookies [<Bookie:192.168.3.2:3182>, <Bookie:192.168.3.2:3181>], allBookies [<Bookie:192.168.3.2:3182>, <Bookie:192.168.3.2:3181>].
2022-03-25T12:30:06,539+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to find 1 bookies : excludeBookies [<Bookie:192.168.3.2:3182>, <Bookie:192.168.3.2:3181>], allBookies [<Bookie:192.168.3.2:3181>, <Bookie:192.168.3.2:3182>].
2022-03-25T12:30:06,539+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to choose a bookie: excluded [<Bookie:192.168.3.2:3182>, <Bookie:192.168.3.2:3181>], fallback to choose bookie randomly from the cluster.
2022-03-25T12:30:06,539+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to find 1 bookies : excludeBookies [<Bookie:192.168.3.2:3182>, <Bookie:192.168.3.2:3181>], allBookies [<Bookie:192.168.3.2:3182>, <Bookie:192.168.3.2:3181>].
2022-03-25T12:30:06,539+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.MetadataUpdateLoop - UpdateLoop(ledgerId=0,loopId=38f2cbec) Exception updating
org.apache.bookkeeper.client.BKException$BKNotEnoughBookiesException: Not enough non-faulty bookies available
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectRandomInternal(RackawareEnsemblePlacementPolicyImpl.java:780) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectRandom(RackawareEnsemblePlacementPolicyImpl.java:698) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectFromNetworkLocation(RackawareEnsemblePlacementPolicyImpl.java:587) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.selectFromNetworkLocation(RackawareEnsemblePlacementPolicy.java:206) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectFromNetworkLocation(RackawareEnsemblePlacementPolicyImpl.java:547) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.selectFromNetworkLocation(RackawareEnsemblePlacementPolicy.java:227) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.replaceBookie(RackawareEnsemblePlacementPolicyImpl.java:475) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.replaceBookie(RackawareEnsemblePlacementPolicy.java:120) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.BookieWatcherImpl.replaceBookie(BookieWatcherImpl.java:334) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.EnsembleUtils.replaceBookiesInEnsemble(EnsembleUtils.java:71) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.LedgerHandle.lambda$ensembleChangeLoop$2(LedgerHandle.java:1935) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.MetadataUpdateLoop.writeLoop(MetadataUpdateLoop.java:134) [org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.MetadataUpdateLoop.run(MetadataUpdateLoop.java:123) [org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.LedgerHandle.ensembleChangeLoop(LedgerHandle.java:1954) [org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.LedgerHandle.handleBookieFailure(LedgerHandle.java:1903) [org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.LedgerHandle.maybeHandleDelayedWriteBookieFailure(LedgerHandle.java:1860) [org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.PendingAddOp.safeRun(PendingAddOp.java:271) [org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.74.Final.jar:4.1.74.Final]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
2022-03-25T12:30:06,547+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN org.apache.bookkeeper.client.LedgerHandle - [EnsembleChange(ledger:0, change-id:0000000001)][attempt:1] Exception changing ensemble
org.apache.bookkeeper.client.BKException$BKNotEnoughBookiesException: Not enough non-faulty bookies available
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectRandomInternal(RackawareEnsemblePlacementPolicyImpl.java:780) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectRandom(RackawareEnsemblePlacementPolicyImpl.java:698) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectFromNetworkLocation(RackawareEnsemblePlacementPolicyImpl.java:587) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.selectFromNetworkLocation(RackawareEnsemblePlacementPolicy.java:206) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.selectFromNetworkLocation(RackawareEnsemblePlacementPolicyImpl.java:547) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.selectFromNetworkLocation(RackawareEnsemblePlacementPolicy.java:227) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.replaceBookie(RackawareEnsemblePlacementPolicyImpl.java:475) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.replaceBookie(RackawareEnsemblePlacementPolicy.java:120) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.BookieWatcherImpl.replaceBookie(BookieWatcherImpl.java:334) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.EnsembleUtils.replaceBookiesInEnsemble(EnsembleUtils.java:71) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.LedgerHandle.lambda$ensembleChangeLoop$2(LedgerHandle.java:1935) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.MetadataUpdateLoop.writeLoop(MetadataUpdateLoop.java:134) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.MetadataUpdateLoop.run(MetadataUpdateLoop.java:123) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.LedgerHandle.ensembleChangeLoop(LedgerHandle.java:1954) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.LedgerHandle.handleBookieFailure(LedgerHandle.java:1903) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.LedgerHandle.maybeHandleDelayedWriteBookieFailure(LedgerHandle.java:1860) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.client.PendingAddOp.safeRun(PendingAddOp.java:271) ~[org.apache.bookkeeper-bookkeeper-server-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) ~[org.apache.bookkeeper-bookkeeper-common-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.74.Final.jar:4.1.74.Final]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
2022-03-25T12:30:06,548+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.LedgerHandle - Closing ledger 0 due to NotEnoughBookiesException: Not enough non-faulty bookies available
2022-03-25T12:30:06,550+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20459
2022-03-25T12:30:06,551+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20460
2022-03-25T12:30:06,551+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20461
2022-03-25T12:30:06,551+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20462
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20463
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20464
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20465
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20466
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20467
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20468
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20469
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20470
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20471
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20472
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20473
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20474
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-6-0] INFO org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [public/default/persistent/example] Creating a new ledger
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20475
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20476
2022-03-25T12:30:06,552+0800 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: L0 E20477
2022-03-25T12:30:06,567+0800 [BookKeeperClientWorker-OrderedExecutor-6-0] INFO org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [public/default/persistent/example] Creating ledger, metadata: {component=[109, 97, 110, 97, 103, 101, 100, 45, 108, 101, 100, 103, 101, 114], pulsar/managed-ledger=[112, 117, 98, 108, 105, 99, 47, 100, 101, 102, 97, 117, 108, 116, 47, 112, 101, 114, 115, 105, 115, 116, 101, 110, 116, 47, 101, 120, 97, 109, 112, 108, 101], application=[112, 117, 108, 115, 97, 114]} - metadata ops timeout : 60 seconds
2022-03-25T12:30:06,568+0800 [BookKeeperClientWorker-OrderedExecutor-6-0] WARN org.apache.bookkeeper.client.BookieWatcherImpl - New ensemble: [192.168.3.2:3182, 192.168.3.2:3181] is not adhering to Placement Policy. quarantinedBookies: []
2022-03-25T12:30:06,578+0800 [main-EventThread] INFO org.apache.bookkeeper.client.LedgerCreateOp - Ensemble: [192.168.3.2:3182, 192.168.3.2:3181] for ledger: 1
2022-03-25T12:30:06,578+0800 [main-EventThread] INFO org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [public/default/persistent/example] Created new ledger 1
2022-03-25T12:30:07,531+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 687 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:08,529+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 690 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:09,530+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 666 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:10,530+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 663 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:11,529+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 701 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:11,586+0800 [pulsar-load-manager-1-1] INFO org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - Writing local data to metadata store because maximum change 12.5% exceeded threshold 10%; time since last report written is 9.998 seconds
2022-03-25T12:30:11,593+0800 [main-EventThread] INFO org.apache.pulsar.metadata.coordination.impl.ResourceLockImpl - Acquired resource lock on /loadbalance/brokers/192.168.3.2:8080
2022-03-25T12:30:12,529+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 604 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:13,528+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 643 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:14,529+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 673 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:15,529+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 638 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:16,530+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 633 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:16,586+0800 [pulsar-load-manager-1-1] INFO org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - Writing local data to metadata store because maximum change 12.5% exceeded threshold 10%; time since last report written is 5.0 seconds
2022-03-25T12:30:16,634+0800 [main-EventThread] INFO org.apache.pulsar.metadata.coordination.impl.ResourceLockImpl - Acquired resource lock on /loadbalance/brokers/192.168.3.2:8080
2022-03-25T12:30:17,529+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 632 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:18,530+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 636 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:19,530+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 638 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:20,529+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 635 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:21,529+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 638 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:22,529+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 635 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:23,528+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 636 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:23,547+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] WARN org.apache.bookkeeper.client.BookieWatcherImpl - Bookie 192.168.3.2:3181 has been quarantined because of read/write errors.
2022-03-25T12:30:24,528+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 635 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:25,528+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 636 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:26,429+0800 [pulsar-load-manager-1-1] INFO org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - Only 1 broker available: no load shedding will be performed
2022-03-25T12:30:26,528+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 633 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:26,587+0800 [pulsar-load-manager-1-1] INFO org.apache.pulsar.broker.loadbalance.impl.ModularLoadManagerImpl - Writing local data to metadata store because maximum change Infinity% exceeded threshold 10%; time since last report written is 10.001 seconds
2022-03-25T12:30:26,756+0800 [main-EventThread] INFO org.apache.pulsar.metadata.coordination.impl.ResourceLockImpl - Acquired resource lock on /loadbalance/brokers/192.168.3.2:8080
2022-03-25T12:30:26,760+0800 [pulsar-modular-load-manager-28-1] INFO org.apache.pulsar.broker.loadbalance.BundleSplitStrategy - The count of topics on the bundle public/default/0xa0000000_0xb0000000 is less than 2,skip split!
2022-03-25T12:30:27,528+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 605 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:28,528+0800 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO org.apache.bookkeeper.proto.PerChannelBookieClient - Timed-out 655 operations to channel [id: 0xc663bf3d, L:/192.168.3.2:50152 - R:/192.168.3.2:3181] for 192.168.3.2:3181
2022-03-25T12:30:29,087+0800 [pulsar-io-4-3] ERROR org.apache.pulsar.PulsarBrokerStarter - -- Shutting down - Received OOM exception: failed to allocate 16777216 byte(s) of direct memory (used: 268435456, max: 268435456)
io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 268435456, max: 268435456)
at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:802) ~[io.netty-netty-common-4.1.74.Final.jar:4.1.74.Final]
at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:731) ~[io.netty-netty-common-4.1.74.Final.jar:4.1.74.Final]
at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:648) ~[io.netty-netty-buffer-4.1.74.Final.jar:4.1.74.Final]
at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:623) ~[io.netty-netty-buffer-4.1.74.Final.jar:4.1.74.Final]
at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:202) ~[io.netty-netty-buffer-4.1.74.Final.jar:4.1.74.Final]
at io.netty.buffer.PoolArena.tcacheAllocateNormal(PoolArena.java:186) ~[io.netty-netty-buffer-4.1.74.Final.jar:4.1.74.Final]
at io.netty.buffer.PoolArena.allocate(PoolArena.java:136) ~[io.netty-netty-buffer-4.1.74.Final.jar:4.1.74.Final]
at io.netty.buffer.PoolArena.allocate(PoolArena.java:126) ~[io.netty-netty-buffer-4.1.74.Final.jar:4.1.74.Final]
at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:394) ~[io.netty-netty-buffer-4.1.74.Final.jar:4.1.74.Final]
at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:188) ~[io.netty-netty-buffer-4.1.74.Final.jar:4.1.74.Final]
at org.apache.bookkeeper.common.allocator.impl.ByteBufAllocatorImpl.newDirectBuffer(ByteBufAllocatorImpl.java:163) [org.apache.bookkeeper-bookkeeper-common-allocator-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at org.apache.bookkeeper.common.allocator.impl.ByteBufAllocatorImpl.newDirectBuffer(ByteBufAllocatorImpl.java:157) [org.apache.bookkeeper-bookkeeper-common-allocator-4.14.3-SNAPSHOT.jar:4.14.3-SNAPSHOT]
at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:188) [io.netty-netty-buffer-4.1.74.Final.jar:4.1.74.Final]
at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:179) [io.netty-netty-buffer-4.1.74.Final.jar:4.1.74.Final]
at io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:140) [io.netty-netty-buffer-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:120) [io.netty-netty-transport-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:150) [io.netty-netty-transport-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:722) [io.netty-netty-transport-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:658) [io.netty-netty-transport-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:584) [io.netty-netty-transport-4.1.74.Final.jar:4.1.74.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496) [io.netty-netty-transport-4.1.74.Final.jar:4.1.74.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) [io.netty-netty-common-4.1.74.Final.jar:4.1.74.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [io.netty-netty-common-4.1.74.Final.jar:4.1.74.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.74.Final.jar:4.1.74.Final]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
2022-03-25T12:30:29,111+0800 [pulsar-io-4-3] INFO org.apache.pulsar.broker.PulsarService - Invoking Pulsar service immediate shutdown
2022-03-25T12:30:29,118+0800 [metadata-store-9-1] INFO org.apache.pulsar.metadata.coordination.impl.ResourceLockImpl - Lock on resource /namespace/pulsar/test/192.168.3.2:8080/0x00000000_0xffffffff was invalidated
2022-03-25T12:30:29,119+0800 [metadata-store-9-1] INFO org.apache.pulsar.metadata.coordination.impl.ResourceLockImpl - Lock on resource /namespace/public/default/0xa0000000_0xb0000000 was invalidated
2022-03-25T12:30:29,120+0800 [metadata-store-9-1] WARN org.apache.pulsar.metadata.coordination.impl.LeaderElectionImpl - Leadership released for /loadbalance/leader
2022-03-25T12:30:29,121+0800 [metadata-store-9-1] INFO org.apache.pulsar.metadata.coordination.impl.ResourceLockImpl - Lock on resource /loadbalance/brokers/192.168.3.2:8080 was invalidated
2022-03-25T12:30:29,122+0800 [metadata-store-9-1] INFO org.apache.pulsar.metadata.coordination.impl.ResourceLockImpl - Lock on resource /namespace/pulsar/192.168.3.2:8080/0x00000000_0xffffffff was invalidated
2022-03-25T12:30:29,226+0800 [main-EventThread] INFO org.apache.pulsar.metadata.impl.ZKSessionWatcher - Got ZK session watch event: WatchedEvent state:Closed type:None path:null
log file: oom.log
https://github.com/apache/bookkeeper/pull/3139 Here has a PR proposal to resolve this issue.
The issue had no activity for 30 days, mark with Stale label.
@zymap This seems to be related to #10439 .
Yes
The issue had no activity for 30 days, mark with Stale label.
@zymap hi! We also encountered the same problem. The broker has an OutOfDirectMemoryError
, but after I set the waitTimeoutOnBackpressureMs=50
parameter, I have seen that the backpressure has taken effect, but the broker has an OutOfMemoryError: Java heap space
. What other parameters can control the memory usage?
Dump analysis found that the top5 classes occupying memory space are:
io.netty.channel.ChannelOutboundBuffer$Entry io.netty.channel.ChannelOutboundBuffer io.netty.channel.epoll.EpollSocketChannel$EpollSocketChannelUnsafe io.netty.buffer.PooledDuplicatedByteBuf io.netty.buffer.PooledUnsafeDirectByteBuf
@keyboardbobo That also needs some bookie side configurations. Such as maxAddsInProgressLimit: "10", maxReadsInProgressLimit: "10", waitTimeoutOnResponseBackpressureMs: "5". By default, the pulsar uses a v2 request to the bookie, there haven't the request limits on the server side. We apply this through this PR https://github.com/apache/bookkeeper/pull/3324.
We are releasing the bookie 4.15.1 which includes that.
@zymap We only have three brokers and three bookies in the test environment. After the back pressure is turned on, a lot of BKNotEnoughBookiesException: Not enough non-faulty bookies available
and more Spent 8 ms waiting for 3 writable channels
appear from time to time (I think this is can be used as debug log), is there any way to avoid it?
If you configured the write quorum as 3, I'd suggest you increase the bookie node to 4 or more. After configuring the back pressure, it should be normal to have that exception. Because once a bookie is slow, the bookie will not available for writing and it needs to find another bookie to write. If you only have 3 bookies. I'd suggest you to configure the WQ and AQ as 2 to avoid this.
@zymap Is there a resolution for this issue? Would it be possible to add documentation to Pulsar for describing the way to tune Pulsar and Bookkeeper properly?
If a bookie is slow, if enough bookies are available, will the broker automatically write to another bookie? It sounds like the workaround is to have (total bookies + N) > WQ where N depends on the load.