zookeeper icon indicating copy to clipboard operation
zookeeper copied to clipboard

ZOOKEEPER-3425: ranking the ttl by expireTime asc for the performance

Open maoling opened this issue 5 years ago • 14 comments

  • Ranking the ttl by expireTime asc will get the following two benefits(Overall using space to swap time):

    • the guarantee of timing order, t1 will disappear before the t2 when t1's expired timestamp < t2's in the same round.
    • avoid the inefficient search in the getCandidates(). If we have 1000+(even the 10000+) ttl nodes which will disappear two hours later, every iteration it will check whether all the ttls node has expired. With ranking, it will break/jump quickly.
    • this design was inspired by the hadoop's client LeaseManager:
  • A UT:testTTLTreeSetCollection was added to assert the behaviour of the TreeSet Collection:ttls.

  • BTW, let's revisit the TimeTask's scheduleAtFixedRate strategy. the TimeTask will run every minute(with the defalut checkIntervalMs value). when the time current round spends is more than one minute, it will waiting for the current round being finished, then starting another round. That's OK.

  • more details in the ZOOKEEPER-3425

maoling avatar Jul 04 '19 09:07 maoling

@maoling I like the improvement part: sorting TTLs by expiry time makes sense to me, but why do you want to disable children creation?

@Randgalt You might want to take a look at this patch.

anmolnar avatar Nov 18 '19 13:11 anmolnar

TTL nodes must be able to have children. I'm -1 on this part of the PR - it's a non-starter TBH and completely changes what a TTL node is and how it's used.

Randgalt avatar Nov 18 '19 14:11 Randgalt

Heavy -1 on this. This would break users such as Elastic search cloud. TTL nodes were designed to have children.

Randgalt avatar Nov 18 '19 14:11 Randgalt

@anmolnar @Randgalt Yep, we should not break the compatibility, I have deleted the changes about the semantics of TTL and focus on the performance issue by a force update. PTAL.

maoling avatar Nov 19 '19 02:11 maoling

Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build-maven/1648/

asf-ci avatar Nov 23 '19 09:11 asf-ci

Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build-maven/1649/

asf-ci avatar Nov 23 '19 10:11 asf-ci

Here's an alternate implementation that's simpler (in my view). However, I need to think more about the maintenance of the two maps. I'm not certain this is thread safe. Finally, is this optimization needed? This smells like a premature optimization to me. Has anyone seen issues with this in production installations? The TTLs are searched in a background thread and shouldn't affect foreground performance very much.

https://github.com/apache/zookeeper/compare/master...Randgalt:ZOOKEEPER-3425-order-ttls-for-performance

update: I've verified the thread safety of the two maps and updated the code to make sure it's properly synchronized.

Randgalt avatar Nov 23 '19 15:11 Randgalt

This new TTLManager solves thread safety however it hurts concurrency. In the master implementation, the TTL map is concurrent (via the ConcurrentMap) and allows for multi-thread mutation. This class synchronizes the entire TTL map making access serial for all the threads.

I'm, in general, still highly -1 on this change as it feels like a premature optimization.

Randgalt avatar Nov 25 '19 02:11 Randgalt

Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build-maven/1651/

asf-ci avatar Nov 25 '19 03:11 asf-ci

@Randgalt

  • Great review work and insight. I take a close look at the patch you provide, learn a lot from it.
  • I don't insist on this improvement, not sure whether it's a premature optimization. But I believe this PR has opened an interesting discussion :). I really like this atmosphere.

maoling avatar Nov 25 '19 07:11 maoling

If this patch is purely performance optimisation (I absolutely support that a single patch should focus on a single thing at a time), I believe we should see some test cases and numbers to prove the benefits.

anmolnar avatar Dec 07 '19 20:12 anmolnar

I second @Randgalt 's concern: I don't like that we have to synchronise on entire TTLManager. Hence I'd like to see some perf tests on different workloads: I suspect this implementation performs worse with clients that create / delete lots of TTLs.

anmolnar avatar Dec 07 '19 21:12 anmolnar

On the other hand - and I'm not sure about this -, but if writes in ZooKeeper are serialised, do we need to care about synchronisation? Which threads are possibly accessing TTLManager at the same time?

anmolnar avatar Dec 07 '19 21:12 anmolnar

if writes in ZooKeeper are serialised, do we need to care about synchronisation?

I've wondered about that myself. The Watcher Manager has a bunch of synchronization (and other parts of the code too). TBH - I don't know enough about it and have always followed other synchronized blocks in the ZK code.

Randgalt avatar Dec 07 '19 21:12 Randgalt