
Run time configuration reload

vlm opened this issue 11 years ago • 38 comments

This patch introduces a runtime configuration reload: killall -USR1 nutcracker triggers a reload of the configuration file.

Changes

  • The patch contains test cases covering different failure scenarios. Unfortunately, to make the tests work across platforms (Linux, Mac OS X) I also had to fix an nc_kqueue bug which caused crashes on Mac OS X.
  • The patch replaces the separate .client, .redis bit fields in nc_connection.h with a more precise enumeration of what a connection is. This also allows significantly more descriptive logging.
  • The event loop in the original branch was tightly coupled to struct conn, in a way that does not easily allow multiple kinds of entities to receive file descriptor events. This became a problem when I tried to safely pause the statistics gathering events, so I restructured that part: the statistics handling was removed from the event loop and moved into the statistics thread itself.

Model of operation

The new config is read into memory and the new pools are allocated. If the new pools cannot be allocated for some reason, the configuration change is aborted and the configuration state is rolled back. Once the new pools are allocated, we pause all ingress data transfers from the clients and drain the output queues. Once the client queues are drained (and all outstanding server responses have been sent out), we replace the old pools with the newly configured pools. Then we unblock the clients so they can send new requests.

I am OK with restructuring this patch however you see fit. Looking forward to your feedback.

vlm avatar Feb 12 '15 05:02 vlm

+1

saa avatar Feb 13 '15 11:02 saa

Here's what happens when I add servers and run get x with telnet:

[.......................] signal 10 (SIGUSR1) received, config reload
/bin/nutcracker(nc_stacktrace_fd+0x17)[0x418027]
/bin/nutcracker(signal_handler+0xeb)[0x41574b]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf0a0)[0x7fd0b79c20a0]
/bin/nutcracker(_stats_server_incr+0x25)[0x415595]
/bin/nutcracker(server_connected+0x2a)[0x40d93a]
/bin/nutcracker(req_send_next+0x65)[0x4113a5]
/bin/nutcracker(msg_send+0x24)[0x4105b4]
/bin/nutcracker(core_core+0x64)[0x40bcf4]
/bin/nutcracker(event_wait+0x11a)[0x42115a]
/bin/nutcracker(core_loop+0x37)[0x40c237]
/bin/nutcracker(main+0x5dc)[0x40b74c]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd)[0x7fd0b73c4ead]
/bin/nutcracker[0x40bb09]
[.......................] signal 11 (SIGSEGV) received, core dumping

Config that crashed nutcracker:

memcached_one:
  listen: 127.0.0.1:12335
  hash: fnv1a_64
  distribution: ketama
  auto_eject_hosts: false
  servers:
  - web321:31545:1 twemproxy_memcached_one.841cae5c-bb4f-11e4-8669-56847afe9799
  - web227:31710:1 twemproxy_memcached_one.8bf224cd-bb4f-11e4-8669-56847afe9799
  - web363:31002:1 twemproxy_memcached_one.8bf24bde-bb4f-11e4-8669-56847afe9799
  - web540:31094:1 twemproxy_memcached_one.8bf24bdf-bb4f-11e4-8669-56847afe9799
  - web338:31418:1 twemproxy_memcached_one.8bf272f0-bb4f-11e4-8669-56847afe9799
  - web322:31513:1 twemproxy_memcached_one.8bf29a01-bb4f-11e4-8669-56847afe9799
  - web172:31592:1 twemproxy_memcached_one.8bf29a02-bb4f-11e4-8669-56847afe9799
  - web453:31636:1 twemproxy_memcached_one.8bf2c113-bb4f-11e4-8669-56847afe9799
  - web447:31582:1 twemproxy_memcached_one.8bf2e824-bb4f-11e4-8669-56847afe9799
  - web397:31048:1 twemproxy_memcached_one.8bf30f35-bb4f-11e4-8669-56847afe9799
  - web358:31000:1 twemproxy_memcached_one.8bf33646-bb4f-11e4-8669-56847afe9799
  - web13:31755:1 twemproxy_memcached_one.8bf35d57-bb4f-11e4-8669-56847afe9799
  - web424:31944:1 twemproxy_memcached_one.8bf35d58-bb4f-11e4-8669-56847afe9799
  - web448:31356:1 twemproxy_memcached_one.8bf38469-bb4f-11e4-8669-56847afe9799
  - web457:31854:1 twemproxy_memcached_one.8bf3ab7a-bb4f-11e4-8669-56847afe9799
  - web300:31940:1 twemproxy_memcached_one.8bf3d28b-bb4f-11e4-8669-56847afe9799
  - web305:31128:1 twemproxy_memcached_one.8bf3d28c-bb4f-11e4-8669-56847afe9799
  - web385:31837:1 twemproxy_memcached_one.8bf3f99d-bb4f-11e4-8669-56847afe9799
  - web324:31160:1 twemproxy_memcached_one.8bf420ae-bb4f-11e4-8669-56847afe9799
  - web311:31173:1 twemproxy_memcached_one.8bf420af-bb4f-11e4-8669-56847afe9799
redis_one:
  listen: 127.0.0.1:12334
  hash: fnv1a_64
  distribution: ketama
  redis: true
  auto_eject_hosts: false
  servers:
  - web203:31962:1 twemproxy_redis_one.18ac61be-bb47-11e4-8669-56847afe9799

This happens from time to time, not always.

bobrik avatar Feb 23 '15 11:02 bobrik

Well, it actually happens with this config all the time. Steps to reproduce:

Run with dummy config:

dummy:
  listen: /tmp/dummy-listen
  servers:
    - /tmp/dummy-server:1

Replace it with the actual config from the previous comment.

telnet 127.0.0.1 12335 and get x:

web381 ~ # telnet 127.0.0.1 12335
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
get x
Connection closed by foreign host.
/bin/nutcracker(nc_stacktrace_fd+0x17)[0x418027]
/bin/nutcracker(signal_handler+0xeb)[0x41574b]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf0a0)[0x7fa64eb5d0a0]
/bin/nutcracker(_stats_server_incr+0x25)[0x415595]
/bin/nutcracker(req_server_enqueue_imsgq+0x66)[0x410d36]
/bin/nutcracker[0x410aab]
/bin/nutcracker(req_recv_done+0x151)[0x411291]
/bin/nutcracker(msg_recv+0x160)[0x410430]
/bin/nutcracker(core_core+0x9c)[0x40bd2c]
/bin/nutcracker(event_wait+0x11a)[0x42115a]
/bin/nutcracker(core_loop+0x37)[0x40c237]
/bin/nutcracker(main+0x5dc)[0x40b74c]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd)[0x7fa64e55fead]
/bin/nutcracker[0x40bb09]
[.......................] signal 11 (SIGSEGV) received, core dumping

bobrik avatar Feb 23 '15 11:02 bobrik

@bobrik: Fixed, try again!

vlm avatar Feb 25 '15 05:02 vlm

@vlm seems to be working now, thanks!

bobrik avatar Feb 25 '15 08:02 bobrik

@vlm I don't understand what the st->shadow_lock field is for. In the original twemproxy code, the st->aggregate field already keeps the two threads synchronized, right? Why did you add shadow_lock? What is the intention?

deep011 avatar May 14 '15 07:05 deep011

@deep011 Because swapping the array is not atomic. If the swap is not atomic, you're bound to have problems, such as accessing the array elements' contents while the array is being swapped and stats_pool_reset() is being applied to those contents.

vlm avatar May 14 '15 09:05 vlm

@vlm First, the stats_aggregate() function is used by the stats thread and the stats_swap() function is used by the main thread; second, those two functions cannot execute at the same time because of the st->aggregate field, so I think st->shadow_lock is unnecessary. @manjuraj @idning is what I say correct?

deep011 avatar May 15 '15 02:05 deep011

@deep011 http://preshing.com/20120515/memory-reordering-caught-in-the-act/

vlm avatar Jun 23 '15 22:06 vlm

@vlm I read the article. Did you ever actually encounter problems like this before adding "stats->shadow_lock"? Have you discussed this issue with @manjuraj?

And hi @manjuraj, could you give me an answer about "stats->shadow_lock"? It is very important to me!

deep011 avatar Jun 29 '15 08:06 deep011

@manjuraj, ping -- this looks like a great pull request.

atdt avatar Aug 01 '15 21:08 atdt

Any reason this hasn't been merged in yet? We really need it..

davidradunz avatar Sep 12 '15 12:09 davidradunz

+1

JamieCressey avatar Sep 15 '15 07:09 JamieCressey

+1

zhitaoli avatar Sep 16 '15 23:09 zhitaoli

Due to this issue and the topology of our stack, we've switched to Redis Cluster and implemented specialised clients that support it. All of our problems have gone away, and things seem much more reliable now, and faster too! (To clarify, the stack was app -> local haproxy -> twemproxy -> redis.)

Feedback: for twemproxy to be a viable option for corporate infrastructure, it should support not only config reloading but also one of the following:

  • Direct support for Redis Sentinel: blocking client connections when a master is down, waiting for a sentinel notification (with a timeout period, naturally), and then failing over to the new master.
  • Support for assigning slaves to the masters in a pool; when a master goes down, query the slaves to determine which is the new master (while blocking client connections until the failover completes).
  • A direct replacement for Sentinel that manages the whole cluster: blocking client connections when a master goes down and assigning a replacement for it.

davidradunz avatar Sep 18 '15 16:09 davidradunz

Is something holding this up?

sethrosenblum avatar Dec 02 '15 15:12 sethrosenblum

+1 for this patch. Would love to see it merged in so we can run off of the mainline master branch again.

draco2003 avatar Dec 31 '15 12:12 draco2003

+1

axot avatar Feb 04 '16 02:02 axot

+1

ghost avatar Feb 25 '16 18:02 ghost

+1

TrumanDu avatar Feb 26 '16 06:02 TrumanDu

+1

homeyjd avatar Apr 13 '16 04:04 homeyjd

+1

JeffXue avatar Apr 25 '16 09:04 JeffXue

+1

artursitarski avatar May 20 '16 18:05 artursitarski

+1, anything holding up this patch?

alexef avatar Sep 28 '16 08:09 alexef

@andyqzb - can you look into this?

manjuraj avatar Sep 28 '16 15:09 manjuraj

@manjuraj, let me finish this up. I have a list of things I promised you to do about it.

vlm avatar Sep 28 '16 16:09 vlm

any news on this?

blsmth avatar Nov 14 '16 15:11 blsmth

It would be great to have this functionality added, since in the Marathon/Mesos world the hosts are dynamic. Thanks!

vsacheti avatar Nov 17 '16 00:11 vsacheti

Sorry to ask again after all the pings but.. Any news? Really looking forward to see this feature merged!

elukey avatar Dec 13 '16 14:12 elukey

+1 Any updates for this PR?

yjqg6666 avatar Dec 14 '16 02:12 yjqg6666