
rename filter command

Open jtv4k opened this issue 9 years ago • 7 comments

It would be great to have server-supported atomic renames.

We have a script that builds several large bloom filters, which can take a while. It would be great if we could build the filters under temporary names and then (atomically) rename all the filters:

rename temp_filter_1 filter_1, temp_filter_2 filter_2, temp_filter_3 filter_3

That would allow us to build the bloom filters in the background under a temporary name. While the build is running, checks against the final filter name will fail. Once the build completes, we perform the renames and the client checks suddenly begin working.

Client 1:

c filter_1 key
Filter does not exist

Client 2:

rename temp_filter_1 filter_1

Client 1:

c filter_1 key
No

jtv4k avatar Jun 30 '15 19:06 jtv4k

@jtv4k You mean to rename to an existing filter? e.g. an implicit drop of the existing filter then?

armon avatar Jun 30 '15 21:06 armon

@armon Yes, rename an existing filter so the previous name is implicitly dropped. It could be a copy-then-drop action but that would seem less efficient than a rename. The new filter name would contain all the data of the old.

For example:

list
START
END
create temp_filter_1
Done
set temp_filter_1 somevalue
Yes
check filter_1 somevalue
Filter does not exist
rename temp_filter_1 filter_1
Done
check temp_filter_1 somevalue
Filter does not exist
check filter_1 somevalue
Yes
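The semantics requested in the session above can be modelled with a toy in-memory registry. This is only a sketch of the desired behaviour, not bloomd's implementation: `FilterRegistry` and its methods are hypothetical names, plain Python sets stand in for bloom filters, and the rename pairs are assumed to use distinct names.

```python
import threading


class FilterRegistry:
    """Toy stand-in for a server: maps filter name -> set of keys."""

    def __init__(self):
        self._filters = {}
        self._lock = threading.Lock()

    def create(self, name):
        with self._lock:
            self._filters.setdefault(name, set())
        return "Done"

    def set(self, name, key):
        with self._lock:
            if name not in self._filters:
                return "Filter does not exist"
            added = key not in self._filters[name]
            self._filters[name].add(key)
            return "Yes" if added else "No"

    def check(self, name, key):
        with self._lock:
            if name not in self._filters:
                return "Filter does not exist"
            return "Yes" if key in self._filters[name] else "No"

    def rename(self, *pairs):
        """Apply every (old, new) rename under one lock, or none of them."""
        with self._lock:
            # Validate first so a missing source leaves the registry untouched.
            if any(old not in self._filters for old, _ in pairs):
                return "Filter does not exist"
            for old, new in pairs:
                # Overwriting an existing `new` is the implicit drop.
                self._filters[new] = self._filters.pop(old)
        return "Done"
```

Because all pairs are applied while holding a single lock, a concurrent `check` sees either every old name or every new name, never a mix.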

jtv4k avatar Jun 30 '15 22:06 jtv4k

@jtv4k Hmm. This would be a pretty complex change internally ATM. The underlying files are at a path based on the filter name, so we'd have to support renaming all those files, while handling the fact that the delete of the old filter is taking place in the background. So it would be something like:

  1. Rename temp_filter to filter such that new requests go to that
  2. Background delete original filter
  3. Migrate temp_filter to filter location on disk (rename files)

The issue is that you need to handle partial failures in all that. It could be done, you'd need like a write ahead log of sorts. It's just not something that is clean or simple. Unfortunately I'm very engaged with my work at HashiCorp and don't have time to tackle something like this.

armon avatar Jun 30 '15 23:06 armon

Same functional request. Use case:

  1. Large bloom datasets are built offline "somewhere else" (version X).
  2. Many clients read the current version of these bloom filters (e.g. through "bloomd").
  3. New versions of these bloom datasets are built offline "somewhere else" (version X+1).
  4. How can the clients be switched smoothly from version X to version X+1?

jtv4k proposes "atomic rename" of the files, which seems difficult, according to Armon.

Can we imagine other solutions? (Using several bloomd instances in parallel, ...?)

Revolle avatar Dec 13 '16 16:12 Revolle

One proposal for the use case described just above: a load balancer, with a layer-7 dummy HTTP 200 health check, in front of two bloomd instances (bloomdA and bloomdB) with the same "tcp_port=8673". "bloomdA" has "data_dir=/mnt/bloomdA". "bloomdB" has "data_dir=/mnt/bloomdB".

  1. Version X is written in /mnt/bloomdA.
  2. bloomdA is started and declares itself up on layer 7 (responding 200 to the dummy HTTP request): the load balancer sends user requests to bloomdA.
  3. Version X+1 is written in /mnt/bloomdB.
  4. bloomdB is started and declares itself up on layer 7: the load balancer sends user requests to bloomdA (version X) and bloomdB (version X+1).
  5. bloomdA receives a "command-to-stop" (TBD) and declares itself down on layer 7 (responding 400 to the dummy HTTP request): the load balancer stops sending user requests to bloomdA (version X) and sends all new requests to bloomdB (version X+1). bloomdA can die smoothly.
  6. Version X+2 can be written on /mnt/bloomdA.
  7. etc.

This solution relies on a load balancer and needs two enhancements in bloomd:

  1. bloomd must answer a dummy "keep-alive" on layer-7 HTTP (dummy answer 200 when "on", or 400 when "off").
  2. bloomd must accept a "command-to-stop": answer 400 to the layer-7 HTTP keep-alive request, and die when no more requests are pending (or after a delay).
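The two proposed enhancements amount to a small drain state machine. The sketch below models only that logic, without any real HTTP server; `DrainableService` and its methods are hypothetical names and nothing like this exists in bloomd today.

```python
class DrainableService:
    """Models the proposed keep-alive and command-to-stop behaviour."""

    def __init__(self):
        self.accepting = True  # "on": health check answers 200
        self.pending = 0       # requests currently being served

    def health_status(self):
        # The load balancer polls this; 400 tells it to stop routing here.
        return 200 if self.accepting else 400

    def begin_request(self):
        # Requests may still arrive briefly after stop(), until the
        # load balancer notices the 400, so they are still served.
        self.pending += 1

    def end_request(self):
        self.pending -= 1

    def stop(self):
        # The "command-to-stop": report down, but keep serving.
        self.accepting = False

    def can_exit(self):
        # Safe to die once stopped and no request is in flight.
        return not self.accepting and self.pending == 0
```

A real version would also need the optional delay mentioned above, so that the process exits even if a request never completes.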

Revolle avatar Dec 13 '16 17:12 Revolle

Depending on the app, it might be simpler to simply coordinate a switchover from filterA to filterB. For example, using something like Consul (shameless plug, I apologize), you could set a key like service/foobar/bloomd_set = filterA and then update that key when the offline build is done. Applications can be edge-triggered when that key changes and start using a new key. This seems like the simplest option, as it requires no load balancer or bloomd changes. The client is just deciding which filter to use and something like Consul/ZooKeeper can be used to coordinate the change.
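A rough sketch of that client-side switch follows. The `KVStore` class is an in-memory stand-in for Consul/ZooKeeper, not their real APIs, and `FilterClient` and `check_fn` are hypothetical names; only the key `service/foobar/bloomd_set` comes from the comment above.

```python
class KVStore:
    """In-memory stand-in for a coordination service with watches."""

    def __init__(self):
        self._data = {}
        self._watchers = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value
        # Edge-trigger every watcher registered for this key.
        for callback in self._watchers.get(key, []):
            callback(value)

    def watch(self, key, callback):
        self._watchers.setdefault(key, []).append(callback)


class FilterClient:
    """Chooses which filter to query based on a key in the KV store."""

    def __init__(self, kv, key, check_fn):
        self._check_fn = check_fn  # e.g. wraps a bloomd "check" call
        self.active_filter = kv.get(key)
        kv.watch(key, self._on_change)

    def _on_change(self, new_filter):
        self.active_filter = new_filter

    def check(self, item):
        return self._check_fn(self.active_filter, item)
```

The offline build job would create and fill the new filter first, then call something like `kv.put("service/foobar/bloomd_set", "filterB")`; the old filter can be dropped once no client still uses it.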

armon avatar Dec 15 '16 05:12 armon

You are right!

I'm going in your direction: I am building something on top of bloomd to manage the use case.

Do you want me to answer my own question on GitHub?

Thank you for your answer.

Franck

Revolle avatar Dec 15 '16 16:12 Revolle