git2consul icon indicating copy to clipboard operation
git2consul copied to clipboard

Randomly deleting keys

Open guillaumeger opened this issue 8 years ago • 5 comments

It seems that git2consul is deleting keys when there is a problem with consul:

{"name":"git2consul","hostname":"git2consul-1489113718-3a5mk","pid":1,"level":40,"msg":"Failed to handle pull due to Error: consul: kv.get: rpc error: No cluster leader, clearing cache and retrying","time":"2016-11-23T23:56:35.436Z","v":0}

{"name":"git2consul","hostname":"git2consul-1489113718-3a5mk","pid":1,"level":40,"msg":"Purging branch cache /var/lib/git2consul_cache/lp/production for branch production in repo lp","time":"2016-11-23T23:56:35.436Z","v":0}

{"name":"git2consul","hostname":"git2consul-1489113718-3a5mk","pid":1,"level":10,"msg":"Deleting key lp/production/faq-bo","time":"2016-11-23T23:56:36.186Z","v":0}

{"name":"git2consul","hostname":"git2consul-1489113718-3a5mk","pid":1,"level":50,"msg":"Uncaught exception ReferenceError: key_name is not defined","time":"2016-11-23T23:56:36.234Z","v":0}

guillaumeger avatar Dec 02 '16 19:12 guillaumeger

Did some quick visual stack tracking and problem seems to likely stem from Branch.prototype.handleRefChange() -> Branch.prototype.pull() -> getLastProcessedRef() -> consul.kv.get().

handleRefChange() should perform retries on consul-related error instead for purging the cache.

calvn avatar Dec 07 '16 06:12 calvn

Hello calvn,

I was taking a look at the code and I'm trying to understand how did you figure out that the error came from a consul.kv.get call....

From what I understand after reading guillaumeger's logs, the key_name is not defined would come from "consul.kv.del" in the file_deleted function (right after Deleting key logger trace)

Thanks a lot for the clues!

samifruit514 avatar Mar 14 '17 20:03 samifruit514

@samifruit514 you are right, it seems that the exception is thrown by consul.kv.del().

  1. The consul network experiences a hiccup during polling.
  2. git2consul triggers a cache purge.
  3. Local cache is emptied which triggers deletion on KVs.
  4. consul.kv.del() gets called but it's unable to perform the deletion.

Now, I wonder why it thinks key_name is undefined if the previous message logged key_name as "lp/production/faq-bo".

calvn avatar Mar 15 '17 05:03 calvn

Hello calvn,

We just noticed that we had this error message when running with 0.12.12 which has few paths where key_name is not defined (in lib/consul/index.js):

cb('Failed to delete key ' + key_name + ' due to ' + err);

These calls are located in handle_expanded_keys_with_different_file_types function.

Again, this is in 0.12.12. 0.12.13 seem to had fixed this.

I will try to reproduce this error in 0.12.12 to be sure this wont happend in 0.12.13 once we run it in our production environment.

Thanks again!

samifruit514 avatar Mar 15 '17 14:03 samifruit514

hello @calvn ,

with 0.12.13, we have problems exactly like you described in your first reply in this thread. I made a PR that retries when the connection with consul fails.

Please let me know what you think about it https://github.com/Cimpress-MCP/git2consul/pull/143

Thanks a lot!

samifruit514 avatar Mar 21 '17 15:03 samifruit514