django-elasticsearch-dsl icon indicating copy to clipboard operation
django-elasticsearch-dsl copied to clipboard

Add atomic/zero-downtime index rebuilding

Open oehrlein opened this issue 3 years ago • 33 comments

These changes allow for atomic/zero-downtime index rebuilding using aliases.

There are two new management commands: --atomic and --atomic-no-delete, both which only work with the --rebuild command. If --atomic-no-delete is used with --atomic, the indices previously aliased will not be deleted. If --atomic-no-delete isn't used, the old indices will be deleted.

Regarding naming, the defined index name in documents.py becomes the alias name, and indices take the alias name plus a suffix (hyphen + current datetime). I took example from this.

2 big caveats:

  1. This only works in the simplest use case of just rebuilding and replacing indices - it won't work in the event of continuous updates to an index, as outlined in this article in the section "Wait, I know a case when this will not work…"
  2. As mentioned in this example (same example as above), in order to properly deserialize the data in searches, you will need to add this function to your documents in documents.py:
@classmethod
def _matches(cls, hit):
    # override _matches to match indices in a pattern instead of just ALIAS
    # hit is the raw dict as returned by elasticsearch
    return fnmatch(hit["_index"], ALIAS + "-*")

For this to work, remember to add from fnmatch import fnmatch to your imports in documents.py. Also, replace ALIAS with your alias name, or just make that argument one string, e.g. 'my-alias-*'.

As noted in the example:

BlogPost._matches() class method is required for this code to work since otherwise BlogPost will not be used to deserialize the documents as those will have index set to the concrete index whereas the class refers to the alias.

I didn't get around to writing tests or adding to the documentation. With all that being said, please let me know if you have questions and/or suggestions!

oehrlein avatar Jul 09 '21 09:07 oehrlein

This resolves #314

oehrlein avatar Jul 09 '21 09:07 oehrlein

Thanks for the Pull request, I will review it and add my concern into it. @saadmk11 Can you take a look as well?

safwanrahman avatar Jul 19 '21 20:07 safwanrahman

Looks like there is a similar PR https://github.com/django-es/django-elasticsearch-dsl/pull/284 that does this as a part of it but has been stale for a long time

saadmk11 avatar Jul 28 '21 07:07 saadmk11

Looks like there is a similar PR #284 that does this as a part of it but has been stale for a long time

I tried out the changes in that PR, and unless I misunderstood what was documented, it didn't do exactly what I was looking for. Regardless, yea, it's been stale for a while.

oehrlein avatar Jul 29 '21 02:07 oehrlein

This seems like a good approach. Though I was thinking to different way to achieve it, but it will achieve the purpose with minimal changes. I will test it locally before mege. I have added a little fix over.

safwanrahman avatar Aug 03 '21 13:08 safwanrahman

First of all, thanks for tackling this one @oehrlein

About the continuous updates to an index issue, I think we can mitigate it running combined your solution with populate command, this way:

First we run search_index --rebuild --atomic secondly we run a search_index populate in order to index documents that were inserted during the migration

One caveat is that items inserted during the migration will be inconsistent for a brief moment, the items will not be available for search until we run the populate command. I believe this is acceptable for search engines like ES that are not strongly consistent and the most important thing: we are not loosing any document during the migration.

Another caveat is that we will have to write documents 2 times in this can be an performance bottleneck for bigger clusters.

antunesleo avatar Aug 03 '21 14:08 antunesleo

@safwanrahman Sounds good, I appreciate it!

oehrlein avatar Aug 04 '21 08:08 oehrlein

@antunesleo No problem!

I think for now (and considering it doesn't require any changes to the code), the combination of those two commands works well enough.

Of course, it would be good to eventually try to have it similar to the article I linked in my original comment. Although, it's admittedly not a need of mine at the moment, so I won't be able to get around to it for the time being.

oehrlein avatar Aug 04 '21 09:08 oehrlein

Codecov Report

Merging #358 (ce05715) into master (6470dd7) will decrease coverage by 8.95%. The diff coverage is 42.10%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #358      +/-   ##
==========================================
- Coverage   67.85%   58.90%   -8.96%     
==========================================
  Files          12       12              
  Lines         756      910     +154     
==========================================
+ Hits          513      536      +23     
- Misses        243      374     +131     
Impacted Files Coverage Δ
...sticsearch_dsl/management/commands/search_index.py 28.70% <37.50%> (-13.50%) :arrow_down:
django_elasticsearch_dsl/documents.py 77.41% <100.00%> (+0.75%) :arrow_up:
django_elasticsearch_dsl/fields.py 61.45% <100.00%> (-0.20%) :arrow_down:
django_elasticsearch_dsl/registries.py 74.16% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 79218c8...ce05715. Read the comment docs.

codecov-commenter avatar Sep 16 '21 02:09 codecov-commenter

@safwanrahman @saadmk11 I made the changes for all your guys' suggestions, so it should all be good to merge!

oehrlein avatar Sep 16 '21 04:09 oehrlein

Thanks a lot @oehrlein for your patience and patch that is very good design. @saadmk11 Can you test it with any application and check if it is working perfectly? I am being too busy with personal things so maybe it would be better if you can test it and let me know. I will merge it after you test it.

safwanrahman avatar Sep 16 '21 08:09 safwanrahman

@safwanrahman No problem, and thanks!

@saadmk11 Let me know if anything needs to be changed!

oehrlein avatar Sep 16 '21 08:09 oehrlein

@saadmk11 Let me know if anything needs to be changed!

Hi @oehrlein, Thanks for updating the PR with the requested changes. :100:

I tried to test these changes locally (by running python manage.py search_index --rebuild --use-alias) on an application that already had an index created on the Elasticsearch cluster (using the rebuild command of the current version of django-elasticsearch-dsl).

I am getting this error when I run the command:

Traceback (most recent call last):
  File "manage.py", line 21, in <module>
    main()
  File "manage.py", line 17, in main
    execute_from_command_line(sys.argv)
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/django/core/management/__init__.py", line 381, in execute_from_command_line
    utility.execute()
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/django/core/management/__init__.py", line 375, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/django/core/management/base.py", line 323, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/django/core/management/base.py", line 364, in execute
    output = self.handle(*args, **options)
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/django_elasticsearch_dsl/management/commands/search_index.py", line 262, in handle
    self._rebuild(models, options)
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/django_elasticsearch_dsl/management/commands/search_index.py", line 242, in _rebuild
    options
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/django_elasticsearch_dsl/management/commands/search_index.py", line 179, in _update_alias
    es_conn.indices.update_aliases({"actions": alias_actions})
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/elasticsearch/client/utils.py", line 301, in _wrapped
    return func(*args, params=params, headers=headers, **kwargs)
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/elasticsearch/client/indices.py", line 612, in update_aliases
    "POST", "/_aliases", params=params, headers=headers, body=body
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/elasticsearch/transport.py", line 458, in perform_request
    raise e
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/elasticsearch/transport.py", line 426, in perform_request
    timeout=timeout,
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/elasticsearch/connection/http_urllib3.py", line 287, in perform_request
    self._raise_error(response.status, raw_data)
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/elasticsearch/connection/base.py", line 338, in _raise_error
    status_code, error_message, additional_info
elasticsearch.exceptions.RequestError: RequestError(400, 'invalid_alias_name_exception', 'Invalid alias name [some_index_name], an index exists with the same name as the alias')

(I think this is happening because we are trying to create an alias with the same name of the index. As the index name does not have any timestamp/uniqueness on the current implementation it is conflicting with the alias name)

But If I delete the index on ElasticSearch and then run the rebuild command it works as expected because the newly created index name has timestamp/uniqueness according to the PR implementation.

Please let me know If I need to add/update anything to test the changes of this PR. :)

saadmk11 avatar Sep 26 '21 17:09 saadmk11

@saadmk11 No problem!

Thanks for catching this error! I ran into it before, but forgot about it before my original commits.

I made adjustments to fix the issue. Now, when we want an alias to have the name of an index that already exists, it deletes the existing index as well. Likewise, I made changes so that when we want an index to have the name of an alias that already exists, it deletes the indices pointed at the alias to free up the name.

I also made adjustments to account for deleting indices that were created with aliases (after running --rebuild with --use-alias, run --delete with --use-alias to delete those indices).

In addition, I added additional stdout messages in case of doing something alias-specific, but only having regular indices, or vice-versa. For example, trying to delete regular indices after creating indices with an alias (after running --rebuild with --use-alias, run --delete without --use-alias) provides a message explaining the situation and how to resolve it.

Take a look at the changes when you get a chance, and let me know if you have any suggestions and/or think there should be changes!

oehrlein avatar Sep 30 '21 07:09 oehrlein

@oehrlein Thanks a lot for updating. The test are failing unfortunately. 😢 Can you fix it?

safwanrahman avatar Sep 30 '21 09:09 safwanrahman

@safwanrahman I fixed the issues from my end that were causing the tests to fail, but looks like tests are failing for everyone due to another issue (#377).

oehrlein avatar Sep 30 '21 19:09 oehrlein

@safwanrahman I was able to also fix the test that was failing for everyone! All the checks have passed :)

oehrlein avatar Oct 01 '21 01:10 oehrlein

@oehrlein Thanks a lot for fixing the tests and other fixes. As the test is failing, I have cherry-picked your test fix commit and pushed that commit only to master. I will merge this PR after @saadmk11 give it another try. @saadmk11 Will you have some time to review the PR code and test it locally please? It would be much appreciated.

safwanrahman avatar Oct 01 '21 20:10 safwanrahman

@safwanrahman Thanks, and sounds good!

oehrlein avatar Oct 02 '21 02:10 oehrlein

@saadmk11 Thanks for the kind words and the help + suggestions!

I took your suggestion regarding the _matches function and added it in ce05715. While I understand that having multiple, similarly-named indices will cause matching issues, I think that's less likely to happen than someone trying to use aliases and forgetting to override the _matches function on their own.

There's just 2 suggestions that I'm looking for your response for - everything else should be good! :)

oehrlein avatar Oct 06 '21 08:10 oehrlein

I took your suggestion regarding the _matches function and added it in ce05715. While I understand that having multiple, similarly-named indices will cause matching issues, I think that's less likely to happen than someone trying to use aliases and forgetting to override the _matches function on their own.

Thanks for adding this. I would like to know if @safwanrahman has any thoughts on this.

saadmk11 avatar Oct 14 '21 14:10 saadmk11

@saadmk11 No problem!

@safwanrahman Let us know!

oehrlein avatar Oct 14 '21 17:10 oehrlein

I need to take a deeper look into the PR again. i was on a long vacation recently, so it may take some days to catch things up. I am sorry about that. I will reply the queries once I read the PR.

safwanrahman avatar Oct 18 '21 23:10 safwanrahman

@safwanrahman Sounds good, and no worries! Hopefully the vacation was good :)

oehrlein avatar Oct 19 '21 01:10 oehrlein

@safwanrahman Any luck at being able to take a look at this?

oehrlein avatar Nov 19 '21 06:11 oehrlein

@safwanrahman Just wanted to follow up again - any luck at being able to take a look at this?

oehrlein avatar Jan 29 '22 06:01 oehrlein

We're really like to see this too @safwanrahman if you are able to review. Thank you for your time

violuke avatar Mar 02 '22 11:03 violuke

Hi, any news about the merge of this really useful feature?

rsommerard avatar Mar 21 '22 16:03 rsommerard

i found that this source code does not erase new indexes when an error occurs in process of creating alias

Root-kjh avatar Apr 04 '22 06:04 Root-kjh

Would be great to see this PR merged!!

@oehrlein any chance you are considering merging this?

phpaulin avatar Jul 18 '22 14:07 phpaulin