django-elasticsearch-dsl
django-elasticsearch-dsl copied to clipboard
Add atomic/zero-downtime index rebuilding
These changes allow for atomic/zero-downtime index rebuilding using aliases.
There are two new management commands: --atomic
and --atomic-no-delete
, both which only work with the --rebuild
command. If --atomic-no-delete
is used with --atomic
, the indices previously aliased will not be deleted. If --atomic-no-delete
isn't used, the old indices will be deleted.
Regarding naming, the defined index name in documents.py
becomes the alias name, and indices take the alias name plus a suffix (hyphen + current datetime). I took example from this.
2 big caveats:
- This only works in the simplest use case of just rebuilding and replacing indices - it won't work in the event of continuous updates to an index, as outlined in this article in the section "Wait, I know a case when this will not work…"
- As mentioned in this example (same example as above), in order to properly deserialize the data in searches, you will need to add this function to your documents in
documents.py
:
@classmethod
def _matches(cls, hit):
# override _matches to match indices in a pattern instead of just ALIAS
# hit is the raw dict as returned by elasticsearch
return fnmatch(hit["_index"], ALIAS + "-*")
For this to work, remember to add from fnmatch import fnmatch
to your imports in documents.py
. Also, replace ALIAS
with your alias name, or just make that argument one string, e.g. 'my-alias-*'
.
As noted in the example:
BlogPost._matches() class method is required for this code to work since otherwise BlogPost will not be used to deserialize the documents as those will have index set to the concrete index whereas the class refers to the alias.
I didn't get around to writing tests or adding to the documentation. With all that being said, please let me know if you have questions and/or suggestions!
This resolves #314
Thanks for the Pull request, I will review it and add my concern into it. @saadmk11 Can you take a look as well?
Looks like there is a similar PR https://github.com/django-es/django-elasticsearch-dsl/pull/284 that does this as a part of it but has been stale for a long time
Looks like there is a similar PR #284 that does this as a part of it but has been stale for a long time
I tried out the changes in that PR, and unless I misunderstood what was documented, it didn't do exactly what I was looking for. Regardless, yea, it's been stale for a while.
This seems like a good approach. Though I was thinking to different way to achieve it, but it will achieve the purpose with minimal changes. I will test it locally before mege. I have added a little fix over.
First of all, thanks for tackling this one @oehrlein
About the continuous updates to an index issue, I think we can mitigate it running combined your solution with populate command, this way:
First we run search_index --rebuild --atomic secondly we run a search_index populate in order to index documents that were inserted during the migration
One caveat is that items inserted during the migration will be inconsistent for a brief moment, the items will not be available for search until we run the populate command. I believe this is acceptable for search engines like ES that are not strongly consistent and the most important thing: we are not loosing any document during the migration.
Another caveat is that we will have to write documents 2 times in this can be an performance bottleneck for bigger clusters.
@safwanrahman Sounds good, I appreciate it!
@antunesleo No problem!
I think for now (and considering it doesn't require any changes to the code), the combination of those two commands works well enough.
Of course, it would be good to eventually try to have it similar to the article I linked in my original comment. Although, it's admittedly not a need of mine at the moment, so I won't be able to get around to it for the time being.
Codecov Report
Merging #358 (ce05715) into master (6470dd7) will decrease coverage by
8.95%
. The diff coverage is42.10%
.
@@ Coverage Diff @@
## master #358 +/- ##
==========================================
- Coverage 67.85% 58.90% -8.96%
==========================================
Files 12 12
Lines 756 910 +154
==========================================
+ Hits 513 536 +23
- Misses 243 374 +131
Impacted Files | Coverage Δ | |
---|---|---|
...sticsearch_dsl/management/commands/search_index.py | 28.70% <37.50%> (-13.50%) |
:arrow_down: |
django_elasticsearch_dsl/documents.py | 77.41% <100.00%> (+0.75%) |
:arrow_up: |
django_elasticsearch_dsl/fields.py | 61.45% <100.00%> (-0.20%) |
:arrow_down: |
django_elasticsearch_dsl/registries.py | 74.16% <100.00%> (ø) |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 79218c8...ce05715. Read the comment docs.
@safwanrahman @saadmk11 I made the changes for all your guys' suggestions, so it should all be good to merge!
Thanks a lot @oehrlein for your patience and patch that is very good design. @saadmk11 Can you test it with any application and check if it is working perfectly? I am being too busy with personal things so maybe it would be better if you can test it and let me know. I will merge it after you test it.
@safwanrahman No problem, and thanks!
@saadmk11 Let me know if anything needs to be changed!
@saadmk11 Let me know if anything needs to be changed!
Hi @oehrlein, Thanks for updating the PR with the requested changes. :100:
I tried to test these changes locally (by running python manage.py search_index --rebuild --use-alias
) on an application that already had an index created on the Elasticsearch cluster (using the rebuild command of the current version of django-elasticsearch-dsl
).
I am getting this error when I run the command:
Traceback (most recent call last):
File "manage.py", line 21, in <module>
main()
File "manage.py", line 17, in main
execute_from_command_line(sys.argv)
File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/django/core/management/__init__.py", line 381, in execute_from_command_line
utility.execute()
File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/django/core/management/__init__.py", line 375, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/django/core/management/base.py", line 323, in run_from_argv
self.execute(*args, **cmd_options)
File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/django/core/management/base.py", line 364, in execute
output = self.handle(*args, **options)
File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/django_elasticsearch_dsl/management/commands/search_index.py", line 262, in handle
self._rebuild(models, options)
File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/django_elasticsearch_dsl/management/commands/search_index.py", line 242, in _rebuild
options
File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/django_elasticsearch_dsl/management/commands/search_index.py", line 179, in _update_alias
es_conn.indices.update_aliases({"actions": alias_actions})
File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/elasticsearch/client/utils.py", line 301, in _wrapped
return func(*args, params=params, headers=headers, **kwargs)
File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/elasticsearch/client/indices.py", line 612, in update_aliases
"POST", "/_aliases", params=params, headers=headers, body=body
File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/elasticsearch/transport.py", line 458, in perform_request
raise e
File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/elasticsearch/transport.py", line 426, in perform_request
timeout=timeout,
File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/elasticsearch/connection/http_urllib3.py", line 287, in perform_request
self._raise_error(response.status, raw_data)
File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/elasticsearch/connection/base.py", line 338, in _raise_error
status_code, error_message, additional_info
elasticsearch.exceptions.RequestError: RequestError(400, 'invalid_alias_name_exception', 'Invalid alias name [some_index_name], an index exists with the same name as the alias')
(I think this is happening because we are trying to create an alias with the same name of the index. As the index name does not have any timestamp/uniqueness on the current implementation it is conflicting with the alias name)
But If I delete the index on ElasticSearch and then run the rebuild command it works as expected because the newly created index name has timestamp/uniqueness according to the PR implementation.
Please let me know If I need to add/update anything to test the changes of this PR. :)
@saadmk11 No problem!
Thanks for catching this error! I ran into it before, but forgot about it before my original commits.
I made adjustments to fix the issue. Now, when we want an alias to have the name of an index that already exists, it deletes the existing index as well. Likewise, I made changes so that when we want an index to have the name of an alias that already exists, it deletes the indices pointed at the alias to free up the name.
I also made adjustments to account for deleting indices that were created with aliases (after running --rebuild
with --use-alias
, run --delete
with --use-alias
to delete those indices).
In addition, I added additional stdout messages in case of doing something alias-specific, but only having regular indices, or vice-versa. For example, trying to delete regular indices after creating indices with an alias (after running --rebuild
with --use-alias
, run --delete
without --use-alias
) provides a message explaining the situation and how to resolve it.
Take a look at the changes when you get a chance, and let me know if you have any suggestions and/or think there should be changes!
@oehrlein Thanks a lot for updating. The test are failing unfortunately. 😢 Can you fix it?
@safwanrahman I fixed the issues from my end that were causing the tests to fail, but looks like tests are failing for everyone due to another issue (#377).
@safwanrahman I was able to also fix the test that was failing for everyone! All the checks have passed :)
@oehrlein Thanks a lot for fixing the tests and other fixes. As the test is failing, I have cherry-picked your test fix commit and pushed that commit only to master. I will merge this PR after @saadmk11 give it another try. @saadmk11 Will you have some time to review the PR code and test it locally please? It would be much appreciated.
@safwanrahman Thanks, and sounds good!
@saadmk11 Thanks for the kind words and the help + suggestions!
I took your suggestion regarding the _matches
function and added it in ce05715. While I understand that having multiple, similarly-named indices will cause matching issues, I think that's less likely to happen than someone trying to use aliases and forgetting to override the _matches
function on their own.
There's just 2 suggestions that I'm looking for your response for - everything else should be good! :)
I took your suggestion regarding the
_matches
function and added it in ce05715. While I understand that having multiple, similarly-named indices will cause matching issues, I think that's less likely to happen than someone trying to use aliases and forgetting to override the_matches
function on their own.
Thanks for adding this. I would like to know if @safwanrahman has any thoughts on this.
@saadmk11 No problem!
@safwanrahman Let us know!
I need to take a deeper look into the PR again. i was on a long vacation recently, so it may take some days to catch things up. I am sorry about that. I will reply the queries once I read the PR.
@safwanrahman Sounds good, and no worries! Hopefully the vacation was good :)
@safwanrahman Any luck at being able to take a look at this?
@safwanrahman Just wanted to follow up again - any luck at being able to take a look at this?
We're really like to see this too @safwanrahman if you are able to review. Thank you for your time
Hi, any news about the merge of this really useful feature?
i found that this source code does not erase new indexes when an error occurs in process of creating alias
Would be great to see this PR merged!!
@oehrlein any chance you are considering merging this?