spring-data-elasticsearch
Support the Task API
Spring Data Elasticsearch should add support for the Task API (https://www.elastic.co/guide/en/elasticsearch/reference/current/tasks.html):
- add methods to the `ClusterOperations` according to the ES API
- add methods like `submitDelete(Query)` (https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html#docs-delete-by-query-task-api) that create tasks
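To make the request concrete, here is a minimal sketch of the REST endpoints that such methods would end up calling. The endpoint paths and the `nodeId:taskNumber` id format come from the Elasticsearch documentation; the class names `TaskApiPaths` and `EsTaskId` are hypothetical helpers invented for illustration and are not part of Spring Data Elasticsearch.

```java
/** Hypothetical helper: builds the REST paths used when a delete-by-query runs as a task. */
final class TaskApiPaths {

    private TaskApiPaths() {}

    /** POST: with wait_for_completion=false Elasticsearch returns a task id instead of blocking. */
    static String submitDeleteByQuery(String index) {
        return "/" + index + "/_delete_by_query?wait_for_completion=false";
    }

    /** GET: returns the current status of a single task. */
    static String taskStatus(String taskId) {
        return "/_tasks/" + taskId;
    }

    /** POST: asks Elasticsearch to cancel a running task. */
    static String cancelTask(String taskId) {
        return "/_tasks/" + taskId + "/_cancel";
    }
}

/** Hypothetical value class for a task id of the form "<nodeId>:<taskNumber>". */
final class EsTaskId {

    final String nodeId;
    final long number;

    EsTaskId(String nodeId, long number) {
        this.nodeId = nodeId;
        this.number = number;
    }

    /** Splits at the last colon; rejects values without a node id or a task number. */
    static EsTaskId parse(String value) {
        int sep = value.lastIndexOf(':');
        if (sep <= 0 || sep == value.length() - 1) {
            throw new IllegalArgumentException("not a task id: " + value);
        }
        return new EsTaskId(value.substring(0, sep), Long.parseLong(value.substring(sep + 1)));
    }

    @Override
    public String toString() {
        return nodeId + ":" + number;
    }
}
```

A `submitDelete(Query)` method would presumably send the first request and hand the returned task id back to the caller, who can then use it with the other two endpoints.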
Could we talk a little bit about this?
add methods to the ClusterOperations according to the ES API
Which methods do you mean? Methods that return all active tasks with details?
add methods like `submitDelete(Query)`
These methods should be placed in `DocumentOperations`, I guess?
What should the shape of the `Task` class be? I think it could be something like this:
import java.util.List;
import org.elasticsearch.tasks.TaskId;

class Task {
    private String node;
    private List<ChildTask> childTasks;
    // and a few more properties I think
}

class ChildTask {
    private TaskId taskId;
    // and a few more properties I think
}
I would also like to discuss how someone should check whether a task is done, and how to cancel a particular task. I mean the high-level usage of this feature.
I just had a look at the code of the `RestHighLevelClient` in version 7.15.1. There the tasks API is implemented in its own `TasksClient` (https://github.com/elastic/elasticsearch/blob/master/client/rest-high-level/src/main/java/org/elasticsearch/client/TasksClient.java). Therefore, in Spring Data Elasticsearch this should be defined in a `(Reactive)TasksOperations` interface and implemented in `(Reactive)TasksTemplate` classes that integrate with the existing implementations.
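As a rough illustration of the high-level usage asked about earlier (checking whether a task is done, cancelling it), the following sketch shows one possible shape for an imperative `TasksOperations` interface. Everything here is an assumption made for discussion: the method names `get` and `cancel`, the `TaskInfo` fields, the `TaskPoller` helper and the in-memory stand-in do not exist in Spring Data Elasticsearch, and the real properties would follow the `org.elasticsearch.client.tasks` package.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical task data; a real class would mirror the client's response objects. */
final class TaskInfo {
    final String taskId;
    final String action;
    final boolean completed;

    TaskInfo(String taskId, String action, boolean completed) {
        this.taskId = taskId;
        this.action = action;
        this.completed = completed;
    }
}

/** Hypothetical operations interface (imperative variant only). */
interface TasksOperations {
    /** Returns the task with the given id, or empty if it is unknown. */
    Optional<TaskInfo> get(String taskId);

    /** Returns true if a task with the given id was found and cancelled. */
    boolean cancel(String taskId);
}

/** Polls a task a bounded number of times until it reports completion. */
final class TaskPoller {

    private TaskPoller() {}

    static boolean awaitCompletion(TasksOperations ops, String taskId, int maxAttempts, long delayMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            Optional<TaskInfo> info = ops.get(taskId);
            if (info.isPresent() && info.get().completed) {
                return true;
            }
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}

/** In-memory stand-in so the sketch runs without a cluster. */
final class InMemoryTasksOperations implements TasksOperations {
    private final Map<String, TaskInfo> tasks = new ConcurrentHashMap<>();

    void put(TaskInfo info) {
        tasks.put(info.taskId, info);
    }

    @Override
    public Optional<TaskInfo> get(String taskId) {
        return Optional.ofNullable(tasks.get(taskId));
    }

    @Override
    public boolean cancel(String taskId) {
        return tasks.remove(taskId) != null;
    }
}
```

A caller would submit a long-running operation, keep the returned task id, and then either poll with `awaitCompletion` or call `cancel`. A reactive variant would return publishers instead of plain values and is left out here.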
As to what properties should be in a task, this should reflect the data that is available in the `org.elasticsearch.client.tasks` package of the `RestHighLevelClient`.
But I currently see two problems.
- The first is that the tasks API is still marked as a beta feature (https://www.elastic.co/guide/en/elasticsearch/reference/current/tasks.html).
- The second is that the tasks API was introduced into the client code after version 7.10, which means that it is not licensed under the Apache2 license. While we can use the classes from the `RestHighLevelClient` to access that API, we cannot easily implement this in the reactive code. For the reactive code, Spring Data Elasticsearch contains modified copies of some classes from the core Elasticsearch libraries for creating request objects and converting response objects. It was no problem to copy and modify that code when we implemented the reactive code, but we cannot take the new parts needed for the tasks API as we did before, because these now have a different license.
Elasticsearch is working on providing a new client (https://github.com/elastic/elasticsearch-java), and I am currently working on integrating this client, first as an alternative to and later as a replacement for the `RestHighLevelClient`. The request and response classes from this new client are Apache2 licensed, and Spring Data Elasticsearch will use these for both imperative and reactive code.
So I would defer implementing the tasks API in Spring Data Elasticsearch until the new client is integrated. I would not want to have it implemented only for the imperative code but not for the reactive code.
In the current 8.12 version this feature is still in beta status. I'll close this issue for now. When the Task API is stable, we should create a new issue.