elastic
How to access a stored field and scroll_id
Hello, I am having difficulty retrieving a stored field and the scroll_id.
Stored field: The field is called "text", and in Kibana I can see that it is present in the index "document-000002". When I pass "text" as the value of the stored_fields parameter, it is not returned; the resulting list contains only "_index", "_type", "_id", "_score" and "_source" (first two lines of the code below). When I tried the line with the source parameter instead, the "_source" element was an empty list.
An exemplary record from ES, accessed via Kibana:
{
  "_index": "document-000002",
  "_type": "_doc",
  "_id": "AS_63689606",
  "_version": 1,
  "_score": 1,
  "_source": {
    "visitid": "65_63209606",
    "processingdate": "2022-08-24 17:24-0400",
    "gender": "male",
    "facility": "40998",
    "user": "JOHNDOE",
    "customer": "656"
  },
  "fields": {
    "processingdate": [
      "2022-08-24T21:24:00.000Z"
    ],
    "servicedate": [
      "2022-08-22T22:05:00.000Z"
    ],
    "text": [
      "an exemplary text I want to pull"
    ]
  }
}
Tried code:
library(elastic)
docs <- Search(c, "document-000002", size = 8, stored_fields = "text")$hits$hits
docs <- Search(c, "document-000002", size = 8, stored_fields = c("text", "servicedate"))$hits$hits
docs <- Search(c, "document-000002", size = 8, source = "text")$hits$hits
scroll_id: I would like to use the scroll parameter to pull more than the default 10K documents from the same index. It should be possible, because:
all_docs <- Search(conn = c, index = "document-000002")
all_docs$hits$total$value
all_docs$`_scroll_id`
The total hits amount to more than 8 million. However, the scroll ID is always NULL.
I would appreciate any help.
ES version in use: 7.3.1
elastic package version in use: 1.2.0
Did you try to set the time_scroll parameter of Search()?
See also https://docs.ropensci.org/elastic/articles/search.html#scrolling-search---instead-of-paging
@cphaarmeyer thank you! Specifying parameter time_scroll for Search() was sufficient to access _scroll_id.
Therefore, the working code looks like this:
all_docs <- Search(conn = c, index = "document-000002", time_scroll = "1m")
all_docs$`_scroll_id`
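Once _scroll_id is available, the remaining pages still have to be fetched one by one. Below is a minimal sketch of that loop, not a definitive implementation. The helper name collect_scroll and its fetch_next argument are hypothetical (they are not part of the elastic package); the commented usage assumes a connection object named conn and the package's scroll() and scroll_clear() functions.

```r
# Collect the hits of every page of a scrolled search.
# `first`      : result of the initial Search(..., time_scroll = ...) call
# `fetch_next` : function(scroll_id) returning the next page (hypothetical hook)
collect_scroll <- function(first, fetch_next) {
  out <- list()
  page <- first
  # An empty page of hits signals that the scroll is exhausted
  while (length(page$hits$hits) > 0) {
    out <- c(out, page$hits$hits)
    # Always use the scroll id of the most recent page
    page <- fetch_next(page$`_scroll_id`)
  }
  out
}

# Intended use against a live cluster (not run here; `conn` is assumed):
# library(elastic)
# first <- Search(conn, index = "document-000002", time_scroll = "1m", size = 1000)
# all_hits <- collect_scroll(first, function(id) scroll(conn, id, time_scroll = "1m"))
# scroll_clear(conn, first$`_scroll_id`)
```

With more than 8 million hits, growing a list in memory this way is likely to be slow, so processing or writing out each page inside the loop may be the better design.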
Do you have any thoughts on how to pull the stored field?
Do you mean something like this?
docs <- Search(conn = c, index = "document-000002", size = 6, body = list(`_source` = "text"))
lapply(docs$hits$hits, function(x) x[["_source"]][[1]])
Indeed, the code you propose is more or less what the Search() documentation suggests, and I also tried it (although not as part of the body). However, it doesn't work, because "text" is a stored field, not part of "_source" (see the structure of the record I pasted in my question). According to the documentation, it should be returned when the stored_fields parameter is specified, but that is not the case.
Oh sorry. Then I don't know. I have never seen such a setup.
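The question of why stored_fields returns nothing is left open in this thread. For reference, if a hit does come back with a "fields" element shaped like the Kibana record above, the value could be extracted as in the sketch below. The helper name get_field is hypothetical, and the hit is a hand-built mock rather than real Search() output. The commented variant that passes stored_fields in the request body is an untested assumption, not a confirmed fix.

```r
# Extract one entry of the "fields" element from each hit.
# Returns NA for hits that have no such field.
get_field <- function(hits, name) {
  lapply(hits, function(h) {
    v <- h[["fields"]][[name]]
    if (is.null(v)) NA_character_ else unlist(v)
  })
}

# Mock hit mirroring the structure of the record shown in the question
hit <- list(
  `_id` = "AS_63689606",
  fields = list(text = list("an exemplary text I want to pull"))
)
texts <- get_field(list(hit), "text")

# Untested variant (assumes a connection `conn`): request the stored field
# through the request body instead of the stored_fields argument.
# docs <- Search(conn, index = "document-000002", size = 8,
#                body = '{"stored_fields": ["text"]}')$hits$hits
# texts <- get_field(docs, "text")
```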