elasticsearch-net
elasticsearch-net copied to clipboard
Default index is not used in query with 8.13.15 client
Elastic.Clients.Elasticsearch version: 8.13.15
Elasticsearch version: 8.13.2
.NET runtime version: .NET 6.0.28 and .NET Framework 4.8.9232.0
Operating system version: Windows 10 Pro
Description of the problem including expected versus actual behavior:
I set the DefaultIndex to "features" while configuring ElasticsearchClientSettings, but the request was sent without using this DefaultIndex.
searchResult.DebugInformation: Valid Elasticsearch response built from a successful (200) low-level call on POST: /_search?typed_keys=true.
similar to https://github.com/elastic/elasticsearch-net/issues/8151
Steps to reproduce:
var connectionSettings = new ElasticsearchClientSettings(settings.ReadonlyHost);
connectionSettings.DefaultIndex("features");
connectionSettings.RequestTimeout(TimeSpan.FromMinutes(1));
connectionSettings.EnableHttpCompression();
connectionSettings.Authentication(new BasicAuthentication(settings.Login, settings.Password));
var client = new ElasticsearchClient(connectionSettings);
var searchResult = await client.SearchAsync<JsonElement>(s => s
.Aggregations(aggs => aggs.Add("ClassName", agg => agg.Terms(dh => dh.Field("ClassId").Size(100)))));
// or
var searchRequest = new SearchRequest()
{
Size = 10,
Query = new TermQuery(new Field("ClassId")) { Value = FieldValue.Long(75) }
};
var searchResult2 = await client.SearchAsync<JsonElement>(searchRequest);
Expected behavior
I expect http request with index POST: features/_search?typed_keys=true
Hi @krasheninnik, this is a similar case as described in #8184
Currently this is the expected behavior as stated here.
However, it seems like this is not what users intuitively expect when setting a DefaultIndex or DefaultMapping. I will do some research on how we could achieve the best developer experience in this case, before I decide how the index inferrence should work in the future.
This may well be true for DefaultIndex, but if the user specifies a default index for a specific type, and calls the API with that specific type as the generic argument, then I believe the SDK must use that default index..
@jonyadamit I agree, that the current behavior might not be optimal and, like stated above, I'm open to improve the default 🙂 I just have not decided how exactly the best default behavior would look like.
Even when the user specifies an index mapping for a specific type, let's say LogEntry => "default_log_index", this creates potentially unwanted behavior in APIs that optionally accept multiple indices (Indices parameter). For example, let's say we use the Search<LogEntry>() method.
The default behavior would be to return all LogEntry items in all indices. There might be rotating log indices like e.g. log_07_2024, log_08_2024, etc. and all of these indices would be searched.
Setting the default index/mapping for LogEntry changes semantics so that only default_log_index would be searched. To explicitly revert the behavior, the user must call .Index(Indices.All) or set Indices to null. In my opinion this is not intuitive.
It's totally fine for APIs that only accept a single index (IndexName parameter), because all of these ones actually require an index to be set. Not setting it will cause an error.
I see what you mean.
I am migrating from version 7 to version 8, and the default behavior failed miserably because it tried to deserialize a different type. So I guess the current default behavior doesn't return all LogEntry items in all indices, but rather all items everywhere.
This is regarding a simple query such as await client.SearchAsync<LogEntry>(s => s.Query(new MatchAllQuery())).ConfigureAwait(false);. This definitely doesn't seem right.
I suppose we can let the user decide the default behavior with another setting.
At the minimum, it would be helpful to update the migration guide.
@jonyadamit
So I guess the current default behavior doesn't return all LogEntry items in all indices, but rather all items everywhere.
That's correct. By default, Elasticsearch will return all matching documents (literally all documents in your match_all example). Elasticsearch used to have a _type meta-data that was used to discriminate different types. This no longer exists out of the box, but I see a lot of users adding custom type fields to their documents. A real-world query would contain at least one condition to check for the correct type term (e.g. log_entry).
As I said, I agree that the current behavior is not ideal either and I'm definitely open to improve it.
Come to think of it, in my opinion at least, If the user specifies a default index, they expect it to be used even with APIs that optionally accept indices. If the user has multiple indices and plan to use them, they will need to specify them anyhow, using wildcards or aliases.. and if they truly want to search all indices then it is very legitimate to specify .Index(Indices.All) or not to set a default index for the type. Just my opinion.
Is there any development on this? Am I correct in understanding that if I've previously been relying on a defaultIndex setting I will now have to specifically set the index I'm querying against for every search request, or did I misunderstand something?
@stefanobranco I don't know if that will help you in any way, but ElasticsearchClient is holding all the mappings in the ElasticsearchClientSettings.DefaultIndices so you're able to create some extension methods like:
private static string GetIndexFor<T>(this ElasticsearchClient elasticsearch) =>
elasticsearch.ElasticsearchClientSettings.DefaultIndices.TryGetValue(typeof(T), out var indexName) ? indexName : throw new ArgumentException($"Unable to find mapping for {typeof(T).Name}");
public static SearchResponse<T> SearchIndex<T>(this ElasticsearchClient elasticClient, Action<SearchRequestDescriptor<T>> desc) where T : class =>
elasticClient.Search<T>(s => desc(s.Indices(elasticClient.GetIndexFor<T>())));
It's oversimplified and It's not ideal but I think it can be used as alternative to setting index name all the time.