User searching for first and lastname not possible using ElasticSearch search engine

Open nikos opened this issue 4 years ago • 18 comments

When using FusionAuth 1.15.5 with the Java SDK it seems that the documented fields (see https://fusionauth.io/docs/v1/tech/apis/users#search-for-users) do NOT work:

  • last_name
  • first_name
  • full_name

Only the field named fullName (camel case) returns the expected users.

Kotlin example:

        val response = fusionAuthClient.searchUsersByQuery(
                SearchRequest(UserSearchCriteria().apply {
                    queryString = "fullName:Joe"
                    startRow = offset
                    numberOfResults = pageSize
                    sortFields = listOf(SortField("username"))
                }))

Related

  • https://github.com/FusionAuth/fusionauth-issues/issues/30
  • https://github.com/FusionAuth/fusionauth-issues/issues/602
  • https://github.com/FusionAuth/fusionauth-issues/issues/1639
  • https://github.com/FusionAuth/fusionauth-issues/issues/2236

nikos avatar Jun 30 '20 13:06 nikos

If you're not using Elasticsearch, the ES Query String DSL does not work. Your query should be queryString = "Joe".

There is also a doc bug: those fields should be lastName, firstName and fullName. However, the note in the doc is only meant to say that those are the only fields that will be searched.

robotdan avatar Jun 30 '20 14:06 robotdan

@mooreds do you want to take a look at our search docs and see if we need to clarify how to use the search APIs when not using Elasticsearch?

In this case @nikos was using fullName:Joe, but we don't support this DSL without Elasticsearch. This may not be clear in our documentation. The doc is intended to indicate that when you search with a plain string, we compare it against the documented fields, but you can't query on them directly using the fullName: notation.
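To make the distinction concrete, here is a minimal Kotlin sketch of choosing the query string by search engine. The SearchEngine enum and the nameQuery helper are hypothetical and not part of the FusionAuth client; only the two resulting query strings come from this thread.

```kotlin
// Hypothetical helper types, assumed for illustration only.
enum class SearchEngine { DATABASE, ELASTICSEARCH }

// Returns the value to assign to UserSearchCriteria.queryString.
fun nameQuery(engine: SearchEngine, term: String): String = when (engine) {
    // Database engine: a plain term, compared against the documented fields.
    SearchEngine.DATABASE -> term
    // Elasticsearch engine: Query String DSL with a field prefix.
    SearchEngine.ELASTICSEARCH -> "fullName:$term"
}

fun main() {
    println(nameQuery(SearchEngine.DATABASE, "Joe"))       // Joe
    println(nameQuery(SearchEngine.ELASTICSEARCH, "Joe"))  // fullName:Joe
}
```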

robotdan avatar Jun 30 '20 14:06 robotdan

@robotdan Thanks for coming back so quickly to my question.

I am a bit confused, since I thought FusionAuth was currently only available with ElasticSearch as the search backend (at least the last time I tried it, possibly back in 1.11, startup failed when ES was missing). So we are using an ElasticSearch full-text index; the mapping looks like:

{
    "fusionauth_user": {
        "mappings": {
            "_doc": {
                "_source": {
                    "enabled": false
                },
                "properties": {
                    "active": {
                        "type": "boolean"
                    },
                    "birthDate": {
                        "type": "date"
                    },
                    "data": {
                        "properties": {
                            "email": {
                                "type": "text",
                                "fields": {
                                    "keyword": {
                                        "type": "keyword",
                                        "ignore_above": 256
                                    }
                                }
                            },
                            "emobilityId": {
                                "type": "long"
                            },
                            "mainMandant": {
                                "type": "text",
                                "fields": {
                                    "keyword": {
                                        "type": "keyword",
                                        "ignore_above": 256
                                    }
                                }
                            },
                            "technicalUser": {
                                "type": "boolean"
                            }
                        }
                    },
                    "email": {
                        "type": "text",
                        "analyzer": "exact_lower",
                        "fielddata": true
                    },
                    "fullName": {
                        "type": "text",
                        "fielddata": true
                    },
                    "id": {
                        "type": "keyword"
                    },
                    "insertInstant": {
                        "type": "date"
                    },
                    "lastLoginInstant": {
                        "type": "date"
                    },
                    "login": {
                        "type": "keyword"
                    },
                    "memberships": {
                        "properties": {
                            "data": {
                                "type": "object"
                            },
                            "groupId": {
                                "type": "text",
                                "fields": {
                                    "keyword": {
                                        "type": "keyword",
                                        "ignore_above": 256
                                    }
                                }
                            },
                            "id": {
                                "type": "text",
                                "fields": {
                                    "keyword": {
                                        "type": "keyword",
                                        "ignore_above": 256
                                    }
                                }
                            },
                            "insertInstant": {
                                "type": "long"
                            },
                            "userId": {
                                "type": "text",
                                "fields": {
                                    "keyword": {
                                        "type": "keyword",
                                        "ignore_above": 256
                                    }
                                }
                            }
                        }
                    },
                    "registrations": {
                        "type": "nested",
                        "include_in_parent": true,
                        "properties": {
                            "applicationId": {
                                "type": "keyword"
                            },
                            "data": {
                                "type": "object"
                            },
                            "id": {
                                "type": "keyword"
                            },
                            "insertInstant": {
                                "type": "date"
                            },
                            "lastLoginInstant": {
                                "type": "date"
                            },
                            "preferredLanguages": {
                                "type": "text",
                                "fields": {
                                    "keyword": {
                                        "type": "keyword",
                                        "ignore_above": 256
                                    }
                                }
                            },
                            "roles": {
                                "type": "keyword"
                            },
                            "tokens": {
                                "type": "object"
                            },
                            "username": {
                                "type": "text",
                                "fields": {
                                    "keyword": {
                                        "type": "keyword",
                                        "ignore_above": 256
                                    }
                                }
                            },
                            "usernameStatus": {
                                "type": "text",
                                "fields": {
                                    "keyword": {
                                        "type": "keyword",
                                        "ignore_above": 256
                                    }
                                }
                            },
                            "verified": {
                                "type": "boolean"
                            }
                        }
                    },
                    "tenantId": {
                        "type": "keyword"
                    },
                    "timezone": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "username": {
                        "type": "text",
                        "fielddata": true
                    },
                    "verified": {
                        "type": "boolean"
                    }
                }
            }
        }
    }
}

So it seems that, for example, fullName is fine, but there are no fields for firstName and lastName: are those hitting the DB first by the Search API endpoint before falling back to the ES full-text search capabilities? Another strange thing: searching for lastName works fine, but when specifying a value for username or firstName all users are returned no matter what the search value looks like?

Sorry for bringing up many questions at once ;-)

nikos avatar Jun 30 '20 14:06 nikos

@nikos as of release 1.16, elasticsearch is optional. More details here: https://fusionauth.io/docs/v1/tech/release-notes#version-1-16-0-rc-1 ("Support for using the database as the user search engine. ")

Here's a doc about switching between them: https://fusionauth.io/docs/v1/tech/tutorials/switch-search-engines

mooreds avatar Jun 30 '20 15:06 mooreds

Regarding your questions about the mapping, this is working as designed. I think the docs need to make it clear that the firstName- and lastName-specific field searches only work with the database search engine (though of course we could change that; if it is important to you, please file a feature request).

are those hitting the DB first by the Search API endpoint before falling back to the ES full-text search capabilities

There are no dependencies between the engines. If you are using Elasticsearch, it gets the whole query; the same is true of the database search engine (except when you are searching only by user id).

but when specifying a value for username or firstName all users are returned no matter what the search value looks like?

How are you building those queries? Can you provide examples? It looks like the admin UI just uses a query_string when searching on username and doesn't specify the actual field.

Hope this helps.

mooreds avatar Jun 30 '20 15:06 mooreds

Thanks @mooreds for taking action and clarifying the different aspects of the database and full-text/ES search capabilities (side note: I would prefer if the search client does not have to distinguish between the search engines, but can use the same field names).

Coming back to my original question: when using the FusionAuth Java client API (1.15.4) against a FusionAuth+ES (1.15.5) server, I was really confused that searching on lastName does return the expected users (for my Kotlin example, please see my original posting above), but searching for firstName returns all users. Note that both fields seem not to be explicitly mapped into ES documents (see mapping).

What do you think could have gone wrong in my case? Is the search value expected to be put into double or single quotes? How is whitespace supposed to be supported: firstName:'Joe Foo' ?

nikos avatar Jun 30 '20 19:06 nikos

side note: I would prefer if the search client does not have to distinguish between the search engines, but can use the same field names

Hmmm. What do you mean? You can use queryString for both, it just has different limitations. I'm not sure what you mean.

Note that both fields seem not to be explicitly mapped into ES documents (see mapping).

I only see fullName in the mapping. What am I missing?

We don't create a field on firstName or lastName, but you can use wildcards to search on them in the queryString:

queryString = "Joe*"

queryString = "*Smith"

I realize that doesn't quite get you what you want, though. However, we can keep this open as a feature request to index firstName and lastName.

mooreds avatar Jun 30 '20 19:06 mooreds

This is working when querying users with the Java client API: queryString = "firstName:Joe" as opposed to queryString = "lastName:Smith" (these are just sample values, of course; in reality they match existing users and their respective first and last names ;-)

nikos avatar Jun 30 '20 21:06 nikos

For an application developer, it should make no difference whether the search uses the database or Elasticsearch: the field names should be the same, rather than some fields existing only when the search engine type is database.

nikos avatar Jun 30 '20 21:06 nikos

Added some more general questions on how to use the user search in detail over at https://github.com/FusionAuth/fusionauth-site/pull/118#issuecomment-652205369

nikos avatar Jul 01 '20 05:07 nikos

I revised my code and currently use the query string fullName:Joe* as the first-name equivalent and fullName:*Smith to search for the last name, until first and last name are hopefully also supported with ElasticSearch as the search engine.
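As a rough sketch, the workaround can be wrapped in two small Kotlin helpers. The function names are hypothetical; only the resulting query strings are taken from this comment. Values containing whitespace or reserved query-string characters are not handled here, and because fullName is analyzed, the wildcards may also match other parts of the name.

```kotlin
// Hypothetical helpers (not part of the FusionAuth client); they only build
// the query strings described in the comment above.

// Trailing wildcard: intended to match fullName values starting with the first name.
fun firstNameQuery(firstName: String): String = "fullName:$firstName*"

// Leading wildcard: intended to match fullName values ending with the last name.
fun lastNameQuery(lastName: String): String = "fullName:*$lastName"

fun main() {
    println(firstNameQuery("Joe"))   // fullName:Joe*
    println(lastNameQuery("Smith"))  // fullName:*Smith
}
```

Either result can be assigned to queryString in the UserSearchCriteria from the original Kotlin example.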

nikos avatar Jul 01 '20 07:07 nikos

Internal: Any reason not to index each of these fields?

  • firstName
  • middleName
  • lastName

Currently we are indexing a single field called fullName which is built from fullName if provided, or it is built using a combination of firstName, middleName and lastName.
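The derivation described above could look roughly like the following Kotlin sketch. The Name class and indexedFullName function are hypothetical, and the details (joining with a single space, treating blank values as absent) are assumptions, not confirmed FusionAuth behavior.

```kotlin
// Hypothetical model of the name fields on a user.
data class Name(
    val fullName: String? = null,
    val firstName: String? = null,
    val middleName: String? = null,
    val lastName: String? = null
)

// Use fullName when provided; otherwise combine first, middle and last name.
// Assumption: parts are joined with one space and blank parts are skipped.
fun indexedFullName(n: Name): String? {
    if (!n.fullName.isNullOrBlank()) return n.fullName
    val parts = listOfNotNull(n.firstName, n.middleName, n.lastName)
        .filter { it.isNotBlank() }
    return if (parts.isEmpty()) null else parts.joinToString(" ")
}

fun main() {
    println(indexedFullName(Name(firstName = "Joe", lastName = "Smith")))  // Joe Smith
    println(indexedFullName(Name(fullName = "J. Smith", firstName = "Joe")))  // J. Smith
}
```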

robotdan avatar Jul 01 '20 14:07 robotdan

I can't think of any reason not to index these fields. Seems like a good move to me from the user perspective.

However, if I were implementing, I'd consider:

  • additional time to index
  • additional memory/space constraints

I am afraid I don't know enough about the internals to have a valid opinion on that stuff.

mooreds avatar Jul 01 '20 14:07 mooreds

I believe we wrote the original indexing to use fullName and let Elastic handle that as more of a document than a single value. Elastic should tokenize and make each piece of the name searchable. The only issue will be when different name components are the same. Like a lastName of John. Or a middle name of Smith.
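The ambiguity can be illustrated with a small Kotlin sketch. The tokens function is a crude stand-in for an analyzer (lowercase, split on whitespace) and is an assumption for illustration, not the actual Elasticsearch analysis chain.

```kotlin
// Crude stand-in for an analyzer: lowercase and split on whitespace.
fun tokens(fullName: String): Set<String> =
    fullName.lowercase().split(Regex("\\s+")).filter { it.isNotEmpty() }.toSet()

// A term matches when it equals any token, regardless of which name part it came from.
fun matches(fullName: String, term: String): Boolean =
    term.lowercase() in tokens(fullName)

fun main() {
    val users = listOf("John Smith", "Alice John", "Bob Smith Jones")
    // "john" is a first name in one entry and a last name in another; both match.
    println(users.filter { matches(it, "john") })  // [John Smith, Alice John]
}
```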

voidmain avatar Jul 01 '20 14:07 voidmain

@nikos I just pushed a fix for the search engine documentation which hopefully makes things clearer and addresses your other questions. https://fusionauth.io/docs/v1/tech/apis/users#search-for-users

mooreds avatar Jul 15 '20 15:07 mooreds

Removed documentation tag as this seems to be a bug about indexing first/last name now.

mooreds avatar Sep 20 '21 20:09 mooreds

I am leaving a comment here for future design discussions.

The sorting currently supported on a field like fullName is the same sort behavior that ES offers for a text field mapping.

In other words, if you have

Jim D
Jim B
Jim A
Jim C

and asked for a sort on this field, the sort behavior would be non-deterministic because all four values collide on the token jim, so the result would not necessarily be

Jim A
Jim B
Jim C
Jim D

Likewise, if you had something like this

Admin Zot
Fred Bunk
Becky Beu

It might sort to

Admin Zot 
Becky Beu
Fred Bunk

Some folks might want sorting based on an ExactMatch. We would have to update our ES mappings and schema approach to have this behavior.
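The following Kotlin sketch illustrates the key-collision idea only; it does not reproduce how Elasticsearch chooses sort values for a text field. Both functions are hypothetical, and the assumption that the sort key is the first lowercased token is a deliberate simplification.

```kotlin
// Simplified "tokenized" sort: the key is only the first lowercased token,
// so every "Jim ..." entry has the same key and ties are left unresolved.
fun sortByFirstToken(names: List<String>): List<String> =
    names.sortedBy { it.lowercase().substringBefore(' ') }

// Simplified "exact" sort: the key is the whole lowercased value.
fun sortByWholeValue(names: List<String>): List<String> =
    names.sortedBy { it.lowercase() }

fun main() {
    val names = listOf("Jim D", "Jim B", "Jim A", "Jim C")
    // Kotlin's sortedBy is stable, so tied entries keep their input order.
    println(sortByFirstToken(names))  // [Jim D, Jim B, Jim A, Jim C]
    println(sortByWholeValue(names))  // [Jim A, Jim B, Jim C, Jim D]
}
```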

jobannon avatar May 02 '23 21:05 jobannon

Note that in version 1.49.0, we will support username.exact and fullName.exact fields for more precise searching of these properties.

andrewpai avatar Mar 06 '24 00:03 andrewpai