Fulltext case-sensitive index behavior
In Quick Start, the query:
SELECT
  ts,
  api_path,
  log
FROM
  app_logs
WHERE
  matches(log, 'timeout');
shows results that are case-sensitive:
+---------------------+------------------+--------------------+
| ts | api_path | log |
+---------------------+------------------+--------------------+
| 2024-07-11 20:00:10 | /api/v1/billings | Connection timeout |
| 2024-07-11 20:00:10 | /api/v1/resource | Connection timeout |
+---------------------+------------------+--------------------+
2 rows in set (0.01 sec)
However, the table definition is:
Create Table: CREATE TABLE IF NOT EXISTS `app_logs` (
...
`log` STRING NULL FULLTEXT WITH(analyzer = 'English', case_sensitive = 'false'),
...)
The docs for CREATE indicate that the default value of case_sensitive for FULLTEXT is true. Based on what I'm seeing after following Quick Start, the default is false.
In any event, the query behavior is case-sensitive.
Issues as I see them:
- Possible error in either docs or implementation for default value of case_sensitive for fulltext index
- Case-sensitive match behavior when schema shows case_sensitive to be false
Thank you for your thorough review; the issue does indeed exist.
The specific reason is that matches is evaluated separately in the frontend and in the datanode. The datanode does respect the case_sensitive configuration, but this part has not yet been completed in the frontend (see the TODO): https://github.com/GreptimeTeam/greptimedb/blob/9c1704d4cbbfab8af07a77da598a1cfe2a5e7b22/src/common/function/src/scalars/matches.rs#L75-L95. As it stands, the frontend implementation is always case-sensitive.
Therefore, until this part of the work is completed, we should keep the two sides consistent. I think we can either hardcode this configuration to true and make it unchangeable, or hardcode it to false and change https://github.com/GreptimeTeam/greptimedb/blob/9c1704d4cbbfab8af07a77da598a1cfe2a5e7b22/src/common/function/src/scalars/matches.rs#L205 to use ilike. The second option would be more practical.
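To illustrate the behavior being discussed, here is a minimal Rust sketch of a substring test that honors a case-sensitivity flag. It is not the code in matches.rs: `contains_term` is a hypothetical helper, it handles only a single plain term, and the real matches function may do considerably more than a substring test.

```rust
/// Hypothetical helper for illustration only (not part of GreptimeDB).
/// Returns true if `text` contains `term`. When `case_sensitive` is false,
/// both sides are lowercased before comparing, which is roughly the effect
/// of switching a LIKE-style comparison to an ILIKE-style one.
pub fn contains_term(text: &str, term: &str, case_sensitive: bool) -> bool {
    if case_sensitive {
        // Exact comparison: "Timeout" does not match "timeout".
        text.contains(term)
    } else {
        // Case-folded comparison: "Timeout" matches "timeout".
        text.to_lowercase().contains(&term.to_lowercase())
    }
}
```

With case_sensitive = 'false' in the schema, a user would expect the second branch, so a row containing "Connection Timeout" would also match 'timeout'. The behavior reported above corresponds to the first branch being taken regardless of the setting.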
In any case, it was indeed an oversight, and I will arrange for a prompt fix.
cc @waynexia
The fulltext query statements have been updated; see https://docs.greptime.com/user-guide/logs/query-logs/#query-statements