[Enhancement] Improve search

Open To-om opened this issue 2 years ago • 3 comments

Request Type

Enhancement

Feature Description

Goal:

  • [x] Add a global full-text search (search in all textual fields).
  • [ ] Give users more control over how the search is done (exact search or full search)
  • [ ] Make search on custom fields efficient
  • [ ] Improve performance on dashboards

Custom fields

Custom field values are now indexed. An id (_id) has been added to each custom field value; it can be used to update or remove that value. This makes it possible to have multiple values for the same custom field.

API v0 changes

API v0 has been enriched with some extra fields in its output. For cases, case templates and alerts, the format of custom fields becomes:

{
    "customFields": {
        cfName: {
            cfType: cfValue,
            "order": 0,
            "_id": "~111" // id of the custom field value
        },
        cfName: {...}
    }
}

The id can be used to identify the custom field value to update. This should not be a breaking change. This format can carry only one value per custom field. If several values are present:

  • only one value is returned for a custom field (with no consistent choice of which one).
  • when the name is used to update a custom field, only one value is updated.
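
As a rough illustration of reading the v0 format above, the Python sketch below splits one entry into its type, value, order and value id. The helper name `parse_v0_custom_field` and the sample field `businessUnit` are hypothetical, and the sketch assumes each entry holds exactly one type key next to `order` and `_id`.

```python
from typing import Any, Dict, NamedTuple, Optional


class CustomFieldValue(NamedTuple):
    name: str
    type: str
    value: Any
    order: Optional[int]
    id: Optional[str]


def parse_v0_custom_field(name: str, entry: Dict[str, Any]) -> CustomFieldValue:
    """Split a v0 custom field entry into its parts.

    Assumption: besides "order" and "_id", the entry contains exactly one
    key, which is the custom field type (cfType in the format above).
    """
    type_keys = [k for k in entry if k not in ("order", "_id")]
    if len(type_keys) != 1:
        raise ValueError(f"expected exactly one type key in {name!r}, got {type_keys}")
    cf_type = type_keys[0]
    return CustomFieldValue(name, cf_type, entry[cf_type], entry.get("order"), entry.get("_id"))


# Hypothetical sample that follows the v0 format described above.
sample = {"customFields": {"businessUnit": {"string": "finance", "order": 0, "_id": "~111"}}}
parsed = [parse_v0_custom_field(n, e) for n, e in sample["customFields"].items()]
```

Keeping the `_id` alongside the value is what lets a later update target that specific value rather than the field name.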

API v1 changes

The id of the custom field value has been added in v1 too:

{
  "customFields": [
    {
      "_id": "~111",
      "name": "",
      "description": "",
      "type": "",
      "value": "",
      "order": 0
    }
  ]
}

[BREAKING CHANGE] The value of a custom field can no longer be identified by the custom field name. The id of the value must be used instead. This change impacts the following APIs:

  • DELETE /api/v1/case/customField/$customFieldValueId
  • PATCH /api/v1/alert/$alertId
  • PATCH /api/v1/case/$case
  • PATCH /api/v1/caseTemplate/$caseTemplateId

Example of custom field update in a case:

PATCH /api/v1/case/~123

{
  "customFields.~234": "new value"
}

If the name of the custom field is used in a patch, a new custom field is added. The order can be updated with { "customFields.~234": {"value": "new value", "order": } }
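
A minimal Python sketch of building these PATCH bodies follows. The helper name `custom_field_patch` is hypothetical, and the sketch assumes `order` is an integer, as in the output format shown earlier. It only builds the JSON body; sending it to PATCH /api/v1/case/~123 is left to the HTTP client of your choice.

```python
import json
from typing import Any, Dict, Optional


def custom_field_patch(value_id: str, value: Any, order: Optional[int] = None) -> Dict[str, Any]:
    """Build a PATCH body that updates one custom field value by its id.

    Without `order`, the short form {"customFields.<id>": value} is used.
    With `order` (assumed to be an integer), the object form carrying
    both "value" and "order" is used.
    """
    key = f"customFields.{value_id}"
    if order is None:
        return {key: value}
    return {key: {"value": value, "order": order}}


body = custom_field_patch("~234", "new value")
print(json.dumps(body))
```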

Search API

The global search APIs are:

POST /api/v1/search  // for v1 objects
  query: String      // only one term (word) is currently supported
  from:  Option[Int] // paginate the result
  to:    Option[Int] // paginate the result

POST /api/search     // for v0 objects
  query: String      // only one term (word) is currently supported
  from:  Option[Int] // paginate the result
  to:    Option[Int] // paginate the result

The output is an array of TheHive objects (the field _type of each object contains its type).
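
The Python sketch below shows one way a client could prepare a search request and sort the mixed result array by `_type`. It assumes the parameters are sent as a JSON body, which is not stated above; the function names and the sample `_type` values are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional


def search_body(query: str, from_: Optional[int] = None, to: Optional[int] = None) -> Dict[str, Any]:
    """Build the body for POST /api/v1/search (or /api/search for v0).

    Assumption: parameters are sent as JSON. Only a single term is
    accepted, since only one word is currently supported.
    """
    if len(query.split()) != 1:
        raise ValueError("only one term (word) is currently supported")
    body: Dict[str, Any] = {"query": query}
    if from_ is not None:
        body["from"] = from_
    if to is not None:
        body["to"] = to
    return body


def group_by_type(results: List[Dict[str, Any]]) -> Dict[str, List[Dict[str, Any]]]:
    """Group the returned objects by their _type field."""
    grouped: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for obj in results:
        grouped[obj.get("_type", "unknown")].append(obj)
    return dict(grouped)


# Hypothetical results; the real _type values may differ.
results = [{"_type": "Case", "_id": "~1"}, {"_type": "Alert", "_id": "~2"}, {"_type": "Case", "_id": "~3"}]
grouped = group_by_type(results)
```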

Chart API

The time aggregation APIs are:

POST /api/v1/chart/time  // for v1 objects
  interval: String       // time interval (1d, 3M, ...). Implemented units are s, m, h, d, w, M and y
  from:     Option[Long] // starting date
  to:       Option[Long] // ending date
  aggs:     Seq[Series]  // list of series

POST /api/chart/time     // for v0 objects
  interval: String       // time interval (1d, 3M, ...). Implemented units are s, m, h, d, w, M and y
  from:     Option[Long] // starting date
  to:       Option[Long] // ending date
  aggs:     Seq[Series]  // list of series

A series is an object containing:

  name: String      // the name of the series
  model: String     // the model on which the series is applied (Case, Alert, Observable, ...)
  dateField: String // the name of the field used for time range
  agg: Aggregation  // the aggregation to apply for each time bucket

An aggregation is one of the following:

  • count:
    _agg = "count"
    _name: Option[String] // name of the aggregation
    _query: Filter        // the filter applied before counting
  • min:
    _agg = "min"
    _name: Option[String] // name of the aggregation
    _field: String        // name of the field on which the minimum is done
    _query: Filter        // the filter applied before computing the minimum
  • max:
    _agg = "max"
    _name: Option[String] // name of the aggregation
    _field: String        // name of the field on which the maximum is done
    _query: Filter        // the filter applied before computing the maximum
  • avg:
    _agg = "avg"
    _name: Option[String] // name of the aggregation
    _field: String        // name of the field on which the average is done
    _query: Filter        // the filter applied before computing the average
  • sum:
    _agg = "sum"
    _name: Option[String] // name of the aggregation
    _field: String        // name of the field on which the sum is done
    _query: Filter        // the filter applied before computing the sum
  • time:
    _agg = "time"
    _name: Option[String]      // name of the aggregation
    _field: String/Seq[String] // name of the field on which a time aggregation is done
    _query: Filter             // the filter applied before the time aggregation
    _interval: String          // time interval
    _select: Seq[Aggregation]  // aggregations to apply for each time bucket
  • field:
    _agg = "field"
    _name: Option[String]      // name of the aggregation
    _field: String/Seq[String] // name of the field on which a field aggregation is done
    _query: Filter             // the filter applied before the field aggregation
    _select: Seq[Aggregation]  // aggregations to apply for each different value of field
    _order: String/Seq[String] // name of field (or "count") used to order (can be prefixed by +/-)
    _size: Long                // maximum number of different values
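
To make the structure concrete, here is a small Python sketch that assembles a series and a time chart request body from the definitions above. All function names are hypothetical. It assumes an interval is an integer followed by one unit letter, and that `_name` and `_query` may be left out, as the count example in this issue suggests.

```python
import re
from typing import Any, Dict, List, Optional

# Assumption: an interval is digits followed by one of the implemented units.
INTERVAL_RE = re.compile(r"\d+[smhdwMy]")


def count_agg(name: Optional[str] = None, query: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a "count" aggregation; optional keys are omitted when unset."""
    agg: Dict[str, Any] = {"_agg": "count"}
    if name is not None:
        agg["_name"] = name
    if query is not None:
        agg["_query"] = query
    return agg


def stat_agg(kind: str, field: str, name: Optional[str] = None,
             query: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a "min", "max", "avg" or "sum" aggregation on one field."""
    if kind not in ("min", "max", "avg", "sum"):
        raise ValueError(f"unsupported aggregation: {kind}")
    agg: Dict[str, Any] = {"_agg": kind, "_field": field}
    if name is not None:
        agg["_name"] = name
    if query is not None:
        agg["_query"] = query
    return agg


def series(name: str, model: str, date_field: str, agg: Dict[str, Any]) -> Dict[str, Any]:
    """Build one series object."""
    return {"name": name, "model": model, "dateField": date_field, "agg": agg}


def time_chart_body(interval: str, aggs: List[Dict[str, Any]],
                    from_ms: Optional[int] = None, to_ms: Optional[int] = None) -> Dict[str, Any]:
    """Build the body for POST /api/v1/chart/time (or /api/chart/time)."""
    if not INTERVAL_RE.fullmatch(interval):
        raise ValueError(f"invalid interval: {interval!r}")
    body: Dict[str, Any] = {"interval": interval, "aggs": aggs}
    if from_ms is not None:
        body["from"] = from_ms
    if to_ms is not None:
        body["to"] = to_ms
    return body


chart_request = time_chart_body(
    "1d",
    [series("CaseCount", "Case", "_createdAt", count_agg())],
    from_ms=1626048000000,
    to_ms=1626998400000,
)
```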

The output of the time chart is an ordered array of objects, each containing _key: Date (the date of the time bucket) and aggName: aggValue for each requested aggregation. For example:

POST /api/v1/chart/time
{
  "interval": "1d",
  "from": 1626048000000,
  "to": 1626998400000,
  "aggs": [
    {
      "name": "CaseCount",
      "model": "Case",
      "dateField": "_createdAt",
      "agg": {"_agg": "count"}
    },
    {
      "name": "AlertCount",
      "model": "Alert",
      "dateField": "_createdAt",
      "agg": {"_agg": "count"}
    }
  ]
}

returns

[
  { "_key": 1626048000000, "CaseCount": 10, "AlertCount": 25 },
  { "_key": 1626134400000, "CaseCount": 15, "AlertCount": 27 },
  { "_key": 1626220800000, "CaseCount": 30, "AlertCount": 29 },
  { "_key": 1626307200000, "CaseCount": 20, "AlertCount": 45 },
  { "_key": 1626393600000, "CaseCount": 45, "AlertCount": 68 },
  { "_key": 1626480000000, "CaseCount": 12, "AlertCount": 54 },
  { "_key": 1626566400000, "CaseCount": 11, "AlertCount": 34 },
  { "_key": 1626652800000, "CaseCount": 14, "AlertCount": 56 },
  { "_key": 1626739200000, "CaseCount": 34, "AlertCount": 23 },
  { "_key": 1626825600000, "CaseCount": 12, "AlertCount": 48 },
  { "_key": 1626912000000, "CaseCount": 45, "AlertCount": 47 },
  { "_key": 1626998400000, "CaseCount": 23, "AlertCount": 53 }
]
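
A client could post-process this output along the following lines. This Python sketch assumes `_key` is a Unix timestamp in milliseconds (UTC), which matches the numbers in the example but is not stated explicitly; the helper names are hypothetical. It uses the first three buckets of the response above.

```python
from datetime import date, datetime, timezone
from typing import Any, Dict, List


def bucket_date(bucket: Dict[str, Any]) -> date:
    """Convert a bucket's _key (assumed epoch milliseconds, UTC) to a date."""
    return datetime.fromtimestamp(bucket["_key"] / 1000, tz=timezone.utc).date()


def series_totals(buckets: List[Dict[str, Any]]) -> Dict[str, int]:
    """Sum each series over all time buckets, skipping the _key field."""
    totals: Dict[str, int] = {}
    for bucket in buckets:
        for name, value in bucket.items():
            if name == "_key":
                continue
            totals[name] = totals.get(name, 0) + value
    return totals


# First three buckets of the example response.
buckets = [
    {"_key": 1626048000000, "CaseCount": 10, "AlertCount": 25},
    {"_key": 1626134400000, "CaseCount": 15, "AlertCount": 27},
    {"_key": 1626220800000, "CaseCount": 30, "AlertCount": 29},
]

for b in buckets:
    print(bucket_date(b), b["CaseCount"], b["AlertCount"])
print(series_totals(buckets))  # {'CaseCount': 55, 'AlertCount': 81}
```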

To-om • Jul 26 '21 09:07

Hey @To-om - Will this issue resolve the problems in the Search section of TheHive for Observables?

In 4.1.11 and 4.1.12, when you search by dataType, it says "Search Result 20 records(s) found" and then shows no data on either page:

[image]

Also, when you add another dataType, the data still doesn't appear on page 1 but does on page 3: [image]

I can create a separate issue if it is not tied to this issue, please let me know!

nicpenning • Oct 29 '21 15:10

I can't reproduce your problem. Please create a new issue and provide some logs (if any).

To-om • Oct 29 '21 16:10

Thanks, I will do that!

nicpenning • Oct 29 '21 16:10