onestop
GCMD Keyword Verification at indexing instead of search
Currently, when a search request is sent, aggregations are built, and only the aggregation entries that pass the "top level keywords" check for GCMD science and location keywords are included in the API response.
To keep our data cleaner and to avoid building aggregations that are then discarded at search time, we should move this keyword verification from the ElasticsearchService (search response construction) to the ETL service, where data is mapped from Staging to Search.
There is also another hierarchical keyword type that should be checked here (the field is 'gcmdScienceServices'). As of 1/31/18, the top level keywords map should be:
    private static final topLevelKeywords = [
        'science' : [
            'Agriculture', 'Atmosphere', 'Biological Classification', 'Biosphere', 'Climate Indicators',
            'Cryosphere', 'Human Dimensions', 'Land Surface', 'Oceans', 'Paleoclimate', 'Solid Earth',
            'Spectral/Engineering', 'Sun-Earth Interactions', 'Terrestrial Hydrosphere'
        ],
        'location': [
            'Continent', 'Geographic Region', 'Ocean', 'Solid Earth', 'Space', 'Vertical Location'
        ],
        'service' : [
            'Data Analysis And Visualization', 'Data Management/Data Handling', 'Education/Outreach', 'Environmental Advisories',
            'Hazards Management', 'Metadata Handling', 'Models', 'Reference And Information Services', 'Web Services'
        ]
    ]
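As a rough illustration of what the check could look like once it lives in the ETL step, here is a minimal Groovy sketch. It is not the existing OneStop code: the class name `GcmdKeywordFilter` and the method `filterKeywords` are hypothetical, the map is abbreviated, and the sketch assumes that hierarchical keywords are stored as `>`-delimited strings and that the top-level comparison should ignore case.

```groovy
// Hypothetical helper, not the actual ETL implementation: keeps only the
// hierarchical keywords whose first segment is a known top-level keyword.
class GcmdKeywordFilter {

  // Abbreviated copy of the top level keywords map, for illustration only
  static final Map<String, List<String>> TOP_LEVEL_KEYWORDS = [
      science : ['Atmosphere', 'Oceans'],
      location: ['Continent', 'Ocean'],
      service : ['Models', 'Web Services']
  ]

  // Assumption: keywords look like 'Top Level > Next Level > ...'
  static List<String> filterKeywords(String type, List<String> keywords) {
    // Unknown keyword types have no allowed values, so everything is dropped
    def allowed = (TOP_LEVEL_KEYWORDS[type] ?: []).collect { it.toLowerCase() }
    keywords.findAll { keyword ->
      // First segment of the hierarchy, trimmed and lower-cased for comparison
      def top = keyword.tokenize('>')[0]?.trim()?.toLowerCase()
      allowed.contains(top)
    }
  }
}
```

Calling `GcmdKeywordFilter.filterKeywords('science', record.gcmdScience)` while mapping a record from Staging to Search (where `record.gcmdScience` stands in for whatever the real field is) would drop unverified keywords before they are indexed, so the search side no longer has to filter aggregation results.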
TODO: Confirm that this is being handled by the indexer application