[DOC-META] Text analysis content needed

Open alicejw1 opened this issue 3 years ago • 4 comments

This meta issue lists the new pages needed for each text analysis component that is currently undocumented. Note: The language analyzer is already documented, as is the concepts page, Optimizing text for searches with text analyzers.

Analyzers (10)

  • [ ] Standard analyzer
  • [ ] Simple
  • [ ] Whitespace
  • [ ] Stop
  • [ ] Keyword
  • [ ] Pattern
  • [ ] Fingerprint
  • [ ] Custom
  • [ ] Stemming
  • [ ] Token graphs

Language analyzers (24)

  • [ ] A page for each language (24 total). See the Language analyzer section currently on the concepts page.

Tokenizers (15 + index page)

  • [ ] Index page
  • [ ] Character group
  • [ ] Classic
  • [ ] Edge n-gram
  • [ ] Keyword
  • [ ] Letter
  • [ ] Lowercase
  • [ ] N-gram
  • [ ] Path hierarchy
  • [ ] Pattern
  • [ ] Simple pattern
  • [ ] Simple pattern split
  • [ ] Standard
  • [ ] Thai
  • [ ] UAX URL email
  • [ ] Whitespace

Token filters (48)

  • [ ] Page for each one

Character filters (3 + index page)

  • [ ] Index page
  • [ ] HTML strip
  • [ ] Mapping
  • [ ] Pattern replace

Normalizers

  • [ ] Normalizers

alicejw1, Oct 07 '22 19:10

Hi @kolchfa-aws, Heather asked me to reassign this ticket to you. Thanks.

alicejw1, Oct 27 '22 16:10

Missing: Stemming, token graphs, language analyzers for each language, configuring built-in analyzers, custom analyzers, built-in analyzer reference (8 pages), tokenizers, tokenizer reference (15 pages), token filter reference (48 pages), character filter reference (3 pages), normalizers

hdhalter, Aug 14 '23 23:08

I'm working on Standard analyzer. cc @hdhalter

leanneeliatra, Jul 22 '24 15:07

I'm going to work on:

  1. Normalizers https://github.com/opensearch-project/documentation-website/pull/8192
  2. Character filters (3 + index page) https://github.com/opensearch-project/documentation-website/pull/8206

leanneeliatra, Sep 06 '24 09:09