
Solr 7: how to load a synonym dictionary with org.ansj.solr.AnsjTokenizerFactory

Open ekoz opened this issue 5 years ago • 0 comments

The create method of AnsjTokenizerFactory in ansj_solr6_plugin:

public Tokenizer create(AttributeFactory factory) {
    if (isQuery == true) {
        return new AnsjTokenizer(new ToAnalysis(), filter);
    } else {
        return new AnsjTokenizer(new IndexAnalysis(), filter);
    }
}

The constructor of the AnsjTokenizer class in ansj_lucene7_plugin:

public AnsjTokenizer(Analysis ta, List<StopRecognition> stops, List<SynonymsRecgnition> synonyms) {
    this.ta = ta;
    this.stops = stops;
    this.synonyms = synonyms;
}


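Questions

  • Calling the factory as-is against ansj_lucene7_plugin will certainly fail with an error: create passes two arguments, but the constructor requires a third, List<SynonymsRecgnition> synonyms.
  • Shouldn't the AnsjTokenizerFactory class declare a field private List<SynonymsRecgnition> synonyms;?
  • How should the List<SynonymsRecgnition> be populated?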
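
To make the second question concrete, here is a minimal sketch of a factory that keeps a synonyms list as a field and passes it to the three-argument constructor. It is not the plugin's actual code: every type below is a simplified stand-in declared only so the example compiles without ansj_seg or Lucene on the classpath, the factory constructor's signature is hypothetical, and the real create method also receives an AttributeFactory, which is left out here.

```java
import java.util.ArrayList;
import java.util.List;

class SynonymAwareFactorySketch {

    // Stand-ins for the ansj types named in the issue (assumption: the real
    // classes come from ansj_seg and carry far more behavior than these).
    interface Analysis {}
    static class ToAnalysis implements Analysis {}
    static class IndexAnalysis implements Analysis {}
    static class StopRecognition {}
    static class SynonymsRecgnition {}

    // Stand-in mirroring the three-argument constructor quoted from ansj_lucene7_plugin.
    static class AnsjTokenizer {
        final Analysis ta;
        final List<StopRecognition> stops;
        final List<SynonymsRecgnition> synonyms;

        AnsjTokenizer(Analysis ta, List<StopRecognition> stops, List<SynonymsRecgnition> synonyms) {
            this.ta = ta;
            this.stops = stops;
            this.synonyms = synonyms;
        }
    }

    // Hypothetical factory: same branching as the quoted create method,
    // plus the synonyms field proposed in the issue.
    static class AnsjTokenizerFactory {
        private final boolean isQuery;
        private final List<StopRecognition> filter;
        private final List<SynonymsRecgnition> synonyms;

        AnsjTokenizerFactory(boolean isQuery, List<StopRecognition> filter, List<SynonymsRecgnition> synonyms) {
            this.isQuery = isQuery;
            // Fall back to empty lists so the tokenizer never receives null.
            this.filter = filter == null ? new ArrayList<>() : filter;
            this.synonyms = synonyms == null ? new ArrayList<>() : synonyms;
        }

        AnsjTokenizer create() {
            Analysis ta = isQuery ? new ToAnalysis() : new IndexAnalysis();
            return new AnsjTokenizer(ta, filter, synonyms);
        }
    }

    public static void main(String[] args) {
        List<SynonymsRecgnition> synonyms = new ArrayList<>();
        synonyms.add(new SynonymsRecgnition());
        AnsjTokenizer tokenizer = new AnsjTokenizerFactory(true, null, synonyms).create();
        System.out.println("synonym recognizers passed through: " + tokenizer.synonyms.size());
    }
}
```

How the SynonymsRecgnition instances themselves are built from a dictionary file depends on the ansj_seg API and is deliberately left open in this sketch.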

Expected result

Sample configuration in the schema:

<fieldType name="text_ansj" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="org.ansj.solr.AnsjTokenizerFactory" type="index_ansj" isQuery="false" stopwords="/path/to/stop.dic"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="org.ansj.solr.AnsjTokenizerFactory" type="query_ansj" stopwords="/path/to/stop.dic" synonyms="/path/to/synonyms.dic"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>
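
Solr passes the attributes of a <tokenizer> element to the factory's constructor as a Map<String, String>, so a factory supporting the configuration above would need to read the stopwords, synonyms and isQuery keys from that map. The sketch below shows one way that parsing could look. The class name AnsjFactoryArgsSketch is hypothetical, and treating a missing isQuery attribute as query mode is an assumption inferred from the sample (the query analyzer omits it), not confirmed plugin behavior.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder for the attributes of one <tokenizer> element.
class AnsjFactoryArgsSketch {
    final boolean isQuery;
    final String stopwordsPath; // null when the attribute is absent
    final String synonymsPath;  // null when the attribute is absent

    AnsjFactoryArgsSketch(Map<String, String> args) {
        String rawIsQuery = args.get("isQuery");
        // Assumption: an absent isQuery attribute means query mode.
        this.isQuery = rawIsQuery == null || Boolean.parseBoolean(rawIsQuery.trim());
        this.stopwordsPath = args.get("stopwords");
        this.synonymsPath = args.get("synonyms");
    }

    public static void main(String[] unused) {
        // Attributes of the index-time tokenizer from the sample configuration.
        Map<String, String> indexArgs = new HashMap<>();
        indexArgs.put("isQuery", "false");
        indexArgs.put("stopwords", "/path/to/stop.dic");

        // Attributes of the query-time tokenizer from the sample configuration.
        Map<String, String> queryArgs = new HashMap<>();
        queryArgs.put("stopwords", "/path/to/stop.dic");
        queryArgs.put("synonyms", "/path/to/synonyms.dic");

        AnsjFactoryArgsSketch index = new AnsjFactoryArgsSketch(indexArgs);
        AnsjFactoryArgsSketch query = new AnsjFactoryArgsSketch(queryArgs);
        System.out.println("index: isQuery=" + index.isQuery + ", synonyms=" + index.synonymsPath);
        System.out.println("query: isQuery=" + query.isQuery + ", synonyms=" + query.synonymsPath);
    }
}
```

Keeping the paths nullable lets the index-time analyzer skip synonym loading entirely when no synonyms attribute is given.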

ekoz, Mar 15 '19 05:03