librime-lua

Candidates added with Lua are not saved to the user dictionary after being selected

Open wudanyang6 opened this issue 3 years ago • 10 comments

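Expected: a candidate produced by Lua enters the user dictionary once it is committed, so the next time the same input is typed, the Lua candidate is recalled first.
Actual: after a Lua-produced candidate is selected, the user dictionary contains no entry for it.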

wudanyang6 avatar Jan 10 '22 06:01 wudanyang6

For example, I use a Lua script that adds a new candidate whenever a candidate ends with the character “是”:

local function add_shi(input)
   -- Iterate over all input candidates with `iter()`
   for cand in input:iter() do
      yield(cand)
      
      local newCandStr = ""
      if string.sub(cand.text, -string.len("是")) == "是" then
         newCandStr = string.gsub(cand.text, "是", "时")
         yield(Candidate("shi", cand.start, cand._end, newCandStr, cand.comment))
      end

   end
end

-- Export the function defined above
return add_shi

wudanyang6 avatar Jan 10 '22 06:01 wudanyang6

The code looks fine. How is your schema configured (rime.lua and the schema)?
Is your librime a build that includes librime-lua?
Check the Rime log.

shewer avatar Jan 10 '22 07:01 shewer

-- <user_data_dir>/lua/add_shi.lua 
local function add_shi(input)
   for cand in input:iter() do 
      yield(cand)

      if  cand.text:match("是$") then 
         local text = cand.text:gsub("是$", "时")  -- replace only the trailing character
         yield(Candidate("shi", cand.start, cand._end, text, cand.comment))
      end 
   end
end 


return add_shi
--[[
rime.lua 
add_shi= require 'add_shi'

-- schema  or  custom 
insert  "lua_filter@add_shi"   in  engine/filters    

--- rime_api_console test
schema: whaleliu_ext / 【鯨舞倉】詞
status: composing
[(鯨)日人]|
page: 1  (of size 5)
1. [是]|〈日人〉鯨
2.  时 |〈日人〉鯨-------shi
3.  日人 |〈日人〉鯨
4.  匙 〈~心〉鯨|〈日人心〉鯨
5.  昨 〈~尸〉鯨|〈日竹尸 日人尸〉鯨

in add_shi -----------
commit: 是
schema: whaleliu_ext / 【鯨舞倉】詞
status: composing
[不是[聯]]|
page: 1  (of size 5)
1. [不是](聯)|〈一火日人〉鯨
2.  不时 (聯)|〈一火日人〉鯨-------shi
3.  否 (聯)|〈火口〉鯨
4.  個 (聯)|〈人田虫〉鯨
5.  啊 (聯)|〈口弓口〉鯨

in add_shi -----------
commit: 不是
schema: whaleliu_ext / 【鯨舞倉】詞
status: composing
[很[聯]]|
page: 1  (of size 5)
1. [很](聯)|〈竹人日女〉鯨
2.  說 (聯)|〈一金山〉鯨
3.  我 (聯)|〈戈〉鯨
4.  一個 (聯)|〈一人虫 一人口〉鯨
5.  你 (聯)|〈人弓火〉鯨

schema: whaleliu_ext / 【鯨舞倉】詞
status: composing
[(鯨)日人一火人]|
page: 1  (of size 5)
1. [是不是]|〈日人一火人〉鯨
2.  是不时 |〈日人一火人〉鯨-------shi
3.  dbefb [english]|
4.  dbe fb [ninja]|
5.  FB (Ninja) 消防隊, 水上飛機, 飛船, 運貨單, 後衛\n [計]; 文件塊, 固定塊|
--]]
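
A side note on the two versions of the filter above: the first one calls `string.gsub` without an anchor, while the rewritten one uses the pattern `"是$"`. The following is a minimal plain-Lua sketch (no librime required) of how the two calls differ on a string that contains “是” more than once. The sample string is only an illustration.

```lua
-- Plain Lua, no librime needed. Lua patterns work on bytes, but a literal
-- UTF-8 sequence followed by "$" still matches only at the end of the string.
local text = "是不是"

-- Unanchored: every "是" is replaced (gsub also returns the match count).
local all, n_all = string.gsub(text, "是", "时")

-- Anchored with "$": only the trailing "是" is replaced.
local tail, n_tail = text:gsub("是$", "时")

print(all, n_all)    -- 时不时  2
print(tail, n_tail)  -- 是不时  1
```

With the unanchored call, a candidate such as 【是不是】 would become 【时不时】 rather than 【是不时】, which is why the anchored pattern matches the stated intent more closely.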

shewer avatar Jan 10 '22 08:01 shewer

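Thank you for the reply and for the improved code.

Before your reply, after some configuration, I had already managed to show both 【是不是】 and 【是不时】 in the candidate list.

However, after I select 【是不时】 and commit it, Rime does not record the phrase. The next time I type 【shibushi】, 【是不是】 still appears first, followed by 【是不时】.

Some details about my environment:
Computer: MacBook Pro (13-inch, M1, 2020)
Operating system: macOS Big Sur 11.6.2
Software version: Squirrel (鼠鬚管) 0.15.2 (latest)

rime-lua itself should be working, since it can already modify the candidates.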

-- rime.lua
add_shi = require 'add_shi'

-- double_pinyin.custom.yaml

# Rime schema
# encoding: utf-8
patch:
  schema/name: 自然码
  switches:
    - name: ascii_mode
      reset: 0
      states: [ 中文, 西文 ]
    - name: emoji_suggestion
      reset: 0
      states: [ "No", "Yes" ]
    - name: full_shape
      states: [ 半角, 全角 ]
    - name: simplification
      reset: 1
      states: [ 漢字, 汉字 ]
    - name: ascii_punct
      states: [ 。,, ., ]
  engine/translators:
    - punct_translator
    - lua_translator@date_translator
    - lua_translator@week_translator
    - lua_translator@time_translator
    - lua_translator@number_translator
    - lua_translator@reverse_lookup_filter
    - script_translator
    - table_translator@custom_phrase
  engine/filters:
    - lua_filter@add_shi
    - simplifier@emoji_suggestion
    - simplifier
    - uniquifier
    #- charset_filter@gbk
    #- single_char_filter
  engine/processors:
    - ascii_composer
    - recognizer
    - key_binder
    - speller
    - punctuator
    - selector
    - navigator
    - express_editor
  engine/segmentors:
    - ascii_segmentor
    - matcher
    - abc_segmentor
    - punct_segmentor
    - fallback_segmentor
  emoji_suggestion:
    opencc_config: emoji.json
    option_name: emoji_suggestion
    # tips: all
  
  # Load the Luna Pinyin (朙月拼音) extended dictionary
  "translator/dictionary": luna_pinyin.extended
  'translator/preedit_format': {}
  # translator/enable_correction: true

  # Custom symbol output
  punctuator:
    import_preset: symbols
    # Custom shortcut symbol input
    # symbols:
    #   "/fs": [½, ‰, ¼, ⅓, ⅔, ¾, ⅒ ]
    half_shape:
      "#": "#"
      "*": "*"
      "`": "`"
      "~": "~"
      "@": "@"
      "=": "="
      "/": ["/", "÷"]
      '\': "、"
      "_" : "──"
      "'": {pair: ["「", "」"]}
      "[": "【"
      "]": "】"
      "$": ["¥", "$", "€", "£", "¢", "¤"]
      "<": ["《", "〈", "«", "<"]
      ">": ["》", "〉", "»", ">"]

  recognizer/patterns/punct: "^/([0-9]0?|[A-Za-z]+)$"

  ### Double pinyin uses the custom dictionary custom_phrase.txt
  custom_phrase:
    dictionary: ""
    user_dict: custom_phrase
    db_class: stabledb
    enable_completion: true
    enable_sentence: true
    initial_quality: 1
  "engine/translators/@5": table_translator@custom_phrase
# Rx: BlindingDark/rime-easy-en:customize:schema=double_pinyin_flypy
# To enable easy_en, uncomment the two lines below
  __include: easy_en:/patch
  easy_en/enable_sentence: false # setting for mixed Chinese/English input
# Rx: lotem/rime-octagram-data:customize:schema=luna_pinyin,model=hans
  __include: grammar:/hant
# Rx: BlindingDark/rime-lua-select-character:customize:schema=luna_pinyin
  # __include: lua_select_character:/patch # uncomment this line if lua_selector is needed

wudanyang6 avatar Jan 10 '22 10:01 wudanyang6

Same here: candidates processed by Lua are not recorded in the user dictionary.

hehuanxuancao avatar Jan 10 '22 10:01 hehuanxuancao

I am not very familiar with this part. All I know is that it involves MemoryReg and PhraseReg.
class script_translator: public memory, translator, translator_option
Memory manages the dictionary and the user dict.
Memory:OnCommit <-- Memory:Memorize stores entries into user_dict

translator: https://github.com/hchunhui/librime-lua/blob/8b37d5541b0341a07ed7517c78df1a09268001fd/src/types.cc#L1299
script_translator: https://github.com/rime/librime/blob/e43a6ef36758862ab5a590383d89402518cd8a50/src/rime/gear/script_translator.h#L28
memory OnCommit: https://github.com/rime/librime/blob/e43a6ef36758862ab5a590383d89402518cd8a50/src/rime/gear/memory.cc#L103

You could first try cand:get_genuine()

-- Candidate --> SimpleCandidate
-- input:iter() --> perhaps a Phrase
for cand in input:iter() do
   yield(cand)
   if cand.text:match("是$") then
      local new_cand = cand:get_genuine()
      new_cand.text = cand.text:gsub("是$", "时") -- may be a cloned cand
      yield(new_cand)
   end
end


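Also, this filter always yields one extra cand whenever it sees "是$". If saving to user_dict succeeds, there will be one more copy of it, so the filter may need to be placed before uniquifier so that candidates with the same cand.text are filtered out.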
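
To illustrate the duplicate concern, here is a minimal plain-Lua sketch of removing repeated texts while preserving order. `dedupe_by_text` is a hypothetical helper written only for this illustration and is not part of librime-lua. It works on a plain list of strings standing in for `cand.text`, whereas in a real schema the built-in uniquifier filter is what handles duplicates.

```lua
-- Hypothetical helper: keep only the first occurrence of each text,
-- preserving order. The input is a plain list of strings, not real candidates.
local function dedupe_by_text(texts)
   local seen, out = {}, {}
   for _, text in ipairs(texts) do
      if not seen[text] then
         seen[text] = true
         out[#out + 1] = text
      end
   end
   return out
end

-- "是不时" appears twice: once as if recalled from the user dict,
-- once as if added again by the filter.
local result = dedupe_by_text({ "是不时", "是不是", "是不时" })
print(table.concat(result, " "))  -- 是不时 是不是
```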

shewer avatar Jan 10 '22 12:01 shewer

After the assignment, new_cand.text was not changed to “时”.

wudanyang6 avatar Jan 12 '22 15:01 wudanyang6

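Hmm, the cand from input:iter() is a Phrase, so its cand.text cannot be modified.

In that case, go through items 1~10 under the translator section of https://github.com/LEOYoon-Tsaw/Rime_collections/blob/master/Rime_description.md#%E4%B8%89translator and test which of the phrase-creation options fits best.

If you want to work with MemoryReg, look at the MemoryReg example and write your own lua_translator that calls the dictionary directly.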

shewer avatar Jan 13 '22 01:01 shewer

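https://github.com/hchunhui/librime-lua/pull/80 This pull request looks like it makes it possible to modify memory and the user dict.

However, the Mac version, Squirrel (鼠须管), does not support MemoryReg yet.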

wudanyang6 avatar Jan 19 '22 07:01 wudanyang6

Rebuild it following the install.md of the Mac version of Rime. The Mac release dates from 2021/02, so it cannot support #80, which was only updated later, around 2021/04.

shewer avatar Jan 19 '22 10:01 shewer