QA Chinese model result does not match Python version

I am using this Chinese model (luhua/chinese_pretrain_mrc_roberta_wwm_ext_large). Running it locally in Python gives the correct output, but the output from spaGO is not correct.

This looks similar to #101, but I cannot find the corresponding bool parameter for QA. How can I turn off the behavior where the output is forced to be a distribution (the values must sum to 1), whereas in Python the output is unconstrained?
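To make the distinction in the question concrete: a softmax turns raw scores (logits) into values that sum to 1, while the raw logits themselves are unbounded. The sketch below is plain Go with no spaGO calls. The function name `softmax` and the sample values are illustrative assumptions, and it does not claim to show how spaGO computes `confidence` internally.

```go
package main

import (
	"fmt"
	"math"
)

// softmax maps raw scores (logits) to a probability distribution.
// The maximum is subtracted first for numerical stability.
func softmax(logits []float64) []float64 {
	if len(logits) == 0 {
		return nil
	}
	maxVal := logits[0]
	for _, v := range logits[1:] {
		if v > maxVal {
			maxVal = v
		}
	}
	out := make([]float64, len(logits))
	sum := 0.0
	for i, v := range logits {
		out[i] = math.Exp(v - maxVal)
		sum += out[i]
	}
	for i := range out {
		out[i] /= sum
	}
	return out
}

func main() {
	// Illustrative logits only, not taken from any model run.
	logits := []float64{7.5, -11.1, -6.8}
	probs := softmax(logits)

	total := 0.0
	for _, p := range probs {
		total += p
	}
	fmt.Println("raw logits:", logits)
	fmt.Println("softmax:   ", probs)
	fmt.Printf("sum of softmax = %.6f\n", total)
}
```

The ordering of the candidates is the same before and after the softmax; only the scale changes, so a normalized score by itself should not change which answer ranks first.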

server := bert.NewServer(model)
answers := s.model.Answer(body.Question, body.Passage)

Translated QA: Context: My name is Clara, I live in Berkeley. Q: What is my name? A: Clara

The output is supposed to be 克拉拉 (Clara), but I got:

{
    "answers": [
        {
            "text": "我叫克拉拉，我住在伯克利。",
            "start": 0,
            "end": 13,
            "confidence": 0.2547743
        },
        {
            "text": "住在伯克利。",
            "start": 7,
            "end": 13,
            "confidence": 0.22960596
        },
        {
            "text": "我叫克拉拉，我住",
            "start": 0,
            "end": 8,
            "confidence": 0.1548344
        }
    ],
    "took": 1075
}
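The `start` and `end` values in this response are consistent with character (rune) offsets into the passage with an exclusive end: 7 to 13 gives 住在伯克利。 and 0 to 8 gives 我叫克拉拉，我住. Under that assumption, the expected answer 克拉拉 would correspond to the span 2 to 5. A minimal Go sketch to check this; `spanText` is a hypothetical helper, not a spaGO function:

```go
package main

import "fmt"

// spanText returns the runes of passage in the half-open range [start, end).
// Hypothetical helper; offsets are assumed to be rune (not byte) indices.
func spanText(passage string, start, end int) string {
	r := []rune(passage)
	if start < 0 || end > len(r) || start > end {
		return ""
	}
	return string(r[start:end])
}

func main() {
	passage := "我叫克拉拉，我住在伯克利。"

	// Spans reported in the response above.
	fmt.Println(spanText(passage, 0, 13))
	fmt.Println(spanText(passage, 7, 13))
	fmt.Println(spanText(passage, 0, 8))

	// Span that would match the expected answer.
	fmt.Println(spanText(passage, 2, 5))
}
```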

./bert-server server --repo=~/.spago --model=luhua/chinese_pretrain_mrc_roberta_wwm_ext_large --tls-disable

PASSAGE="我叫克拉拉，我住在伯克利。"   # "My name is Clara, I live in Berkeley."
QUESTION1="我的名字是什么？"         # "What is my name?"
curl -k -d '{"question": "'"$QUESTION1"'", "passage": "'"$PASSAGE"'"}' -H "Content-Type: application/json" "http://127.0.0.1:1987/answer?pretty"
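Interpolating shell variables into a JSON string works for this input but would break if the text contained quotes or backslashes. A hedged alternative is to build the body with a JSON encoder. The sketch below only constructs the body. The field names `question` and `passage` come from the curl command above, while `answerRequest` and `buildBody` are illustrative names, not part of spaGO.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// answerRequest mirrors the JSON fields used in the curl command.
type answerRequest struct {
	Question string `json:"question"`
	Passage  string `json:"passage"`
}

// buildBody encodes the request as JSON, escaping quotes and backslashes.
// Hypothetical helper for illustration.
func buildBody(question, passage string) ([]byte, error) {
	return json.Marshal(answerRequest{Question: question, Passage: passage})
}

func main() {
	body, err := buildBody("我的名字是什么？", "我叫克拉拉，我住在伯克利。")
	if err != nil {
		panic(err)
	}
	// The body could then be sent with an HTTP POST and the header
	// "Content-Type: application/json" to the same /answer endpoint.
	fmt.Println(string(body))
}
```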

Jan 02 '22 16:01 Tonghua-Li

Thanks @Tonghua-Li for experimenting with spaGO on Chinese models! Let me take a look.

In the meantime, have you already checked whether the tokenization output matches the one in Python/Rust?

Jan 03 '22 15:01 matteo-grella

@matteo-grella, I am new to NLP and TensorFlow.

Here is the comparison between spaGO and Python; the lengths are different. Is there anything else I should check?

spago

startLogits, length= 13
endLogits length = 13

python

startLogits, length= 128 
[7.526767253875732, -11.148331642150879, -11.519922256469727, -11.718563079833984, -11.908085823059082, -12.035258293151855, -11.453763008117676, -11.869075775146484, -11.3169527053833, -10.268500328063965, -10.09360408782959, -11.069799423217773, -6.754544734954834, -11.262563705444336, ...]
endLogits length= 128 
[8.138846397399902, -11.649543762207031, -11.485342025756836, -11.634878158569336, -11.620648384094238, -11.807802200317383, -11.955061912536621, -11.538698196411133, -10.995415687561035, -10.638959884643555, -11.329093933105469, -10.720544815063477, -11.54542350769043, -9.438825607299805, ...]
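The two lengths may not be directly comparable. 13 equals the number of characters in the passage, while 128 looks like a padded input length that would also cover special tokens and the question. This is an assumption based only on the numbers above and is not confirmed in the thread. Either way, an extractive QA head typically selects a span by combining a start score with an end score. A minimal Go sketch of that selection, using the 14 Python values shown above rounded to two decimals; `bestSpan` and `maxLen` are illustrative names, not spaGO or Transformers API:

```go
package main

import "fmt"

// bestSpan returns the (start, end) pair, end inclusive, that maximizes
// startLogits[start] + endLogits[end] with start <= end < start+maxLen.
// It returns (-1, -1, 0) when no valid pair exists.
func bestSpan(startLogits, endLogits []float64, maxLen int) (int, int, float64) {
	bestS, bestE := -1, -1
	best := 0.0
	for s := range startLogits {
		for e := s; e < len(endLogits) && e < s+maxLen; e++ {
			score := startLogits[s] + endLogits[e]
			if bestS == -1 || score > best {
				bestS, bestE, best = s, e, score
			}
		}
	}
	return bestS, bestE, best
}

func main() {
	// Rounded prefix of the Python logits; the remaining values are elided
	// in the post, so this is only a partial view.
	start := []float64{7.53, -11.15, -11.52, -11.72, -11.91, -12.04, -11.45,
		-11.87, -11.32, -10.27, -10.09, -11.07, -6.75, -11.26}
	end := []float64{8.14, -11.65, -11.49, -11.63, -11.62, -11.81, -11.96,
		-11.54, -11.00, -10.64, -11.33, -10.72, -11.55, -9.44}

	s, e, score := bestSpan(start, end, 30)
	fmt.Println("best start:", s, "best end:", e, "score:", score)
}
```

On these values the top-scoring pair is index 0 for both start and end. In BERT-style inputs position 0 is normally the `[CLS]` token rather than passage text, so it may be worth checking how the Python script maps indices back to the passage (and which positions it excludes) before comparing its logits with spaGO's.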

Jan 05 '22 01:01 Tonghua-Li
