Shadowrocket-ADBlock-Rules
                                
                                
                                
QA: Chinese model result does not match the Python version
I am using this Chinese model. When it runs locally in Python the output is correct, but the output from spaGO is not.
This looks similar to #101, but I cannot find the corresponding bool parameter for QA.
How can I turn off the behavior where the output is forced to be a distribution (sum must be 1), whereas in Python the output is unconstrained?
server := bert.NewServer(model)
answers := s.model.Answer(body.Question, body.Passage)
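To make the "forced distribution" versus "free output" distinction above concrete, here is a minimal Python sketch. It only illustrates how a softmax turns unconstrained scores into values that sum to 1. The scores are hypothetical, and whether spaGO's QA path normalizes its scores in exactly this way is an assumption, not something confirmed in this thread.

```python
import math

def softmax(logits):
    """Turn unconstrained scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for three candidate answers (not model output).
raw_scores = [2.0, 1.0, -3.0]
probabilities = softmax(raw_scores)

print(sum(raw_scores))     # raw scores are free: nothing constrains their sum
print(sum(probabilities))  # normalized scores sum to 1 (up to rounding)
```

Normalizing preserves the ranking of the candidates, so it changes the reported confidence values but not which span scores highest.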
Translated QA: Context: "My name is Clara, I live in Berkeley." Q: "What is my name?" A: "Clara"
The output is supposed to be
克拉拉
but I got
{
    "answers": [
        {
            "text": "我叫克拉拉,我住在伯克利。",
            "start": 0,
            "end": 13,
            "confidence": 0.2547743
        },
        {
            "text": "住在伯克利。",
            "start": 7,
            "end": 13,
            "confidence": 0.22960596
        },
        {
            "text": "我叫克拉拉,我住",
            "start": 0,
            "end": 8,
            "confidence": 0.1548344
        }
    ],
    "took": 1075
}
./bert-server server --repo=~/.spago --model=luhua/chinese_pretrain_mrc_roberta_wwm_ext_large --tls-disable
PASSAGE="我叫克拉拉,我住在伯克利。"
QUESTION1="我的名字是什么?"
curl -k -d '{"question": "'"$QUESTION1"'", "passage": "'"$PASSAGE"'"}' -H "Content-Type: application/json" "http://127.0.0.1:1987/answer?pretty"
                                    
                                    
                                    
                                
Thanks @Tonghua-Li for experimenting with spaGO on Chinese models! Let me take a look.
In the meantime, have you already checked whether the tokenization output matches the one in Python/Rust?
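One way to run the tokenization check suggested above is to dump the token list from each implementation and look for the first position where they differ. The sketch below assumes you have already exported both lists by some means. The helper name `first_mismatch` and the sample tokens are hypothetical and are not taken from either library.

```python
from itertools import zip_longest

def first_mismatch(tokens_a, tokens_b):
    """Return (index, token_a, token_b) for the first difference, or None."""
    for i, (a, b) in enumerate(zip_longest(tokens_a, tokens_b)):
        if a != b:
            return i, a, b
    return None

# Hypothetical token dumps, for illustration only.
spago_tokens = ["我", "叫", "克", "拉", "拉"]
python_tokens = ["我", "叫", "克拉", "拉"]

print(first_mismatch(spago_tokens, python_tokens))
```

A result of `None` means the two lists are identical. A length difference shows up as a mismatch against `None` at the first missing position.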
@matteo-grella, I am new to NLP and TensorFlow.
Here is the comparison between spaGO and Python; the lengths are different. Is there anything else I should check?
spaGO
startLogits length = 13
endLogits length = 13
Python
startLogits length = 128
[7.526767253875732, -11.148331642150879, -11.519922256469727, -11.718563079833984, -11.908085823059082, -12.035258293151855, -11.453763008117676, -11.869075775146484, -11.3169527053833, -10.268500328063965, -10.09360408782959, -11.069799423217773, -6.754544734954834, -11.262563705444336, ...]
endLogits length = 128
[8.138846397399902, -11.649543762207031, -11.485342025756836, -11.634878158569336, -11.620648384094238, -11.807802200317383, -11.955061912536621, -11.538698196411133, -10.995415687561035, -10.638959884643555, -11.329093933105469, -10.720544815063477, -11.54542350769043, -9.438825607299805, ...]
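The length difference above may not be an error in itself. The passage has 13 characters, which matches the spaGO length, while 128 looks like a padded maximum sequence length on the Python side. The sketch below shows how the two could be lined up before comparing them. It assumes the Python input is laid out as [CLS] question [SEP] passage [SEP] followed by padding, and that spaGO returns logits for the passage tokens only. Both points are assumptions that are not confirmed in this thread, and the helper names and values are hypothetical.

```python
def passage_logits(full_logits, question_len, passage_len):
    """Slice out the logits that belong to the passage tokens.

    Assumes the layout: [CLS] question [SEP] passage [SEP] [PAD]...
    """
    start = 1 + question_len + 1  # skip [CLS], the question, and the first [SEP]
    return full_logits[start:start + passage_len]

def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])

# Synthetic data: 128 padded positions, 8 question tokens, 13 passage tokens.
full = [-10.0] * 128
full[12] = -6.0  # pretend the best passage position sits at full index 12

sliced = passage_logits(full, question_len=8, passage_len=13)
print(len(sliced), argmax(sliced))  # prints "13 2"
```

After slicing, the two logit vectors have the same length, so the best start and end positions can be compared index by index.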