rat-sql
Asking for pre-trained model
Hi,
Can someone share the best pre-trained rat-sql model (the best model_checkpoint)?
Thank you all.
I have a BERT model trained on Spider that I can share.
OK @kalleknast, can you share it with me, please?
You can get it here. Please reply when you have downloaded it, so that I can remove it from Google Drive.
Thank you very much, @kalleknast.
@mellahysf @kalleknast Hi! I am also looking for the pre-trained models but I can't find them. It seems they have not been published, am I right? If you have a well-working trained model, could you share it with me too? Especially the GloVe model on Spider. Thank you so much!
I don't have a GloVe model trained on Spider.
@kalleknast BERT is also OK if a CPU is enough for inference. Thank you.
@kalleknast @mellahysf Can you please share the pre-trained model for BERT with me too?
See issue #32. Post a message when you've downloaded it (so that I can remove it, since it is taking up a lot of space on my Google Drive).
I think that specific model was giving about 60% accuracy. Am I right, @kalleknast? Thank you so much for sharing it.
I think so. It's been some time since I checked the performance. Definitely not 65% or whatever the SOTA is.
Hi, I tried running the model but I am getting this error:
RuntimeError: Error(s) in loading state_dict for EncDecModel:
    size mismatch for decoder.rule_logits.2.weight: copying a param with shape torch.Size([94, 128]) from checkpoint, the shape in current model is torch.Size([97, 128]).
    size mismatch for decoder.rule_logits.2.bias: copying a param with shape torch.Size([94]) from checkpoint, the shape in current model is torch.Size([97]).
    size mismatch for decoder.rule_embedding.weight: copying a param with shape torch.Size([94, 128]) from checkpoint, the shape in current model is torch.Size([97, 128]).
Did you change the architecture, @kalleknast? Or do you have any idea how to solve that?
@Muradean The model was trained in October last year. I haven't trained and uploaded any other rat-sql model since then.
The error could be due to a mismatch of the decoder vocabulary. I'm guessing that the model was trained with a decoder vocabulary of size 94, but a vocab of size 97 is expected. It may be due to some change to the Spider dataset (e.g. the addition of three new tokens) that occurred after the model was trained.
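If it helps to confirm this, here is a minimal sketch for checking which rule-vocabulary size the shared checkpoint was saved with. The checkpoint filename and the exact layout of the saved dict are assumptions; the key names come from the error message above.

```python
import torch

# Load the shared checkpoint on the CPU; "model_checkpoint" is a placeholder
# for the actual filename of the downloaded file.
ckpt = torch.load("model_checkpoint", map_location="cpu")

# Some checkpoints nest the weights under a "model" key; fall back to the
# top-level dict if that key is absent.
state_dict = ckpt.get("model", ckpt) if isinstance(ckpt, dict) else ckpt

# The first dimension of the rule embedding is the decoder rule-vocabulary
# size (94 in the shared checkpoint, 97 in a default preprocessing run).
print(state_dict["decoder.rule_embedding.weight"].shape[0])
```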
I trained two models, with and without an expanded dataset; however, I think the one I uploaded was trained on the original (unexpanded) Spider dataset. If I hadn't, more people would have reported the same issue as you. I think the only solution is to train a new model. However, I dropped rat-sql for another project, so I won't do it.
Thanks a lot for the reply and the clarification. I really appreciate it, even though there doesn't seem to be a quick fix for it.
If somebody could provide a new pre-trained model I would be very grateful.
OK, after looking through the code I realized that the _fs parameter in rat-sql/configs/spider/nl2code-bert.jsonnet is responsible for picking the .asdl file in rat-sql/ratsql/grammars/, which can be one of Spider.asdl, Spider_f1.asdl, or Spider_f2.asdl.
By default, when one clones the repo and runs the Spider-BERT model, the .asdl picked is Spider_f2.asdl, which currently yields 97 rules; however, your model, @kalleknast, has 94 rules.
The number of rules generated from the .asdl (where I am getting the mismatch) can be seen, when you run the Docker container, in /app/data/spider/nl2code,output_from=true,fs=2,emb=bert,cvlink/grammar_rules.json.
I tried the preprocessing step again, this time using Spider.asdl, but the resulting grammar_rules.json ends up with 103 rules (so it also gives a mismatch error at inference).
Finally, I changed _fs to pick Spider_f1.asdl and repeated the preprocessing step, but the generated grammar had 0 rules... To get around that I did a dirty quick fix: I renamed Spider_f1.asdl to Spider_f2.asdl and reset _fs to 2. However, the generated grammar then had 73 rules. None of these values (73, 97, 103) matches the 94 rules. Do you remember doing anything else when training on the unexpanded dataset?
Thanks
@Muradean
The model was trained with local _fs = 2:

local _base = import 'nl2code-base.libsonnet';
local _output_from = true;
local _fs = 2;

However, I checked grammar_rules.json and see that it has 94 rules (len(data['all_rules'])). Spider_f2.asdl seems to be from July 11, 2020.
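For anyone who wants to reproduce that count, a minimal sketch; the path is the one from the Docker setup mentioned earlier in this thread, so adjust it to wherever your preprocessing run wrote the file.

```python
import json

# Output of the preprocessing step inside the Docker container; adjust the
# path to your own run's nl2code,... directory if it differs.
path = "/app/data/spider/nl2code,output_from=true,fs=2,emb=bert,cvlink/grammar_rules.json"

with open(path) as f:
    data = json.load(f)

# The checkpoint shared in this thread was trained with 94 rules.
print(len(data["all_rules"]))
```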
First, @kalleknast, thanks a lot for the reply.
Then there must be something happening in my preprocessing stage that causes grammar_rules.json to have 97 rules instead of 94.
Could you please share your 'nl2code,output_from=true,fs=2,emb=bert,cvlink' directory?
That would let me see which grammar rules differ and any other problems that might be going on.
Thanks again
@Muradean
You can get the nl2code,output_from=true,fs=2,emb=bert,cvlink directory here.
I noticed that the actual model is not linked to in this thread. It is here in case it is still useful and someone wants it.
THANKS!
For anyone who might encounter this problem in the future, these were the 3 extra rules I had: ['table_unit*', 5], ['table_unit*', 6], ['table_unit*', 7].
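If you need to track down which rules differ in your own setup, here is a small sketch that diffs two grammar_rules.json files; both paths are placeholders, and the 'all_rules' key is the one mentioned above.

```python
import json

def load_rules(path):
    # Serialize each rule back to a JSON string so that rules (which are
    # lists) can be stored in a set and compared.
    with open(path) as f:
        return {json.dumps(rule) for rule in json.load(f)["all_rules"]}

# Placeholder paths: your own preprocessing output vs. the shared directory.
mine = load_rules("my_run/grammar_rules.json")
theirs = load_rules("shared_run/grammar_rules.json")

print("extra in mine:", sorted(mine - theirs))
print("missing from mine:", sorted(theirs - mine))
```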
I have a BERT model trained on Spider that I can share.
Could you please share the BERT model trained on Spider again? The earlier link is no longer valid. Thanks!
It is here. See the post from Feb 3, 2021. Unless I'm lost and you're talking about some other model.