Generators used for pre-training

Open wiskojo opened this issue 4 years ago • 3 comments

The paper mentions using a SQL-to-Text model and a Table-to-Text model to generate synthetic samples for pre-training. I would like to use these models to generate synthetic training examples for my own custom datasets. The weights for these models don’t appear to have been made public. Is there any way I can train them myself? I saw some code under relogic and pretrainkit that seems relevant, but I couldn’t figure out what data it uses or how to run it. Thanks!

wiskojo avatar Mar 31 '21 16:03 wiskojo

@wiskojo I also looked for the pre-training generators but couldn't find them. GraPPa, which also uses a data augmentation strategy, has not made its code available so far either.

PedroEstevesPT avatar Apr 01 '21 11:04 PedroEstevesPT

For the generator code, you can check out https://github.com/awslabs/gap-text2sql/blob/main/relogic/sql-to-text-train.py and https://github.com/awslabs/gap-text2sql/blob/main/relogic/entity-to-text-train.py, which are the SQL-to-Text generator and the Table-to-Text generator, respectively. I will upload some data samples to make them easier to understand.

Impavidity avatar Aug 23 '21 03:08 Impavidity

Could you upload a README explaining how to run the SQL-to-Text generator? Thanks a lot.

Fheon avatar Sep 23 '21 15:09 Fheon