Richard Shin

Results 13 issues of Richard Shin

The README says:

> Content store caching is used by default for all external steps, and can be enabled for internal computations by providing suitable serialisation/deserialisation functions.

Are there any...

question

Even though all numbers should have been stripped out by disabling the generation of literals.

For choosing a table in Spider, we allow the model to point to the embedding for any of its columns or the embedding for the table itself: https://github.com/rshin/seq2struct/blob/e69c8eb182ec80770cde4cd369e1b698bc8e921a/seq2struct/models/spider_enc.py#L321-L328 However, when...
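To make the pointing scheme concrete, here is a minimal sketch of how scores over columns and tables could be folded into one distribution over tables. It is not the code in `spider_enc.py`: the function name `table_probs`, the argument layout, and the choice to exclude columns that belong to no table are all assumptions made for illustration.

```python
import math


def table_probs(column_scores, table_scores, column_to_table):
    """Fold pointer scores over columns and tables into per-table probabilities.

    Hypothetical helper, not taken from seq2struct.
    column_scores: one float per column.
    table_scores: one float per table (the table's own embedding).
    column_to_table: entry i is the table index of column i, or None if
        the column belongs to no table (assumption: such columns are skipped).
    """
    # Every pointable item is a (score, table index) pair.
    items = [
        (score, table)
        for score, table in zip(column_scores, column_to_table)
        if table is not None
    ]
    items += [(score, table) for table, score in enumerate(table_scores)]

    # Numerically stable softmax over all pointable items.
    max_score = max(score for score, _ in items)
    weights = [(math.exp(score - max_score), table) for score, table in items]
    total = sum(weight for weight, _ in weights)

    # Probability mass on a column counts toward the table that owns it;
    # mass on a table embedding counts toward that table directly.
    probs = [0.0] * len(table_scores)
    for weight, table in weights:
        probs[table] += weight / total
    return probs
```

With equal scores everywhere, a table with more columns receives more mass under this scheme, since each of its columns is a separate way to point at it.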

CoreNLP runs in the JVM in a separate process, which makes it annoying to use. spaCy should be a sufficient replacement for tokenization for looking up GloVe embeddings.

- [ ] Rewrite https://github.com/rshin/seq2struct/blob/master/seq2struct/models/lstm.py to use TorchScript instead
- [ ] Test all major models

Example:

```
In [6]: train_enc = json.loads(next(open('data/spider-20190205/nl2code-0401,output_from=false,emb=glove-42B,min_freq=50/enc/train.jsonl')))

In [7]: train_enc
Out[7]: {'column_to_table': {'0': None, '1': 0, '10': 1, '11': 2, '12': 2, '13': 2, '2': 0, '3': 0, '4': 0,...
```

Should mention in README:
- [ ] More detailed instructions about installing packages and dependencies

Fix `generate.sh`:
- [ ] Make sure that Python setup is correct (version, etc)
-...

- [x] Finish token generation mechanism
- [x] Create encoder
- [x] Copying tokens
- [x] Attention
- [x] Create code to read data
- [x] Bahdanau attention
- [x]...

This probably broke the CMake build because I needed to make some of the includes less relative, but I'll fix that later.