vecalign
Improved Sentence Alignment in Linear Time and Space
I've tried to re-run the project. However, when I tried to use [LASER](https://github.com/facebookresearch/LASER) to embed my overlaps file with the command line: ```bash ./embed.sh ~/source/samsung/vecalign/bleualign_data/overlaps.vi ~/source/samsung/vecalign/bleualign_data/overlaps.vi.emb [vi] ``` or ```bash python...
Thanks for sharing the excellent source code. I am confused about the vector half function: ``` def downsample_vectors(vecs1): a, b, c = vecs1.shape half = np.empty((a, b // 2, c),...
@thompsonb, I'm trying to replicate the work done in your paper, in particular the results in Table 1. How did you convert the format of the dataset that you...
In the make_del_knob function, when the size product (e_size * f_size) is smaller than the sample_size (20000 by default), the script ends up calculating the similarity score for all combinations...
The embedding output is 1.4 TB, which is too large to load into memory as a single array. Any tips for this?
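One possible approach to the memory problem above is to memory-map the embedding file instead of reading it whole. The sketch below is not part of vecalign: `open_embeddings` and `iter_chunks` are hypothetical helper names, and it assumes the file is a flat dump of float32 vectors with 1024 dimensions each (the layout LASER is generally understood to write). Adjust `dim` and `dtype` if your file differs.

```python
import numpy as np


def open_embeddings(path, dim=1024, dtype=np.float32):
    """Memory-map a raw embedding file as an (n_rows, dim) array.

    Assumption: the file is a headerless, row-major dump of `dtype` values.
    Nothing is read into RAM until rows are actually accessed.
    """
    flat = np.memmap(path, dtype=dtype, mode="r")
    if flat.size % dim != 0:
        raise ValueError(f"file size is not a multiple of dim={dim}")
    return flat.reshape(-1, dim)


def iter_chunks(emb, chunk_rows=100_000):
    """Yield in-memory copies of consecutive row blocks of a memory-mapped array."""
    for start in range(0, emb.shape[0], chunk_rows):
        # np.array copies only this slice into RAM.
        yield np.array(emb[start:start + chunk_rows])
```

With this, `emb = open_embeddings("overlaps.vi.emb")` gives an array-like object that can be indexed row by row, and `iter_chunks(emb)` processes it in blocks that fit in memory. Whether the rest of the alignment pipeline can work on such blocks without further changes is not something this sketch addresses.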
Hi, in the evaluation script (score.py), specifically here: https://github.com/thompsonb/vecalign/blob/ca96a30716f12241e14f836b06705107c771987c/score.py#L57C5-L57C5 I've noticed that the for loop iterates over the variable "testalign", which should contain the alignment generated by...
Perhaps a silly question, but in the demo, it seems like you create the files with the overlapping sentences from the dev and the test files. In my case, I just...
I'm not sure if you're taking PRs or not. If so, let me know and I'll submit some fixes that will be helpful for getting started.