yalign
Example not working
Hi. I'm trying to use yalign. To get started I read the docs and ran:
sudo pip install yalign
Followed by
wget https://raw.githubusercontent.com/machinalis/yalign/develop/data/models/0.1/en-es.tar.gz
tar -xvzf en-es.tar.gz
And at last:
yalign-align en-es http://en.wikipedia.org/wiki/Antiparticle http://es.wikipedia.org/wiki/Antipart%C3%ADcula
Nothing was written to stdout. Is that expected?
If it's important: I'm using python 2.7.9 on Debian.
Thank you for your time.
Hey, I've been digging a little bit into this.
First of all, I've noticed that we don't pin a version of Scikit-Learn. This means pip installs the most recent release, and our code may be outdated in some places.
It also means that the trained model we provide may or may not work with the Scikit-Learn version you're using. I did a clean installation and it didn't work. If this is the case for you, re-building the model is necessary. Fortunately there's a tutorial on how to generate a model.
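A rough way to spot this kind of mismatch is to compare the installed Scikit-Learn version with the one the model was trained under. The sketch below is only an illustration: the thread does not say which release the provided model was built with, so the `0.14.1` in the example call is a made-up placeholder, and `parse_version` and `same_release_series` are hypothetical helper names, not part of yalign.

```python
import re


def parse_version(text):
    """Turn '0.17.1' into (0, 17, 1); stops at suffixes such as 'dev0'."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def same_release_series(installed, trained_with):
    """True when both versions share the same major and minor numbers."""
    return parse_version(installed)[:2] == parse_version(trained_with)[:2]


# "0.17.1" is the version reported later in this thread;
# "0.14.1" is a placeholder for the unknown training version.
if not same_release_series("0.17.1", "0.14.1"):
    print("Version series differ: the pickled model may need re-building.")
```

In a real check the first argument would come from `sklearn.__version__`. A match on major and minor numbers is a heuristic, not a guarantee that a pickled model will load.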
After that you'll run into another issue. Apparently the code that downloads the data from a URL isn't following redirections, and Wikipedia is now redirecting all http content to https. This means that you could try the https URL to avoid that problem. Another option would be to use plain text files instead of URLs.
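If you build the command line from a script, the http-to-https workaround can be sketched as below. `force_https` is a hypothetical helper name, not something yalign provides; it only rewrites the scheme and leaves everything else, including plain file paths, untouched.

```python
try:
    from urllib.parse import urlsplit, urlunsplit  # Python 3
except ImportError:
    from urlparse import urlsplit, urlunsplit  # Python 2


def force_https(url):
    """Return `url` with an 'http' scheme swapped for 'https'."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


for address in ("http://en.wikipedia.org/wiki/Antiparticle",
                "http://es.wikipedia.org/wiki/Antipart%C3%ADcula"):
    print(force_https(address))
```

The rewritten URLs can then be passed to yalign-align in place of the original ones.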
Even after all that, you might encounter another issue. I've looked at the tokenizer code in the project and it might be outdated. It wasn't working on my clean installation, but I didn't have enough time to debug it. If this is the case for you, I've prepared a branch with some fixes that got it running for me. To use that version instead of the one on PyPI, you'll have to do this:
Remove your version:
pip uninstall yalign
Get the code:
git clone -b issue-6-empty-response https://github.com/machinalis/yalign.git
Install this version of the code:
pip install -e yalign
Hope this gets you closer to your objective. Let us know how it goes.
Worked for me. Many thanks!
Hi there! This is what I got when I ran exactly the same code on Debian 9. Could anyone tell me what went wrong here? Many thanks!
Successfully installed scikit-learn-0.17.1
root@instance-4:~# yalign-align en-es https://en.wikipedia.org/wiki/Antiparticle https://es.wikipedia.org/wiki/Antipart%C3%ADcula
/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py:386: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and willraise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
  DeprecationWarning)
Traceback (most recent call last):
  File "/usr/local/bin/yalign-align", line 7, in <module>
    exec(compile(f.read(), __file__, 'exec'))
  File "/root/yalign/scripts/yalign-align", line 66, in <module>
    pairs = model.align(document_a, document_b)
  File "/root/yalign/yalign/yalignmodel.py", line 130, in align
    alignments = self.align_indexes(document_a, document_b)
  File "/root/yalign/yalign/yalignmodel.py", line 138, in align_indexes
    alignments = self.document_pair_aligner(document_a, document_b)
  File "/root/yalign/yalign/sequencealigner.py", line 34, in __call__
    node = astar(problem, graph_search=True)
  File "/usr/local/lib/python2.7/dist-packages/simpleai/search/traditional.py", line 121, in astar
    viewer=viewer)
  File "/usr/local/lib/python2.7/dist-packages/simpleai/search/traditional.py", line 156, in _search
    expanded = node.expand()
  File "/usr/local/lib/python2.7/dist-packages/simpleai/search/models.py", line 105, in expand
    for action in self.problem.actions(self.state):
  File "/root/yalign/yalign/sequencealigner.py", line 68, in actions
    w = self.W(a, b)
  File "/root/yalign/yalign/sentencepairscore.py", line 57, in __call__
    score = self.classifier.score(a) * self.sign
  File "/root/yalign/yalign/svm.py", line 51, in score
    return float(self.svm.decision_function(vector))
  File "/usr/local/lib/python2.7/dist-packages/sklearn/svm/base.py", line 542, in decision_function
    dec = self._decision_function(X)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/svm/base.py", line 405, in _decision_function
    dec_func = self._dense_decision_function(X)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/svm/base.py", line 423, in _dense_decision_function
    self._dual_coef_, self._intercept_,
AttributeError: 'SVC' object has no attribute '_dual_coef_'