
python-Levenshtein can't build wheel

Open tnhaider opened this issue 4 years ago • 20 comments

Just a heads up: I cannot install prosodic because pip fails to build a wheel for python-Levenshtein.

It looks like multiple people have this problem, but nothing is being done about it: https://github.com/ztane/python-Levenshtein/issues

Is there a chance you could switch to a different implementation of Levenshtein distance?

Cheers.

tnhaider avatar Oct 08 '20 00:10 tnhaider

Thanks for pointing this out. Will try to fix today. Sorry to leave prosodic a little rusty btw: let me see if I can respond to these issues soon.

quadrismegistus avatar Oct 08 '20 09:10 quadrismegistus

Are you on Mac OSX?

quadrismegistus avatar Oct 08 '20 09:10 quadrismegistus

I'm looking around and I can't find any levenshtein packages that do exactly what prosodic needs besides the one included here. Also reluctant to give up its written-in-C speeds.

What I could do is try to get a conda-based installation of prosodic working. That would allow conda install python-levenshtein which should be able to install a precompiled binary for any OS. Not sure if that's overkill for this problem though

quadrismegistus avatar Oct 08 '20 10:10 quadrismegistus

Thanks for your quick reply.

I'm on CentOS Linux in a Cluster. ;)

Conda might actually work. I've been able to install levenshtein through that.

And there is no hurry. I implemented my own data driven prosody detection system for poetry, and I'd like to use prosodic as baseline. Your litlab dataset has already proved quite useful.

Best, TH

tnhaider avatar Oct 08 '20 17:10 tnhaider

Ok, let me know if you run into any issues installing prosodic via conda? If it works, I may look into adding prosodic as a conda package so it can list its conda dependencies.

Also, wow, that sounds really interesting. The world of computational prosody is small so it's always nice to hear of new stuff going on! Are you working primarily with German poetry? And are you taking a rule-based or a machine learning approach? Would love to chat more sometime.

quadrismegistus avatar Oct 09 '20 09:10 quadrismegistus

I am working on both English and German.

And my current best system is a multi-task bilstm-crf with pretrained syllable embeddings.

We also annotated a sizeable amount of gold data for both languages. I can send you the current manuscript via mail if you'd like.

tnhaider avatar Oct 09 '20 10:10 tnhaider

Very interesting! I'd love to read more. I'm @ [email protected]

quadrismegistus avatar Oct 09 '20 10:10 quadrismegistus

I repackaged the Levenshtein library to use wheels so you don't have to compile it. I haven't tested it much, but you can install it with the command below; maybe it'll fix your issue.

pip install levenshtein

polm avatar Jan 11 '21 14:01 polm

Hi Ryan,

Thanks for having a look at it.

Unfortunately, the dependency in prosodic is still python-Levenshtein, so the install of prosodic fails regardless.

Is there a way I can change the dependency, or should I just compile the latest build?

Thanks, Tom

tnhaider avatar Jan 23 '21 11:01 tnhaider

Alright, I just got your new levenshtein, cloned the repo and started prosodic.py itself. Seems to work so far.

tnhaider avatar Jan 23 '21 11:01 tnhaider

Ok, it seems I also get a 'tagged_samples' error:

>> [0.0s] prosodic:en$ /corpus ../corpora/corppoetry_en/en.whitman.txt

	[please type a line of text, or enter one of the following commands:]
		/text	load a text
		/corpus	load folder of texts
		/paste	enter multi-line text

		/show	show annotations on input
		/tree	see phonological structure
		/query	query annotations

		/parse	parse metrically
		/meter	set the meter used for parsing
		/eval	evaluate this meter against a hand-tagged sample
		/maxent	learn weights for meter using maxent

		/save	save previous output to file (except for /weight and /weight2; see /weightsave)
		/scan	print out the scanned lines
		/report	look over the parse outputs
		/stats	save statistics from the parser

		/mute	hide output from screen
		/exit	exit

>> [17.88s] prosodic:en$ /eval
Traceback (most recent call last):
  File "prosodic.py", line 558, in <module>
    path=os.path.join(dir_prosodic,config['folder_tagged_samples'])
KeyError: 'folder_tagged_samples'

The same happens if I execute prosodic.py from the parent directory.

I tried renaming the key in the path setup to 'tagged_samples', but that only changes the key name in the error; the new key is not found either.

557                 elif text.startswith('/eval'):
558                         path=os.path.join(dir_prosodic,config['folder_tagged_samples'])
559                         fn=None

I am not sure what to do with this in the config:

# ############################################
# @DEPRECATED
# # PATHS USED BY PROSODIC
# #
# # If these are relative paths (no leading /),
# # they are defined from the point of view of
# # the root directory of PROSODIC.
# #
# # Folder used as the folder of corpora:
# # [it should contain folders, each of which contains text files]
# folder_corpora='corpora/'
# #
# # Folder to store results within (statistics, etc)
# folder_results='results/'
# #
# # Folder in which tagged samples (hand-parsed lines) are stored:
# folder_tagged_samples = 'tagged_samples/'
# ############################################

tnhaider avatar Jan 23 '21 12:01 tnhaider
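
The `KeyError` above comes from indexing `config` with a key whose setting is commented out. A minimal sketch of a more tolerant lookup follows. It is not prosodic's actual code: the helper name `tagged_samples_dir` is hypothetical, `config` and `dir_prosodic` are borrowed from the traceback, and the default value is taken from the deprecated config block quoted above.

```python
import os


def tagged_samples_dir(config: dict, dir_prosodic: str,
                       default: str = "tagged_samples/") -> str:
    """Return the tagged-samples folder, tolerating a missing config key."""
    # dict.get() avoids the KeyError when the setting is commented out
    relative = config.get("folder_tagged_samples", default)
    # Relative paths are taken from the root directory of PROSODIC
    return os.path.join(dir_prosodic, relative)
```

This only avoids the missing-key crash; the folder it returns still has to exist on disk.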

hm, let me look into all this. does uncommenting the # folder_tagged_samples = 'tagged_samples/' line in prosodic/config.py change things? are you using pip version? if so, does installing from repo (pip install -U git+https://github.com/quadrismegistus/prosodic) help? Been a while since I dove back into code; will try to do that now

All best, Ryan


quadrismegistus avatar Jan 23 '21 12:01 quadrismegistus

Will try asap.

If we can get this done by Monday, you will be in the experiments section of my paper, which got accepted to EACL, btw. :)

I basically just need to figure out how I can evaluate my manual annotation against prosodic. I want to determine the accuracy of the meter annotation on syllable and line level.

tnhaider avatar Jan 23 '21 12:01 tnhaider
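
Syllable-level and line-level accuracy, as described above, could be computed roughly as follows. This is a sketch under assumptions, not the evaluation used in the paper or in prosodic: it assumes each line's scansion is a string with one character per syllable (for example `"-+-+"`), that gold and predicted lines are already aligned, and `meter_accuracy` is a hypothetical name.

```python
from typing import Sequence, Tuple


def meter_accuracy(gold: Sequence[str],
                   predicted: Sequence[str]) -> Tuple[float, float]:
    """Return (syllable_accuracy, line_accuracy) for parallel scansion lists."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same number of lines")
    syll_total = syll_correct = line_correct = 0
    for g, p in zip(gold, predicted):
        # Count over the longer string so missing or extra syllables are errors
        syll_total += max(len(g), len(p))
        syll_correct += sum(1 for gs, ps in zip(g, p) if gs == ps)
        # A line counts as correct only if every syllable matches
        line_correct += int(g == p)
    syll_acc = syll_correct / syll_total if syll_total else 0.0
    line_acc = line_correct / len(gold) if gold else 0.0
    return syll_acc, line_acc
```

Position-by-position comparison is the simplest choice; if the two sources disagree on syllable counts, an alignment step before scoring may be more appropriate.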

Nope, pip install -U git+https://github.com/quadrismegistus/prosodic doesn't work because it wants to install `python-Levenshtein`, for which I can't build the wheel.

Uncommenting folder_tagged_samples = 'tagged_samples/' in the config I get either

>> [0.0s] prosodic:en$ /text tagged_samples/tagged-sample-litlab-2016.txt
<file not found>

or

>> [24.65s] prosodic:en$ /eval ../tagged_samples/tagged-sample-litlab-2016.txt
Traceback (most recent call last):
  File "prosodic.py", line 563, in <module>
    for _fn in os.listdir(path):
FileNotFoundError: [Errno 2] No such file or directory: 'tagged_samples/'

Loading text with /text corppoetry_en/en.shakesspeare.txt does work however. Then doing a /parse also works. But then doing /eval gets me an error.

>> [8.67s] prosodic:en$ /eval
Traceback (most recent call last):
  File "prosodic.py", line 563, in <module>
    for _fn in os.listdir(path):
FileNotFoundError: [Errno 2] No such file or directory: 'tagged_samples/'

tnhaider avatar Jan 23 '21 13:01 tnhaider
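
The `FileNotFoundError` above reports a bare relative path, so whether the folder is found may depend on the working directory. A sketch of resolving the folder against a fixed base directory and failing with the full path is shown below. It is an assumption about a possible fix, not prosodic's code; `list_tagged_samples` and its parameters are hypothetical.

```python
import os
from typing import List


def list_tagged_samples(folder: str, base_dir: str) -> List[str]:
    """List files in `folder`, resolving relative paths against `base_dir`."""
    path = folder if os.path.isabs(folder) else os.path.join(base_dir, folder)
    if not os.path.isdir(path):
        # Report the fully resolved path so a wrong working directory is obvious
        raise FileNotFoundError(
            f"tagged samples folder not found: {os.path.abspath(path)}"
        )
    return sorted(os.listdir(path))
```

A caller could pass `os.path.dirname(os.path.abspath(__file__))` as `base_dir` so the result does not depend on where the script is started from.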

After importing it into my own script, it looks promising now.

Nice tutorial btw!

tnhaider avatar Jan 23 '21 13:01 tnhaider

@polm Thanks for your help! So should I just change "python-Levenshtein" in requirements.txt to "levenshtein"?

quadrismegistus avatar Jan 23 '21 16:01 quadrismegistus

Yes, that should solve the problem.

I am not sure about the config path change though. I might have a look at that later.

tnhaider avatar Jan 23 '21 17:01 tnhaider

@quadrismegistus Sure, that should work.

I am not sure about the future of that particular pip package yet. The version that's there now won't go away, but maybe development will resume at the old name later.

polm avatar Jan 24 '21 04:01 polm

Did we ever figure this out?

quadrismegistus avatar Dec 06 '21 10:12 quadrismegistus

My wheels are still up if you want to use them and I don't see that changing, but I haven't done any other work on the project and don't have time to work on it going forward.

For the original levenshtein project, the maintainer contacted me about taking over, but by that point I was already busy with other work and had to refuse. Someone else stepped up not long after and volunteered to be maintainer but hasn't gotten a response.

You can follow progress here.

https://github.com/ztane/python-Levenshtein/issues/61

polm avatar Dec 07 '21 09:12 polm