fromfilename plugin: proposed fix for #5218, some improvements
Description
b5216a0 is a proposed fix for #5218
We don't assume anymore that if the pattern lacks the "artist" group it also has the "title" group (this was causing the "title" KeyError, when the pattern containg only the ?P<track> group is used).
Instead, we check for existence of "title" before assigning "title_field", otherwise we let the last pattern (which will always match) to assign it. The only slight drawback is that files like "01.mp3" will also get "01" as title (besides the "1" track number), instead of having no title, but I guess that is not going to cause any major problem in the rest of the import procedure (e.g. track matching after a search). On the other hand, I guess allowing a file to have no title would require a substantial rewriting of the plugin, since the empty string is in BAD_TITLE_PATTERNS.
0966042 adds a log message about the pattern being tried
This is useful while debugging changes to the regexps in PATTERNS. I used loglevel "debug" for this message, should we use "debug" also for other messages (currently using level "info")?
e6b7735 refactors the regexps in PATTERNS
I reviewed the regexps (especially after the changes I made some time ago in 84cf336) and tried to make them cleaner.
- Allow " - " and "-" as separator between track/artist/title fields: that should cover two common filename conventions: using spaces (e.g.
01 - Artist Name - Title of the song) or not using spaces (e.g.01-artist_name-title_of_the_song). Some regexp groups are non-greedy, to avoid capturing unwanted leading/trailing spaces in artist/titles; we could think about some other enhancements, e.g. automatically converting underscores to spaces in artist/title names, but I guess the main use of the plugin is not to produce a perfectly formatted title, but a sufficient meaningful one to be used as reference in the following stages of the update procedure. - Allow
?P<tag>groups by making them optional in the first two patterns (instead of having two versions of the pattern, with or without tag). BTW I'm not sure what these groups are about: we optionally check for them in the plugin, but we never use them to assign any metadata. - Allow optional "." after the track number
- In the third pattern: also allow "_" as separator (e.g. for file names like
01_some_tack.mp3)
Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.
Could you by any chance add some tests that show the range of filenames that can be parsed? This would be very useful going forward.
Second @snejus here! Thanks for the PR too. Regex is so complicated and obtuse to many people, so it's always good to have as complete a testing regimen in place as possible, both to find bugs with the provided patterns and to ensure that the same strings are matched going forward with any changes.
Ok, I'll have a look at how the tests machinery works and I'll give it a try.
(Feel free to pull the first commit of this PR to quickly fix the crash bug, if you agree with the proposed solution)
I've added tests for the plugin, covering almost all the use cases for the regexps in PATTERNS (as re-written by the previous commits in this PR).
To set up mock objects for the items to be imported I haven't used the tools in /beets/test, only plain unittest.mock, so maybe there's a more beets-esque way to implement these: suggestions are welcome.
Can I have any feedback about this PR?
I'm wondering especially if anything has changed after adopting poetry regarding how tests should be implemented, thanks!
Apologies - I didn't see this since I did not receive a review request (best to do this whenever your PR is ready to be re-reviewed).
There have been a fair bit of changes merged in since the last commit here, so you want to merge / rebase on master first, and then address formatting/linting issues.
Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Is this still relevant?
Yes, I ran into this issue today and this change fixed it. Beets 2.1.0 / Linux.
@Vrihub I'd like to resurrect this PR. Do you feel like rebasing, fix the tests, etc??
I'll try to help as good as I can to get this merge ready. Thanks in advance!
@Vrihub I'd like to resurrect this PR. Do you feel like rebasing, fix the tests, etc??
Sure, first I have to switch my working copy to poetry. I'll give a shot at it later today or tomorrow.
OK, sorry for the delay (bad week), I've updated the PR to the latest changes in master, poetry etc.
I've been using this fix to the plugin during all this time and I haven't found problems when importing albums. The new tests seem to work fine too. So unless you have further suggestions, I think we should be ready to merge.
Note: with 0ac7fb4 I undo the fix implemented in https://github.com/beetbox/beets/pull/5907 because the case of a Null title is already handled (excluded) in the new version of the plugin contained in this PR.
The actual changes to the plugin look good to me!
Would you be open to rewriting the tests using pytest tho? It should be fairly straightforward and could significantly reduce boilerplate. I think most of the tests could be handled with a single pytest.mark.parametrize. I can also give it a try if you like.
In my opinion, using fixtures would make the tests more concise and easier to maintain over time.
Would you be open to rewriting the tests using pytest tho?
Ah sure, I'll take care of this.
Hi @snejus and @Vrihub, I fixed the changelog conflict and updated from master. Now ran into a tiny formatting issue which I solved in 152cafbf69de036253898a11349b8d59ecb6618b . @Vrihub maybe you want to rebase again to make this prettier. If not, all good. (I couldn't rebase/force-push to this branch but realized that too late).
By the way I'm using this feature in my main beets branch now and it works well so far :-)
Also had a look at the pytest refactor and like it. In short: This LGTM, @snejus might want to give it the final review! Thanks in advance to both of you!