mirdata
mirdata copied to clipboard
Adding loader for Filosax
All checkboxes checked!
Hi mirdata team,
I'm struggling to pass some of the tests, despite my best efforts… Firstly, there is failure in buildpy36/37/38
due to the lack of exception when returning a cached_property
(a wrapped dictionary of note properties), for which I've tried to fix with different permutations of exceptions. Secondly, there is an error in the docs: AttributeError: 'Values' object has no attribute 'section_self_link'
for which I've been unable to rectify. Would it be possible to advise the best approaches to these problems? Also, if it were possible to have this looked at pre-ISMIR, I would be eternally grateful!
Cheers,
Dave
Hi mirdata team,
I'm struggling to pass some of the tests, despite my best efforts… Firstly, there is failure in buildpy36/37/38 due to the lack of exception when returning a cached_property (a wrapped dictionary of note properties), for which I've tried to fix with different permutations of exceptions. Secondly, there is an error in the docs: AttributeError: 'Values' object has no attribute 'section_self_link' for which I've been unable to rectify. Would it be possible to advise the best approaches to these problems? Also, if it were possible to have this looked at pre-ISMIR, I would be eternally grateful!
Cheers,
Dave
Hi Dave, thanks for the PR! We'll do our best to take a look in the next day or two, to have it ready by the end of the ISMIR tutorials.
@dave-foster quick question - in the download info, you mention a script to run:
python Scripts/Compile_Backing.py -version full
I had a quick look at the conversation in the issues about this - where does this script live now? One option is to point to e.g. another github repo of yours with the script. Another is to create a subfolder in mirdata, e.g. mirdata/scripts/preprocessing
and create a preprocess_filosax.py
script in there. What do you prefer?
Hi Rachel, thanks for looking at this - I'm reminded of what a bummer it is that we aren't all making our way to Bangalore to be able to chat in person!
The script that you mention is in the material that users download from the Zenodo site (see a couple of lines earlier for the "Lite" link). This works fine as a pre-processing step, so could stay like that, or be incorporated into mirdata, whichever works best. Should I upload the script somewhere to see what you reckon?
I would point out that if any of this gives the impression that I have the faintest idea what I'm doing, then please don't be deceived!
Hi @rabitt et al, a friendly bump on this PR, to see if anybody could help me out in getting it approved? I've returned to the code this week to make some updates, and it turns out that I still haven't acquired the skills to get it through the unit tests!
hi! could we help you merge this? I think it currently fails.
Thanks Marius, I'd kind of given up hope of it ever getting merged, so any help you could give would be greatly appreciated!
ok if you could add the preprocessing script, take a look at the missing tests and the errors in circleci then I can take a look
Thanks Marius, the preprocessing script is in there already, and the whole reason that I haven't previously been able to get it merged is that I don't understand exactly why it is failing the circleci tests! I seem to remember trying a whole bunch of things to get it to work, but being thwarted at every attempt.
if you could try another commit and push then I could check. I am unable to see the errors at the moment . also..what errors do you get when running the tests locally? is everything ok there?
Thanks Marius, I've re-committed and generated the failing test logs again. Hopefully you can see them now? From what I can make out, the problem seems to be with the code for accessing the note annotations. These are available for each of the sax tracks, but not the backing track. It is complaining because an exception is raised when the user tries to access annotations on a backing track - but then it also complains (I seem to remember) when an exception isn't raised (and it fails gracefully)! I've tried various re-formatting of this code to try and get it to conform.
there are two errors here:
- one unrelated with this dataloader. I can look into that.
- one related with this dataloader concerning the notes property:
@core.cached_property
def notes(self) -> Optional[List[Note]]:
"""The track's note list - only for Sax files
Returns:
* [Note] - ordered list of Note objects
"""
note_path = "" if self.annotation_path == None else self.annotation_path
try:
with open(note_path) as fhandle:
note_dict = json.load(fhandle)["notes"]
except FileNotFoundError:
> raise FileNotFoundError("Note data only available for Sax tracks.")
E FileNotFoundError: Note data only available for Sax tracks.
mirdata/datasets/filosax.py:257: FileNotFoundError
let me fix the first and then we will get back to you
Thanks Marius, I'll stand by and await further instruction. The issue with the notes property is what has been thwarting the PR since the first attempt in November 2021 (see my initial comment)! The code originally failed gracefully when the user attempted to access the Notes
annotations on a backing track, but the circleci tests required an exception to be raised. When I raise an exception, of any type/style, this also causes the circleci tests to fail. I'm sure that there is a different way of coding it so that it works - I was darned if I could find it, though…
I have fixed the problem on the main branch. so you can update this PR with the master
Thanks Marius, I've updated with the master. Let me know if you have any suggestions on how to reformat to avoid the exception/no-exception problem.
hi Dave. sorry I have been busy and I couldn't look into this. looking at the code could you check if the track is not a backing track and then return the notes otherwise return an empty dict/list?
Thanks Marius, that's what I was doing in the first place. I've implemented it again - as before, it now complains because it's not raising an exception! Any other ideas?
hi Dave! there is an easy fix to that. at the moment you are doing
if note_path == "":
print("Note data only available for Sax tracks.")
note_dict = {}
else:
with open(note_path) as fhandle:
note_dict = json.load(fhandle)["notes"]
return [Note(n) for n in note_dict]
I propose you return an empty note list or create list with a single element, a dummy empty note
if note_path == "":
print("Note data only available for Sax tracks.")
note_dict = {}
return [Note(n) for n in note_dict]
else:
with open(note_path) as fhandle:
note_dict = json.load(fhandle)["notes"]
return [Note(n) for n in note_dict]
Thanks Marius - I'm afraid that your code doesn't work either. Returning an empty list, or one with a single dummy Note (or anything), still gives an error that it's expecting an Exception to be raised. Raising any kind of Exception also gives an error. Any other ideas?
Hello @dave-foster! I think you are forgetting to indicate, in the tests/test_loaders.py
the testing track. You should edit this file, and add your test file in CUSTOM_TEST_TRACKS
as "filosax": "multitrack_02_sax_1"
. That should help.
Also, you might want to define notes
in the Track class like this:
@core.cached_property
def notes(self) -> List[Note]:
"""The track's note list - only for Sax files
Returns:
* [Note] - ordered list of Note objects
"""
return load_notes(self.annotation_path)
and then have a function to load the notes like that:
@io.coerce_to_string_io
def load_notes(fhandle: BinaryIO) -> List[Note]:
"""docs here
Returns:
* List[Notes]
"""
if fhandle:
note_dict = json.load(fhandle)["notes"]
return [Note(n) for n in note_dict]
else:
return [Note[{"a_start_time": 0.0}]]
or maybe unifying this function with the load_annotations
function you do have below.
Hope this helps!!
Codecov Report
Merging #536 (6ec3ef1) into master (8db5795) will increase coverage by
0.02%
. The diff coverage is98.24%
.
@@ Coverage Diff @@
## master #536 +/- ##
==========================================
+ Coverage 96.61% 96.64% +0.02%
==========================================
Files 49 49
Lines 6004 6106 +102
==========================================
+ Hits 5801 5901 +100
- Misses 203 205 +2
By Jove, he's got it! Brilliant, thanks @genisplaja. I had no idea I needed to add the info to the tests/test_loaders.py
file.
I'm left now with a fail on the codecov/patch
test, where it looks like it's complaining that the annotations for the backing track files (bars, chords, track name, etc.) aren't being tested. (I think they are tested in the test_to_jams
function though). The dataset is formatted so that these are attributes of the multi-tracks, and the run_track_tests
tests a (child) sax track, which doesn't have them directly (because there are multiple sax tracks for each multi-track file). Any ideas to get this to work?
Great @dave-foster, sorry for the delay in finding that out :)
See the following example: https://github.com/mir-dataset-loaders/mirdata/blob/c4759b4c75ab9a3b364cf9ccb62b7adb65cf2c10/tests/datasets/test_saraga_carnatic.py#L475
In this case, an additional testing example is loaded here to test the case when a particular annotation is not available for certain tracks. You can just leave the default testing track in test_loaders.py
as it is now, and momentarily load an additional one to cover the untested lines. No need to include the additional track in test_loaders.py
. If I'm not mistaken, it should work as the example I attached.
Let me know if you have further questions!
Finally passed all tests! Excellent guidance, @genisplaja - your example showed me what I needed to do in the test file. In the end, I combined it all into the test_metadata
function (which was previously unpopulated). A few functions still aren't covered (notably the sax
property of the multitrack, as it would need to know that there were only 2 sax tracks in the toy test data), but enough to scrape through 0.02% over the target! Let me know if there's anything else that's needed for the review to be approved.
Hello @dave-foster! Did you see my small review? Do you need help with this? Thanks!!
Hi @genisplaja, thanks for the nudge, and apologies for the delay getting your prescribed fixes incorporated into the code now. I've addressed all of the issues that you mention, and have added comments to those where I'm not quite sure how to proceed. Let me know what I need to do from this point? I can't tell you how much I appreciate your intervention to get this over the line!
No problem @dave-foster! Happy to help :) This looks almost ready, I just gave a response to the open comments in my last review, and I observed in the docs the following issue:
File: filosax.py
, line 150: In these cases where you use []
(scale_changes [int]
, loudness_curve [float]
) it is not well rendered in the docs. You need to use ()
instead! Same with the cases in which you use {}
(line 147). I'd use just (dict)
in this case. The rest of the docs look ready :)
Thanks again @genisplaja - I think that's everything covered now!
I just left a tiny comment but looks ready to me. I'll ping @nkundiushuti (and whoever else is interested) to take a look and we can merge this. Thank you very much for your work and patience @dave-foster!
thanks @dave-foster . I left some comment regarding the Note class which I think could have been implemented using the existing functionalities. however I am happy with it and I think it can be merged