mirdata icon indicating copy to clipboard operation
mirdata copied to clipboard

Adding loader for Filosax

Open dave-foster opened this issue 3 years ago • 24 comments

All checkboxes checked!

dave-foster avatar Nov 04 '21 00:11 dave-foster

Hi mirdata team,

I'm struggling to pass some of the tests, despite my best efforts… Firstly, there is failure in buildpy36/37/38 due to the lack of exception when returning a cached_property (a wrapped dictionary of note properties), for which I've tried to fix with different permutations of exceptions. Secondly, there is an error in the docs: AttributeError: 'Values' object has no attribute 'section_self_link' for which I've been unable to rectify. Would it be possible to advise the best approaches to these problems? Also, if it were possible to have this looked at pre-ISMIR, I would be eternally grateful!

Cheers,

Dave

dave-foster avatar Nov 04 '21 11:11 dave-foster

Hi mirdata team,

I'm struggling to pass some of the tests, despite my best efforts… Firstly, there is failure in buildpy36/37/38 due to the lack of exception when returning a cached_property (a wrapped dictionary of note properties), for which I've tried to fix with different permutations of exceptions. Secondly, there is an error in the docs: AttributeError: 'Values' object has no attribute 'section_self_link' for which I've been unable to rectify. Would it be possible to advise the best approaches to these problems? Also, if it were possible to have this looked at pre-ISMIR, I would be eternally grateful!

Cheers,

Dave

Hi Dave, thanks for the PR! We'll do our best to take a look in the next day or two, to have it ready by the end of the ISMIR tutorials.

rabitt avatar Nov 06 '21 14:11 rabitt

@dave-foster quick question - in the download info, you mention a script to run:

python Scripts/Compile_Backing.py -version full

I had a quick look at the conversation in the issues about this - where does this script live now? One option is to point to e.g. another github repo of yours with the script. Another is to create a subfolder in mirdata, e.g. mirdata/scripts/preprocessing and create a preprocess_filosax.py script in there. What do you prefer?

rabitt avatar Nov 06 '21 14:11 rabitt

Hi Rachel, thanks for looking at this - I'm reminded of what a bummer it is that we aren't all making our way to Bangalore to be able to chat in person!

The script that you mention is in the material that users download from the Zenodo site (see a couple of lines earlier for the "Lite" link). This works fine as a pre-processing step, so could stay like that, or be incorporated into mirdata, whichever works best. Should I upload the script somewhere to see what you reckon?

I would point out that if any of this gives the impression that I have the faintest idea what I'm doing, then please don't be deceived!

dave-foster avatar Nov 06 '21 14:11 dave-foster

Hi @rabitt et al, a friendly bump on this PR, to see if anybody could help me out in getting it approved? I've returned to the code this week to make some updates, and it turns out that I still haven't acquired the skills to get it through the unit tests!

dave-foster avatar Jan 19 '22 13:01 dave-foster

hi! could we help you merge this? I think it currently fails.

nkundiushuti avatar Jul 18 '22 12:07 nkundiushuti

Thanks Marius, I'd kind of given up hope of it ever getting merged, so any help you could give would be greatly appreciated!

dave-foster avatar Jul 18 '22 12:07 dave-foster

ok if you could add the preprocessing script, take a look at the missing tests and the errors in circleci then I can take a look

nkundiushuti avatar Jul 18 '22 13:07 nkundiushuti

Thanks Marius, the preprocessing script is in there already, and the whole reason that I haven't previously been able to get it merged is that I don't understand exactly why it is failing the circleci tests! I seem to remember trying a whole bunch of things to get it to work, but being thwarted at every attempt.

dave-foster avatar Jul 18 '22 13:07 dave-foster

if you could try another commit and push then I could check. I am unable to see the errors at the moment . also..what errors do you get when running the tests locally? is everything ok there?

nkundiushuti avatar Jul 18 '22 14:07 nkundiushuti

Thanks Marius, I've re-committed and generated the failing test logs again. Hopefully you can see them now? From what I can make out, the problem seems to be with the code for accessing the note annotations. These are available for each of the sax tracks, but not the backing track. It is complaining because an exception is raised when the user tries to access annotations on a backing track - but then it also complains (I seem to remember) when an exception isn't raised (and it fails gracefully)! I've tried various re-formatting of this code to try and get it to conform.

dave-foster avatar Jul 18 '22 14:07 dave-foster

there are two errors here:

  • one unrelated with this dataloader. I can look into that.
  • one related with this dataloader concerning the notes property:
    @core.cached_property
    def notes(self) -> Optional[List[Note]]:
        """The track's note list - only for Sax files
    
        Returns:
            * [Note] - ordered list of Note objects
    
        """
        note_path = "" if self.annotation_path == None else self.annotation_path
        try:
            with open(note_path) as fhandle:
                note_dict = json.load(fhandle)["notes"]
        except FileNotFoundError:
>           raise FileNotFoundError("Note data only available for Sax tracks.")
E           FileNotFoundError: Note data only available for Sax tracks.

mirdata/datasets/filosax.py:257: FileNotFoundError

let me fix the first and then we will get back to you

nkundiushuti avatar Jul 19 '22 14:07 nkundiushuti

Thanks Marius, I'll stand by and await further instruction. The issue with the notes property is what has been thwarting the PR since the first attempt in November 2021 (see my initial comment)! The code originally failed gracefully when the user attempted to access the Notes annotations on a backing track, but the circleci tests required an exception to be raised. When I raise an exception, of any type/style, this also causes the circleci tests to fail. I'm sure that there is a different way of coding it so that it works - I was darned if I could find it, though…

dave-foster avatar Jul 21 '22 08:07 dave-foster

I have fixed the problem on the main branch. so you can update this PR with the master

nkundiushuti avatar Jul 22 '22 09:07 nkundiushuti

Thanks Marius, I've updated with the master. Let me know if you have any suggestions on how to reformat to avoid the exception/no-exception problem.

dave-foster avatar Jul 22 '22 14:07 dave-foster

hi Dave. sorry I have been busy and I couldn't look into this. looking at the code could you check if the track is not a backing track and then return the notes otherwise return an empty dict/list?

nkundiushuti avatar Aug 05 '22 14:08 nkundiushuti

Thanks Marius, that's what I was doing in the first place. I've implemented it again - as before, it now complains because it's not raising an exception! Any other ideas?

dave-foster avatar Aug 07 '22 15:08 dave-foster

hi Dave! there is an easy fix to that. at the moment you are doing

        if note_path == "":
             print("Note data only available for Sax tracks.")
             note_dict = {}
         else:
             with open(note_path) as fhandle:
                 note_dict = json.load(fhandle)["notes"]
         return [Note(n) for n in note_dict]

I propose you return an empty note list or create list with a single element, a dummy empty note

        if note_path == "":
             print("Note data only available for Sax tracks.")
             note_dict = {}
             return [Note(n) for n in note_dict]
         else:
             with open(note_path) as fhandle:
                 note_dict = json.load(fhandle)["notes"]
         return [Note(n) for n in note_dict]

nkundiushuti avatar Aug 08 '22 09:08 nkundiushuti

Thanks Marius - I'm afraid that your code doesn't work either. Returning an empty list, or one with a single dummy Note (or anything), still gives an error that it's expecting an Exception to be raised. Raising any kind of Exception also gives an error. Any other ideas?

dave-foster avatar Aug 08 '22 10:08 dave-foster

Hello @dave-foster! I think you are forgetting to indicate, in the tests/test_loaders.py the testing track. You should edit this file, and add your test file in CUSTOM_TEST_TRACKS as "filosax": "multitrack_02_sax_1". That should help. Also, you might want to define notes in the Track class like this:

@core.cached_property
def notes(self) -> List[Note]:
   """The track's note list - only for Sax files

   Returns:
       * [Note] - ordered list of Note objects

   """
   return load_notes(self.annotation_path)

and then have a function to load the notes like that:

@io.coerce_to_string_io
def load_notes(fhandle: BinaryIO) -> List[Note]:
    """docs here

    Returns:
        * List[Notes]

    """
    if fhandle:
        note_dict = json.load(fhandle)["notes"]
        return [Note(n) for n in note_dict]
    else:
        return [Note[{"a_start_time": 0.0}]]

or maybe unifying this function with the load_annotations function you do have below.

Hope this helps!!

genisplaja avatar Aug 08 '22 20:08 genisplaja

Codecov Report

Merging #536 (6ec3ef1) into master (8db5795) will increase coverage by 0.02%. The diff coverage is 98.24%.

@@            Coverage Diff             @@
##           master     #536      +/-   ##
==========================================
+ Coverage   96.61%   96.64%   +0.02%     
==========================================
  Files          49       49              
  Lines        6004     6106     +102     
==========================================
+ Hits         5801     5901     +100     
- Misses        203      205       +2     

codecov[bot] avatar Aug 09 '22 18:08 codecov[bot]

By Jove, he's got it! Brilliant, thanks @genisplaja. I had no idea I needed to add the info to the tests/test_loaders.py file.

I'm left now with a fail on the codecov/patch test, where it looks like it's complaining that the annotations for the backing track files (bars, chords, track name, etc.) aren't being tested. (I think they are tested in the test_to_jams function though). The dataset is formatted so that these are attributes of the multi-tracks, and the run_track_tests tests a (child) sax track, which doesn't have them directly (because there are multiple sax tracks for each multi-track file). Any ideas to get this to work?

dave-foster avatar Aug 09 '22 18:08 dave-foster

Great @dave-foster, sorry for the delay in finding that out :)

See the following example: https://github.com/mir-dataset-loaders/mirdata/blob/c4759b4c75ab9a3b364cf9ccb62b7adb65cf2c10/tests/datasets/test_saraga_carnatic.py#L475 In this case, an additional testing example is loaded here to test the case when a particular annotation is not available for certain tracks. You can just leave the default testing track in test_loaders.py as it is now, and momentarily load an additional one to cover the untested lines. No need to include the additional track in test_loaders.py. If I'm not mistaken, it should work as the example I attached.

Let me know if you have further questions!

genisplaja avatar Aug 09 '22 18:08 genisplaja

Finally passed all tests! Excellent guidance, @genisplaja - your example showed me what I needed to do in the test file. In the end, I combined it all into the test_metadata function (which was previously unpopulated). A few functions still aren't covered (notably the sax property of the multitrack, as it would need to know that there were only 2 sax tracks in the toy test data), but enough to scrape through 0.02% over the target! Let me know if there's anything else that's needed for the review to be approved.

dave-foster avatar Aug 10 '22 15:08 dave-foster

Hello @dave-foster! Did you see my small review? Do you need help with this? Thanks!!

genisplaja avatar Sep 21 '22 15:09 genisplaja

Hi @genisplaja, thanks for the nudge, and apologies for the delay getting your prescribed fixes incorporated into the code now. I've addressed all of the issues that you mention, and have added comments to those where I'm not quite sure how to proceed. Let me know what I need to do from this point? I can't tell you how much I appreciate your intervention to get this over the line!

dave-foster avatar Sep 26 '22 16:09 dave-foster

No problem @dave-foster! Happy to help :) This looks almost ready, I just gave a response to the open comments in my last review, and I observed in the docs the following issue:

File: filosax.py, line 150: In these cases where you use [] (scale_changes [int], loudness_curve [float]) it is not well rendered in the docs. You need to use () instead! Same with the cases in which you use {} (line 147). I'd use just (dict) in this case. The rest of the docs look ready :)

genisplaja avatar Sep 27 '22 13:09 genisplaja

Thanks again @genisplaja - I think that's everything covered now!

dave-foster avatar Sep 28 '22 09:09 dave-foster

I just left a tiny comment but looks ready to me. I'll ping @nkundiushuti (and whoever else is interested) to take a look and we can merge this. Thank you very much for your work and patience @dave-foster!

genisplaja avatar Sep 28 '22 10:09 genisplaja

thanks @dave-foster . I left some comment regarding the Note class which I think could have been implemented using the existing functionalities. however I am happy with it and I think it can be merged

nkundiushuti avatar Oct 03 '22 15:10 nkundiushuti