Rntxt writing (and possibly higher up the workflow)
Picking up from https://github.com/MarkGotham/When-in-Rome/pull/47, here are two `rntxt` issues that seem appropriate to report here.
- AugmentedNet uses an explicit `/i` designation to indicate mixture into major. I quite like the idea and can certainly see how handling cases like `VI` gets easier with this, but we should try to avoid introducing such non-standard syntax. Perhaps a first step is to remove it in cases where it definitely makes no difference anyway (e.g., `viio7/i` > `viio7`); a sketch of such a rewrite appears after this list.
- Use of `Cad64` is encouraged. AugmentedNet does output `Cad64`, but also `I64` in cases where `Cad64` would seem appropriate. Can you review this corner? If it helps, I'm happy to work through some explicit rules for testing (e.g., `Cad64` moving to `V7`, but `V64` between `I` and `I6`), etc.; a second sketch below illustrates the idea.
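
For concreteness, here is a minimal sketch of the kind of rewrite I have in mind for the first point. The function name and the set of "safe" figures are hypothetical, and the set would need to be verified chord by chord:

```python
# Minimal sketch, not AugmentedNet code. viio7 and its inversions are
# spelled identically in a major key and its parallel minor, so dropping
# the "/i" changes no pitches; the set below is hypothetical and would
# need to be verified chord by chord.
SAFE_WITHOUT_SLASH_i = {"viio7", "viio65", "viio43", "viio42"}

def drop_redundant_parallel_tonicization(figure: str) -> str:
    """Rewrite e.g. 'viio7/i' -> 'viio7' when the '/i' makes no
    difference to the realized pitches."""
    base, slash, target = figure.partition("/")
    if slash and target == "i" and base in SAFE_WITHOUT_SLASH_i:
        return base
    return figure

assert drop_redundant_parallel_tonicization("viio7/i") == "viio7"
assert drop_redundant_parallel_tonicization("VI/i") == "VI/i"  # mixture: keep the /i
```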
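And a sketch of the testing rules for the second point, at the label level only (the names are hypothetical, and a real rule would also want the metric position from the score):

```python
def cadentialize(figures: list[str]) -> list[str]:
    """Sketch of the rules suggested above: a second-inversion tonic
    moving to the dominant is relabeled Cad64, while a passing V64
    between I and I6 is left untouched."""
    out = list(figures)
    for n, fig in enumerate(figures):
        nxt = figures[n + 1] if n + 1 < len(figures) else None
        if fig == "I64" and nxt in ("V", "V7"):
            out[n] = "Cad64"
    return out

assert cadentialize(["I", "IV", "I64", "V7", "I"]) == ["I", "IV", "Cad64", "V7", "I"]
assert cadentialize(["I", "V64", "I6"]) == ["I", "V64", "I6"]  # passing six-four stays
```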
No doubt I'll be back with more when I get a chance to look more closely ...
Thanks for the reports, @MarkGotham!
- Yes. It should be easy to postprocess some annotations (like the `viio7/i` -> `viio7` you suggested). There is one caveat, though. The way that the vocabulary of annotations is defined is not entirely how it would look in a regular `rntxt` file. For example, there is a reason for `viio7/i` to be a "tonicization of `i`" and not `viio7/I`. I considered the chords that are not diatonic to a key to be nonexistent in that key. Taking the example of the diminished seventh, `viiø7/I` would be written as `viiø7`; but `viio7/I` would be written as `viio7/i`, because it "doesn't exist" in `I` (the first sketch after this list makes this convention concrete). The rationale for this is to force the tonicization finder of the model to think that we are deviating briefly to the parallel key. Currently, the annotations respect the idiosyncrasies of the underlying model. Of course, these can be overwritten for an easier presentation in the `rntxt` file, but I am still debating internally whether it is best to faithfully represent the tonal representation of the model, or the annotations that would be more idiomatic to read. This is the kind of thing that needs a beer and a discussion :).
- It does output `Cad64`, but as you noticed, the number of `Cad64` annotations output by the model is extremely low. In fact, it only outputs ~6% of the `Cad64` in the test set. I attribute this to the fact that several datasets do not encode `Cad64` chords (my Haydn annotations, for example). I suspect that confuses the model, and the performance on this particular label is very low at the moment. It is possible, as you mention, to write some rules to produce `Cad64` explicitly based on the context. I prefer to keep the amount of tampering with the model's predictions to a minimum, but I understand the value of those rules. Do you think it would be possible to implement this post-processing as an external module to AugmentedNet? For example, given the original `rntxt` provided by the model and the score, determine additional chords that should change from `I64` to `Cad64` (the second sketch below outlines such a module). The real long-run solution to this problem is that we need more examples of `Cad64` in the training set(s); then the neural network will do a better job. Also, I'm happy to report that no other state-of-the-art model outputs or even considers `Cad64` chords in its vocabulary. Thus, 6% is still better than 0% :). Hopefully, other models will adopt this and the field will move forward.
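
To make the redirection convention concrete, here is a small sketch; the diatonic tables are abridged and the names are illustrative, not AugmentedNet's actual internals:

```python
# Abridged, illustrative tables: which labels are diatonic to a key.
DIATONIC_IN_MAJOR = {"I", "ii", "iii", "IV", "V", "V7", "vi", "viio", "viiø7"}
DIATONIC_IN_MINOR = {"i", "iio", "III", "iv", "V", "V7", "VI", "viio", "viio7"}

def vocabulary_label(numeral: str, key: str) -> str:
    """Spell `numeral` relative to the local `key` ('I' for major,
    'i' for minor) the way the vocabulary does: a chord that is not
    diatonic to the key is treated as nonexistent there and spelled
    as a brief tonicization of the parallel key instead."""
    in_major = key[0].isupper()
    diatonic = DIATONIC_IN_MAJOR if in_major else DIATONIC_IN_MINOR
    if numeral in diatonic:
        return numeral                       # plain diatonic label
    parallel = key.lower() if in_major else key.upper()
    return f"{numeral}/{parallel}"           # deviation to the parallel key

assert vocabulary_label("viiø7", "I") == "viiø7"    # diatonic: no slash needed
assert vocabulary_label("viio7", "I") == "viio7/i"  # redirected to parallel minor
```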
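If the external-module route sounds reasonable, a minimal sketch of its shape, assuming music21 for the RomanText parsing; the function name and the label-context rule are only illustrative, and a fuller version would also consult the score for metric placement:

```python
from music21 import converter, roman

def promote_cadential_six_fours(rntxt_path: str) -> list[tuple[float, str]]:
    """Parse the model's RomanText output and return (offset, figure)
    pairs with I64 -> Cad64 applied wherever the six-four resolves
    directly to V or V7. Sketch only: a real module would cross-check
    the score (bass note, metric weight) before relabeling."""
    score = converter.parse(rntxt_path, format="romanText")
    rns = list(score.flatten().getElementsByClass(roman.RomanNumeral))
    labels = []
    for n, rn in enumerate(rns):
        fig = rn.figure
        nxt = rns[n + 1].figure if n + 1 < len(rns) else None
        if fig == "I64" and nxt in ("V", "V7"):
            fig = "Cad64"
        labels.append((rn.offset, fig))
    return labels
```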