
incorrect number of voices recognized in choral music

Open PHS-wpg opened this issue 4 years ago • 12 comments

Hi Folks,

I am converting a score into a MusicXML file. The first problem is that the software sees 3 voices and a piano where there are 4 voices and a piano.

This is probably because the Tenors and basses are sharing the bass clef staff, and the notes are sung in unison at the beginning of the piece, and the voices do not split until later.

There is a simple indication in two places of the number of sung parts that might be useful in solving the issue.

  1. Under the title of the song are the words "for S.A.T.B. "
  2. Before the first note is sung, the words SOPRANO ALTO appear at the beginning of the treble clef staff, and the words TENOR BASS appear at the beginning of the bass clef staff.

I noticed that the OCR of words is done after the musical character recognition, which may well be to help the lyrics line up with the notes. However, you may want to run the OCR of words twice: a first pass to look for any indication of the number of voices, for example by searching the text for "SATB", "S.A.T.B." and variations on Soprano Alto Tenor Bass, and a second pass for notations and lyrics.

I am attaching a sample page for you to play with: bridge over troubled waters page 1.pdf (https://github.com/Audiveris/audiveris/files/4290895/bridge.over.troubled.waters.page.1.pdf)
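A minimal sketch of that first text pass, in Python. This is not Audiveris code: the names `implied_parts` and `missing_voice_count` are hypothetical, and the sketch assumes the OCR output is already available as a plain string. It only compares the number of parts implied by the text with the number of voices the recognition found.

```python
import re

# The four standard choral parts, in score order.
PARTS = ["Soprano", "Alto", "Tenor", "Bass"]

# "SATB", "S.A.T.B.", "S A T B": the four letters in order, dots and spaces optional.
SATB_ABBREVIATION = re.compile(r"\bS\.?\s?A\.?\s?T\.?\s?B\b")
# Full part names such as "SOPRANO ALTO" printed at the start of a staff.
PART_NAME = re.compile(r"\b(soprano|alto|tenor|bass)\b", re.IGNORECASE)


def implied_parts(ocr_text):
    """Return the vocal parts that the OCR'ed text suggests, in score order."""
    if SATB_ABBREVIATION.search(ocr_text):
        return list(PARTS)
    named = {m.group(1).capitalize() for m in PART_NAME.finditer(ocr_text)}
    return [part for part in PARTS if part in named]


def missing_voice_count(ocr_text, recognized_voices):
    """How many vocal parts the text implies beyond what was recognized."""
    return max(0, len(implied_parts(ocr_text)) - recognized_voices)
```

With the text from this score, `missing_voice_count("for S.A.T.B. ", 3)` reports one missing voice. The result can only be as good as the OCR text it is given, so it is best treated as a hint to the user rather than as a decision.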

PHS-wpg avatar Mar 05 '20 03:03 PHS-wpg

Is there any way to draw in the stems on the T and B unison? If both up and down stems are present will the current build recognize 2 voices in unison?

Best wishes, RG


Interesting idea, RG. I tried adding the notations with a pencil, but it turned out not to be dark enough; alternatively, I could use a calligraphy pen. I could also fiddle with images of the music in a graphics program, but that kind of defeats the point of MCR.

Are you familiar with the code itself?

PHS-wpg avatar Mar 05 '20 04:03 PHS-wpg

I am not familiar with the code. I am a music teacher and conductor; my limits are Finale, Sibelius and troubleshooting the XML they export.

If we are really striving for OCR, then handwriting is a reasonable fix for anything, as long as the handwritten lines match the contrast and color density of the rest of the score and are within the limits of the expected shapes, right?

Thank you all for the interesting work on this problem. RG



What is a voice?

I'm a poor musician, former guitarist, but when it's time to implement an OMR, you have to make clear decisions and stick to them:

  • Typically, several heads in the same chord (i.e. connected to the same stem) belong to the same voice.
  • Several heads, connected to different stems and thus several chords aligned one under the other, correspond to as many voices as there are chords present.
  • A single head, when connected to 2 stems, one up and the other down, is virtually duplicated into 2 heads of same pitch, belonging to 2 separate chords, and thus voices.
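The three rules above can be sketched in Python as follows. This is an illustration only, not the Audiveris implementation: the `Head` class and the stem identifiers are hypothetical, and heads without any stem (whole notes, for instance) are simply ignored here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Head:
    pitch: str     # e.g. "C4"
    stems: tuple   # identifiers of the stems this head is connected to


def chords_in_slot(heads):
    """Group the heads of one time slot into chords, one chord per stem."""
    chords = {}
    for head in heads:
        # A head connected to 2 stems is duplicated into 2 chords.
        for stem in head.stems:
            chords.setdefault(stem, []).append(head.pitch)
    return chords


def voice_count(heads):
    """One voice per chord present in the time slot."""
    return len(chords_in_slot(heads))
```

For example, `voice_count([Head("C4", ("up", "down"))])` gives 2, while three heads sharing a single stem give 1.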

We could argue that in a staff dedicated to vocals, one can only sing one note at a time, whereas some instruments, like piano or guitar, can produce several notes at the same time, and thus several "voices".

In your example, the tenor/bass staff contains only one voice. The fact that this single voice is sung identically by 2 persons, if not more, is beyond the scope of the OMR software. And it cannot seriously rely on the part name text that is often found on the left side of a part, because OCR'ed texts are not reliable enough.

hbitteur avatar May 12 '20 17:05 hbitteur

I agree, Hervé, although I have the same problem rather often. A good notation design should consistently use different stem directions for the 2 voices in a notation line (as is done in the last measure). Additionally, there are some textual indications like "tutti" or "solo for Altos". All these can only be corrected manually - or by a real AI-machine ;-)).

Bacchushlg avatar May 14 '20 16:05 Bacchushlg

hi hbitteur,

When I refer to a voice I do mean that quite literally: a choir voice.

If you look at a typical choral score, it will have staves for the two hands of the piano and a number of sung parts, which fit the description of Soprano, Alto, Tenor, Bass. Frequently, composers will write a score in which a part contains two or more sub-parts. Basses are often written as Baritones and Basses; the other parts are treated the same way, and are sometimes split into more than two parts.

Traditionally, musicians have made an effort to save paper, which is why the Coda exists, for example. Another odd paper-saving method is to write two parts on the same staff, with tails that point upwards and downwards from a note head of equal duration. Yet another is to place three note heads on one stem, so that three singers or groups of singers each sing one of the three notes, because no one singer can sing three different notes at the same time to make a chord.

I am attaching a few opening bars from a piece. It was written for sopranos, altos and basses to sing; I quickly added the tails for the tenors and the half rest.

As you can see the tail of the note on the Soprano and Alto parts points in the same direction, and represents 2 different notes for different groups of singers.

The way I see it the point of OMR is to transport those paper scores into an electronic form, which suggests that the OMR has to understand the odd traditions used by musicians who were saving paper.

sample of multiple head music

PHS-wpg avatar May 14 '20 23:05 PHS-wpg

Let's think about a compromise: again a new flag, one that enables splitting for a notation line. A user can mark a notation line to force the separation of chords into 2 voices (more will become really complex, although I have some of this kind, too). In such a case the notation should be interpreted consistently as 2 voices over the complete line, meaning that a single note will be duplicated for both voices (including the rests!). It should not be a global flag for a complete score, because the piano tracks are mostly perfectly interpreted now. And sometimes choir scores include piano notation, too.
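A rough sketch of what such a per-line flag could do, in Python. Nothing here is an existing Audiveris option: `split_staff_voices` and its `split` parameter are hypothetical, pitches are given as MIDI numbers so that the higher and lower head can be told apart, and an empty list stands for a rest.

```python
def split_staff_voices(slots, split=False):
    """Return the voices of one notation line.

    slots: one entry per time slot, each a list of MIDI pitches
           (an empty list means a rest).
    split: the hypothetical per-line flag discussed above.
    """
    if not split:
        # Default behaviour: the whole line is a single voice of chords.
        return [list(slots)]

    upper, lower = [], []
    for heads in slots:
        if len(heads) > 2:
            raise ValueError("more than 2 heads per chord is not handled in this sketch")
        if not heads:
            # A rest is duplicated into both voices.
            upper.append(None)
            lower.append(None)
        else:
            # A single note is duplicated; two notes go to upper and lower voice.
            upper.append(max(heads))
            lower.append(min(heads))
    return [upper, lower]
```

A tenor/bass line that starts in unison and divides later, such as `[[48], [48], [], [55, 52]]`, would then yield two voices of equal length, with the unison notes and the rest present in both.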

Bacchushlg avatar May 15 '20 13:05 Bacchushlg

That's an interesting idea. I assume you are referring to a software flag set before the OMR starts to interpret the scan of the music, but do correct me if I am wrong.

I am attaching a few bars from a song that show how complicated the timing can get between 4 parts on two staves.

unequal timing of parts

PHS-wpg avatar May 19 '20 02:05 PHS-wpg

HI All,

I have been using Audiveris quite a lot for the last month and then making adjustments in MuseScore. It's been some months since I was here last, and with more experience of Audiveris, I now realise why it does some of the things it does. One of the realities is that the OCR of the text is less than perfect, and I have an idea on how to drastically change that. Ask me if you're curious.

While I applaud the OMR for its astonishing accuracy and its ability to get it right most of the time, it occurs to me that maybe we are trying too hard to make it do things that it is ill-suited to do, like my original issue of "recognizing the number of vocal parts". Why not start each OMR session with a set of questions that establish the assumptions instead of forcing the computer to work them out?

I suspect any of us using OMR can easily look at a piece of music and read how many unique parts there are for a staff by observing the largest number of note-heads on a stem. For example, we could fill in a box before the OMR is launched that tells Audiveris some vital things about how the piece is written:

e.g.

  • What form of music is this: Orchestral / Choral / etc. (in a drop-down menu)
  • number of Soprano vocal lines = 2
  • number of Alto vocal lines = 1
  • number of Tenor vocal lines = 2
  • number of Bass vocal lines = 1
  • number of hands on the piano part = 2
  • number of accompanying instruments = 1

Once the questions are answered, the OMR does what it does best and puts notes on a staff, and it knows how many staves there are going to be before it starts.
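As an illustration, the answers to such a questionnaire could be held in a small structure like the Python sketch below. The `ScoreHints` class and its field names are hypothetical and do not correspond to anything in Audiveris; the values are the ones from the example above.

```python
from dataclasses import dataclass, field


@dataclass
class ScoreHints:
    """Assumptions entered by the user before recognition starts."""
    form: str                                          # e.g. "Choral" or "Orchestral"
    vocal_lines: dict = field(default_factory=dict)    # part name -> number of lines
    piano_hands: int = 0
    accompanying_instruments: int = 0

    def total_vocal_lines(self):
        return sum(self.vocal_lines.values())

    def mismatches(self, recognized):
        """Parts whose recognized line count differs from the user's answer."""
        return {
            part: (expected, recognized.get(part, 0))
            for part, expected in self.vocal_lines.items()
            if recognized.get(part, 0) != expected
        }


hints = ScoreHints(
    form="Choral",
    vocal_lines={"Soprano": 2, "Alto": 1, "Tenor": 2, "Bass": 1},
    piano_hands=2,
    accompanying_instruments=1,
)
```

After recognition, `hints.mismatches(...)` would list each part as a pair of expected and recognized counts, which gives the user a short checklist of places to look at.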

Would such a feature improve the accuracy of the OMR output ?

PHS-wpg avatar Aug 24 '20 20:08 PHS-wpg

Yes

Best wishes, Rick

Live slow, sail fast.



I had a different idea in the past that ends up in almost the same place: split the transcription into 2 phases:

  • analysis of the score structure
  • transcription

An optional pause between the two could then allow the user to correct the structure, which ultimately gives what you propose above.

After structure analysis a couple of frames could show the detected systems, note lines, lyrics ranges, chords etc. The frames might overlap of course - they just show the ranges where later on these types of elements will be looked for.

Pausing between the steps might be controlled by a flag, or even happen automatically in case the analysis comes to the conclusion that the structure is ambiguous.
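The control flow of this two-phase idea can be sketched in Python as below. It is only an outline under stated assumptions: `Structure`, `run_two_phase` and the `pause` values are hypothetical names, and the analysis, review and transcription steps are passed in as plain functions because their real content is not specified here.

```python
from dataclasses import dataclass


@dataclass
class Structure:
    regions: dict            # e.g. {"systems": 3, "lyrics ranges": 2}
    ambiguous: bool = False  # set by the analysis when it is unsure


def run_two_phase(scan, analyse, review, transcribe, pause="auto"):
    """Analyse the structure, optionally let the user correct it, then transcribe.

    pause: "always" to stop after every analysis, "never" to run straight
           through, "auto" to stop only when the analysis is ambiguous.
    """
    structure = analyse(scan)
    if pause == "always" or (pause == "auto" and structure.ambiguous):
        structure = review(structure)   # the user corrects the detected frames
    return transcribe(scan, structure)
```

With `pause="auto"`, a clean piano score would run straight through, while a choral page whose analysis is marked ambiguous would stop for the user's correction first.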

Bacchushlg avatar Aug 25 '20 06:08 Bacchushlg

I don't know what to do...

hbitteur avatar Aug 07 '21 16:08 hbitteur