
Suggestion: Include granularity

Open MysteryPancake opened this issue 6 years ago • 6 comments

As stated in the README:

A text fragment can have arbitrary granularity: a paragraph, a sentence, a portion of a sentence (i.e., a group of words), a word, or a phoneme (i.e., a single sound).

This information would be useful to include in the table.

While aligners such as Gentle and SPPAS allow phone-level alignment, others such as aeneas can only perform word-level alignment.

Perhaps there could be a column indicating the granularity of each aligner?

MysteryPancake avatar May 23 '19 01:05 MysteryPancake

I think including the maximum audio length and the granularity would be really useful.

ArtemisZGL avatar Jul 19 '19 07:07 ArtemisZGL

Can someone here please start editing the page? It appears that @pettarin has been inactive for one and a half years, and the same goes for aeneas, which he co-owns. [EDIT by AP: removed link to an irrelevant page] His email would be [EDIT by AP: removed email address] and his Twitter is [EDIT by AP: removed twitter username]. If possible, please contact him.

CC to @pfriesch @peteruhrig because this project is in a bad state.

DonaldTsang avatar Jan 22 '20 02:01 DonaldTsang

@DonaldTsang I edited your post above. There is no need to post links that are irrelevant to the issue, even if they point to publicly accessible information.

My reply here is the same one I gave in the issue tracker of the aeneas repository: the PR tab for this repository ( https://github.com/pettarin/forced-alignment-tools/pulls ) has zero open PRs. In the past, I happily merged PRs within a reasonable time ( https://github.com/pettarin/forced-alignment-tools/pulls?q=is%3Apr+is%3Aclosed ). If you want to submit a PR, I will be happy to evaluate it, and to merge it if it is in line with the contents of the repository.

pettarin avatar Jan 22 '20 20:01 pettarin

@MysteryPancake that is indeed a useful suggestion.

pettarin avatar Jan 22 '20 20:01 pettarin

@pettarin the reason I posted the contacts is that your GitHub has been inactive for the last year (at least in the dashboard) and I was concerned about your absence. I hope you can understand.

DonaldTsang avatar Jan 23 '20 05:01 DonaldTsang

Please include the level of granularity or remove aeneas from the list.

In this repo's README, forced alignment is defined as follows:

Given an audio file containing speech, and the corresponding transcript, computing a forced alignment is the process of determining, for each fragment of the transcript, the time interval (in the audio file) containing the spoken text of the fragment.
A text fragment can have arbitrary granularity:

a paragraph,
a sentence,
a portion of a sentence (i.e., a group of words),
a word, or
a phoneme (i.e., a single sound).

However, aeneas does not provide phoneme-level alignment, as stated in this issue: https://github.com/readbeyond/aeneas/issues/199

Q: Does aeneas supports phonetic level alignment? A: In short, no. Long answer here:

I, along with various others, mistakenly thought aeneas was an available option for phoneme-level forced alignment.

arcman7 avatar Oct 01 '20 01:10 arcman7