forced-alignment-tools
Suggestion: Include granularity
As stated in the README:
A text fragment can have arbitrary granularity: a paragraph, a sentence, a portion of a sentence (i.e., a group of words), a word, or a phoneme (i.e., a single sound).
This information would be useful to include in the table.
While aligners such as Gentle and SPPAS allow phone-level alignment, others such as aeneas can only perform word-level alignment.
Perhaps there could be a column indicating the granularity of each aligner?
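To make the suggestion concrete, here is a minimal sketch of how such a column could be modelled and rendered. It covers only the three aligners mentioned above, with values taken from the claims in this comment rather than verified against each tool. The names `Aligner`, `supports`, and `to_markdown` are hypothetical, and the idea that a tool supporting a fine level also supports every coarser level is a simplifying assumption.

```python
from dataclasses import dataclass

# Granularity levels named in the README, ordered from coarsest to finest.
# "phrase" stands in for "a portion of a sentence (a group of words)".
GRANULARITIES = ["paragraph", "sentence", "phrase", "word", "phoneme"]


@dataclass
class Aligner:
    name: str
    finest: str  # finest granularity the tool is reported to support


# Only the tools named in this issue; values reflect the claims made above.
ALIGNERS = [
    Aligner("aeneas", "word"),
    Aligner("Gentle", "phoneme"),
    Aligner("SPPAS", "phoneme"),
]


def supports(aligner: Aligner, level: str) -> bool:
    """Assumption: a tool supports every level at or above its finest one."""
    return GRANULARITIES.index(level) <= GRANULARITIES.index(aligner.finest)


def to_markdown(aligners: list[Aligner]) -> str:
    """Render a two-column Markdown table with a granularity column."""
    lines = ["| Name | Finest granularity |", "| --- | --- |"]
    lines += [f"| {a.name} | {a.finest} |" for a in aligners]
    return "\n".join(lines)


print(to_markdown(ALIGNERS))
```

A single "finest granularity" column keeps the table narrow; one yes/no column per level would be more explicit but much wider.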
I think the maximum length of audio and the granularity would both be really useful.
Can someone here please start editing the page? It appears that @pettarin has been inactive for one and a half years; the same goes for aeneas, which he co-owns. (EDIT by AP: removed link to an irrelevant page.) His email would be (EDIT by AP: removed email address) and his Twitter is (EDIT by AP: removed Twitter username). If possible, please contact him.
CC to @pfriesch @peteruhrig because this project is in a bad state.
@DonaldTsang I edited your post above. There is no need to post links that are irrelevant to the issue, even if they point to publicly accessible information.
I am replying here exactly as I did in the issue tracker of the aeneas repository: the PR tab for this repository ( https://github.com/pettarin/forced-alignment-tools/pulls ) has zero open PRs. In the past, I happily merged PRs within a reasonable time ( https://github.com/pettarin/forced-alignment-tools/pulls?q=is%3Apr+is%3Aclosed ). If you want to submit a PR, I will be happy to evaluate it, and merge it if it is in line with the contents of the repository.
@MysteryPancake that is indeed a useful suggestion.
@pettarin the reason I posted the contacts is that your GitHub has been inactive for the last year (at least in the dashboard) and I was concerned about your absence. I hope you can understand.
Please include the level of granularity or remove aeneas from the list.
In the README of this repo, forced alignment is defined as follows:
Given an audio file containing speech, and the corresponding transcript, computing a forced alignment is the process of determining, for each fragment of the transcript, the time interval (in the audio file) containing the spoken text of the fragment.
A text fragment can have arbitrary granularity:
a paragraph,
a sentence,
a portion of a sentence (i.e., a group of words),
a word, or
a phoneme (i.e., a single sound).
but aeneas does not provide phoneme-level alignment. From the issue linked here: https://github.com/readbeyond/aeneas/issues/199
Q: Does aeneas support phonetic level alignment? A: In short, no. Long answer here:
I, like various others, mistakenly thought aeneas was an available option for phoneme-level forced alignment.
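For readers unsure what the difference looks like in practice, here is a small sketch of the README definition as data: each fragment is a piece of text plus the time interval that contains it. The `Fragment` class and `phonemes_in` helper are hypothetical, the timings are made up for illustration, and the phoneme labels are ARPAbet-style symbols chosen as an assumption. None of this is the output of any particular aligner.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    text: str     # the fragment's text (a word, a phoneme label, ...)
    begin: float  # start of the interval in the audio, in seconds
    end: float    # end of the interval in the audio, in seconds


# Word-level alignment of the transcript "hello world" (made-up timings).
words = [
    Fragment("hello", 0.00, 0.42),
    Fragment("world", 0.42, 0.90),
]

# Phoneme-level alignment of the first word only (made-up timings).
phonemes = [
    Fragment("HH", 0.00, 0.08),
    Fragment("AH", 0.08, 0.20),
    Fragment("L", 0.20, 0.30),
    Fragment("OW", 0.30, 0.42),
]


def phonemes_in(word: Fragment, phones: list[Fragment]) -> list[Fragment]:
    """Return the phoneme fragments whose interval lies inside the word's."""
    return [p for p in phones if word.begin <= p.begin and p.end <= word.end]


for p in phonemes_in(words[0], phonemes):
    print(f"{p.text}: {p.begin:.2f}-{p.end:.2f}")
```

A word-level aligner would stop at the `words` list; only a phoneme-level aligner could produce the finer intervals in `phonemes`.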