tatoeba2
Tatoeba is a platform whose purpose is to create a collaborative and open dataset of sentences and their translations.
## Feature suggestion: Be able to search for "indirectly linked, but not directly linked" into a given language, to make it easier for members who are searching for sentences that...
**To Reproduce** 1. Be logged in as admin. 2. Go to https://tatoeba.org/en/sentences/show/3991877 3. In the audio section, uncheck "Is enabled". 4. Click "Save". 5. Nothing happens. **Expected behavior** The audio...
Hi. Some of the dump files on the Downloads page are incorrectly formatted. The details field in the `user_languages.csv` file, for example, allows tabs and newlines, which should not be...
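To illustrate the kind of breakage described above, here is a minimal sketch of a check that flags rows whose column count is off once the file is split on tabs. It assumes a tab-separated dump with no quoting; the four-column layout, the sample rows, and the function name `find_malformed_rows` are hypothetical and are not taken from the actual `user_languages.csv` export.

```python
def find_malformed_rows(lines, expected_columns):
    """Return (line_number, column_count) for rows with an unexpected width."""
    bad = []
    for number, line in enumerate(lines, start=1):
        # Split on raw tabs only; a tab inside a field inflates the count.
        count = len(line.rstrip("\n").split("\t"))
        if count != expected_columns:
            bad.append((number, count))
    return bad


# Hypothetical sample data: four columns per record are assumed for illustration.
sample_lines = [
    "eng\t5\talice\tnative speaker\n",
    "fra\t3\tbob\tlearned at\tschool\n",  # raw tab inside the last field
    "deu\t2\tcarol\tfirst line\n",        # raw newline cut this record short...
    "second line\n",                      # ...and its remainder looks like a new row
]

print(find_malformed_rows(sample_lines, expected_columns=4))
# [(2, 5), (4, 1)]
```

Note that the third line still has four columns, so a width check alone cannot tell that it was truncated by an embedded newline; only the orphaned fourth line gives it away.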
Steps to reproduce the behavior: 1. Go to https://tatoeba.org/en/api_v0/search?query=%22%E5%AD%A6%E6%A0%A1%22&from=jpn&to=eng&page=1&limit=100 Expected behavior: I expected the number of results returned to be 100, but it was 10. Strangely, the perPage property does...
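As a sketch of what the URL in this report actually asks for, the snippet below decodes its query string with the standard library and compares the requested `limit` with a result count supplied by the caller. It does not contact the API and assumes nothing about the response format; the helper names `search_params` and `limit_mismatch` are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit

URL = (
    "https://tatoeba.org/en/api_v0/search"
    "?query=%22%E5%AD%A6%E6%A0%A1%22&from=jpn&to=eng&page=1&limit=100"
)


def search_params(url):
    """Decode the query string into a dict, keeping the first value per key."""
    query = urlsplit(url).query
    return {key: values[0] for key, values in parse_qs(query).items()}


def limit_mismatch(url, returned_count):
    """True when the number of returned results differs from the requested limit."""
    requested = int(search_params(url)["limit"])
    return requested != returned_count


params = search_params(URL)
print(params["query"], params["from"], params["to"], params["limit"])
# "学校" jpn eng 100

# The report says 10 results came back although 100 were requested.
print(limit_mismatch(URL, 10))
# True
```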
Note that this search for sentences in Berber with audio files shows a lot of Kabyle sentences. https://tatoeba.org/en/sentences/search?orphans=any&sort=created&has_audio=yes&from=ber (1,000 results out of 4,659 occurrences) This search for Kabyle sentences with...
**Story** When I am working on a slightly complex advanced search and I want to tweak the search string because I don't like the results I'm getting, I often change...
CK reported that it sometimes took several minutes (up to 19 minutes) to import audio (see https://github.com/Tatoeba/tatoeba2/issues/2955#issuecomment-1163012139). I noticed that when we run a worker, even if the `runworker` command...
From [Circular 33 of the United States Copyright Office](https://www.copyright.gov/circs/circ33.pdf): > Words and short phrases, such as names, titles, and slogans, are uncopyrightable because they contain an insufficient amount of authorship....
Yet another import suggestion, woo! (see: #1762, #2256, #2637, #2786) [GlobalVoices](https://globalvoices.org/) is a multilingual news site with all articles published under a free license. An indiscriminate import of every single...
The Common Voice project uses CC0 sentences from various sources. There's a dump of them on GitHub: https://github.com/common-voice/common-voice/tree/main/server/data Unfortunately, most of the `sentence-collector.txt` sentences (which is the main attraction) are...