
Clarify license and copyright of dictionaries

Open skangas opened this issue 2 years ago • 8 comments

Hi,

What is the copyright and license on the various dictionary files? I couldn't find any documentation about that in the repository.

Thanks!

skangas avatar Jan 07 '22 03:01 skangas

I'm not sure. I have only added one dictionary myself, the English one. I think I found it online and later modified it a little. Should it count as my own creation? I'm not sure I can trace where I originally took it from. Dictionaries for other languages have been added by other contributors:

  • German by @erdmenger
  • French by @ramnes
  • Russian by @aadcg

Hopefully they can comment on their origins.

mrkkrp avatar Jan 07 '22 14:01 mrkkrp

When I search for "1000 most common English words" now, I get various pages, but the order of words there differs from the dictionary in this repository (they are supposed to be ordered by frequency of use). This makes it hard to identify where I took the words from.

mrkkrp avatar Jan 07 '22 14:01 mrkkrp

For the French dictionary, IIRC I just dumped the words in use in a 10fastfingers test.

ramnes avatar Jan 07 '22 15:01 ramnes

Honestly, I don't remember and I can't find results that match the current order.

Some references:

Open Russian mentions that the data is licensed under CC BY-SA.

EDIT: I found the original source. I can't find information about the license.

aadcg avatar Jan 07 '22 20:01 aadcg

Hello there!

I enjoy this piece of software so much.

But reading this issue, I also have concerns about contributing dictionaries. My mother tongue is Spanish, and although the RAE already provides a list of the most used forms in this language, its license is not clear to me, as I am not a lawyer.

So, just for fun, I am trying to write a package to extract the most used words from a piece of text. If that text is already in the public domain, we should not have any trouble.

Let me know if this makes sense in this context.

Regards!

texaco avatar Aug 13 '22 10:08 texaco

And here you are: https://codeberg.org/mtex/words-list-generator

Since those dictionaries are not tied exclusively to a language but to a piece of literature in a given language, I have come up with a file name like "dictionary.book-name.language.txt".

To generate a dictionary, you can use:

emacs -Q --batch --load=/path/to/words-list-generator/words-list-generator.el \
         --eval='(words-list-generator-make-dictionary "/path/to/ebook.txt" "language")'
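
For readers who want to see the general idea without Emacs, here is a minimal Python sketch of frequency-based word extraction. It is not the code of words-list-generator; the function name `most_common_words`, the `limit` parameter, and the tokenization rule are assumptions made only for illustration.

```python
import re
from collections import Counter


def most_common_words(text: str, limit: int = 1000) -> list[str]:
    """Return up to `limit` distinct words from `text`, most frequent first.

    Hypothetical helper for illustration; not part of words-list-generator.
    """
    # [^\W\d_]+ matches runs of Unicode letters only, so accented words
    # such as "corazón" stay intact and digits/punctuation are dropped.
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(words)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:limit]]
```

Reading a public-domain text file, passing its contents to this function, and writing the result one word per line would give a frequency-ordered list; whether that matches the exact format the project expects is an assumption worth checking against the existing dictionaries.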

Please let me know if a "dictionary.quijote.es" would be welcome in this project!

texaco avatar Aug 19 '22 11:08 texaco

Hey, you are most welcome to open a PR!

mrkkrp avatar Aug 19 '22 11:08 mrkkrp