VocableTrainer-Android

Feature request: prevent duplicate entries

Open daufinsyd opened this issue 8 years ago • 6 comments

Hi :) Could you add an option to prevent or search for duplicate entries?

daufinsyd avatar Oct 22 '17 15:10 daufinsyd

Hi, de-duplication of text is tricky once you go beyond binary-equal values. For example, case-insensitive detection can be a nightmare with non-Latin characters. (Language-specific code or a collation has to be selected by the user; ä, ü, ö each have at least two binary representations in Unicode.) What exactly were you looking for: de-duplication on import or in the editor itself?
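
To illustrate the point about binary representations, here is a minimal sketch using the standard `java.text.Normalizer` and `java.text.Collator` classes. The class and method names (`DuplicateKey`, `sameText`, `sameIgnoringCase`) are hypothetical and not taken from the app's code; the locale is assumed to be supplied by the user.

```java
import java.text.Collator;
import java.text.Normalizer;
import java.util.Locale;

/** Hypothetical helper for illustration only; not part of VocableTrainer-Android. */
final class DuplicateKey {
    private DuplicateKey() {}

    /** Trims and applies canonical composition (NFC), so "a" + U+0308 becomes the precomposed U+00E4. */
    static String normalize(String s) {
        return Normalizer.normalize(s.trim(), Normalizer.Form.NFC);
    }

    /** Binary comparison after normalization; still case-sensitive. */
    static boolean sameText(String a, String b) {
        return normalize(a).equals(normalize(b));
    }

    /** Locale-aware comparison that ignores case; the caller has to pick the locale. */
    static boolean sameIgnoringCase(String a, String b, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(Collator.SECONDARY); // case differences are ignored at this strength
        collator.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
        return collator.compare(a, b) == 0;
    }
}
```

Without the normalization step, `"\u00E4".equals("a\u0308")` is false even though both strings render as ä.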

0xpr03 avatar Oct 23 '17 16:10 0xpr03

De-duplication in the editor. When I add a word, I have to check whether it already exists in a list. I haven't checked the code, but if words are stored as Strings, isn't it feasible to use Java's String.compareTo method to achieve this?

daufinsyd avatar Oct 23 '17 17:10 daufinsyd

I wouldn't check this via Java's comparator, as this will get really slow, but yeah, that's the easiest way to do it.
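
Comparing a new word against every stored word costs one comparison per existing entry. One possible alternative, sketched here under the assumption that the words of a list fit in memory, is a hash set with average constant-time lookups. `SeenWords` is a hypothetical name, not a class from the app.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory index of words already present in a list. */
final class SeenWords {
    private final Set<String> seen = new HashSet<>();

    /** Returns true if the word was new, false if an equal word was already recorded. */
    boolean add(String word) {
        return seen.add(word);
    }

    /** Checks for an exact (binary-equal) match without scanning all entries. */
    boolean contains(String word) {
        return seen.contains(word);
    }
}
```

This only detects binary-equal strings, so any normalization would have to happen before calling `add`.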

0xpr03 avatar Oct 23 '17 18:10 0xpr03

I've thought about some GUI changes and DB functions to implement this, but I haven't settled on a good design. There are several reasons: depending on your vocable set, you may have multiple entries with the same column A entry but different column B entries (or vice versa). (We could start discussing a 1:N or N:N database relation between vocables as a better structure than 1:1, but this would lead to CSV incompatibilities for simple sets.)

A | B
Fahrstuhl | lift
Fahrstuhl | elevator
Fahrstuhl | lift cage

For these people, you want duplicate searching for exact A:B matches. Others may want to know when the same column A or column B entry already exists, because they have a unique A and a unique B key (or only A). The complexity gets pretty high. So you already end up with three options: check A and B together, check for duplicates in A only, or check for duplicates in B only. The next problem: how do you tell the user that there are duplicates? After they have typed them in? As a list of search results directly under the input field? While they are typing or afterwards? Also, searching the complete set is pretty CPU-intensive, at least when it's done while typing.
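
The three options could be sketched roughly as follows. This is an illustration only: `DuplicateFinder`, `Entry` and `Mode` are hypothetical names, entries are assumed to be held in a plain in-memory list, and matching is binary-equal.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical duplicate search over a vocable list with two columns, A and B. */
final class DuplicateFinder {
    enum Mode { A_AND_B, A_ONLY, B_ONLY }

    static final class Entry {
        final String a;
        final String b;

        Entry(String a, String b) {
            this.a = a;
            this.b = b;
        }
    }

    private DuplicateFinder() {}

    /** Returns all existing entries that count as duplicates of the candidate under the given mode. */
    static List<Entry> findDuplicates(List<Entry> entries, Entry candidate, Mode mode) {
        List<Entry> hits = new ArrayList<>();
        for (Entry e : entries) {
            boolean sameA = e.a.equals(candidate.a);
            boolean sameB = e.b.equals(candidate.b);
            boolean match;
            switch (mode) {
                case A_AND_B: match = sameA && sameB; break;
                case A_ONLY:  match = sameA; break;
                default:      match = sameB; break; // B_ONLY
            }
            if (match) {
                hits.add(e);
            }
        }
        return hits;
    }
}
```

With the Fahrstuhl table above, a candidate `Fahrstuhl | lift` matches one entry in `A_AND_B` mode and all three in `A_ONLY` mode. The linear scan here is the CPU-intensive part when run on every keystroke.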

TL;DR: There are many options, and I could just pick one of them and ignore everything else, but I'm not the end user here. So I want some feedback on this before I change the DB again and make it backwards incompatible.

0xpr03 avatar Nov 18 '17 21:11 0xpr03

Indeed it's tricky. Currently when I read a new word, I search for the translation in my language and write it with all the possible translations.

Krankenversicherung | assurance maladie

Here the most important thing (to me) is not to write the German term twice (if I don't remember whether I already wrote it). But if there are many translations, I write them as follows:

sanft | léger, doux

Of course it would be perfect (for the end user) if it was like pons.de : each term is linked with many other terms :

sanft > léger, doux
doux > sanft, weich, süß

but certainly hard to implement.
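
For illustration, such a many-to-many linking could look roughly like this in memory. This is only a sketch under assumptions: `TermLinks` is a hypothetical name, links are treated as symmetric, and nothing here reflects how the app's database is actually structured.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical N:N structure: every term maps to the set of terms it is linked with. */
final class TermLinks {
    private final Map<String, Set<String>> links = new LinkedHashMap<>();

    /** Links two terms in both directions; linking the same pair twice has no effect. */
    void link(String left, String right) {
        links.computeIfAbsent(left, k -> new LinkedHashSet<>()).add(right);
        links.computeIfAbsent(right, k -> new LinkedHashSet<>()).add(left);
    }

    /** Returns a read-only view of the terms linked to the given term (empty if unknown). */
    Set<String> translationsOf(String term) {
        return Collections.unmodifiableSet(links.getOrDefault(term, Collections.emptySet()));
    }
}
```

Because each term is a key of its own, the left word can never be stored twice; a simple two-column CSV no longer maps onto this structure directly.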

Let's wait for other feedback; for me, a warning telling the user that the left word already exists in the list (whatever is in the right column) would be a huge improvement :-)

daufinsyd avatar Nov 20 '17 13:11 daufinsyd

Of course it would be perfect (for the end user) if it was like pons.de : each term is linked with many other terms

This is the N:N / 1:N relation problem I talked about: possible but kind of overkill, and it leaves problems with CSV import/export etc.

But tbh I'm already thinking about ways to rewrite all the crucial parts to support this, though it is really not a priority.

Let's wait for other feedback; for me, a warning telling the user that the left word already exists in the list (whatever is in the right column) would be a huge improvement :-)

VA is CSV import/export compatible on purpose: you can write your entries in LibreOffice/Excel and import them. That way you have the full search functionality of your office program and much more computing power, hence better response times.

0xpr03 avatar Nov 20 '17 13:11 0xpr03