Add function to remove duplicates from a list of values

Open lognaturel opened this issue 8 years ago • 4 comments

De-duplicating values can be helpful to do things like generating random numbers without duplicates as described in this mailing list thread.

SurveyCTO has shared implementations for several useful functions here including de-duplicate() which takes two strings as parameters: a delimiter and a string to de-duplicate. For example, passing in "," and "value1,value2,value1" gives "value1,value2" as its result.
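As a rough illustration of the behaviour described above, here is a minimal Python sketch. The name `de_duplicate` and the choice to keep the first occurrence of each value in its original order are assumptions made for this example; it is not SurveyCTO's implementation.

```python
def de_duplicate(delimiter: str, values: str) -> str:
    """Return `values` with repeated items removed.

    Assumption for this sketch: the first occurrence of each item is kept
    and the original order is preserved.
    """
    items = values.split(delimiter)
    # dict.fromkeys keeps insertion order and drops later repeats
    unique_items = list(dict.fromkeys(items))
    return delimiter.join(unique_items)


print(de_duplicate(",", "value1,value2,value1"))
```

With the inputs from the example in the issue, this prints `value1,value2`.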

XPath 2.0 defines a function called distinct-values which operates on a sequence.

It does not appear that Dimagi-xforms includes this functionality.

lognaturel, Dec 12 '16 13:12

I'd be in favor of adding distinct-values() as described in XPath 2.0 spec. Seems useful for various randomization tricks. 👍

MartijnR, Dec 12 '16 17:12

@lognaturel do you still sense there is a demand for it? If so, let's just add this XPath 2.0 function. If not, let's close (I have not had a demand for it yet).

MartijnR, Feb 19 '18 21:02

I'm (very) late to the party, but I was looking for something like this for quick validation that a data collector didn't put the same information in multiple times in a repeat.

A concrete example: we are collecting blood vials that are barcoded with unique identifiers. For each vial, the barcode is scanned. To ensure that the user didn't scan the same vial multiple times, it would be nice to check something like "count(de-duplicate(vials)) != count(vials)". If that expression is true, we know that the user accidentally scanned the same sample more than once.
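The check in this example can be sketched in Python as follows. The function name `has_duplicate_scans` and the sample barcodes are hypothetical; the point is only that the de-duplicated count differs from the raw count exactly when some value repeats.

```python
def has_duplicate_scans(barcodes: list[str]) -> bool:
    """Return True if any barcode appears more than once.

    Mirrors the proposed check: compare the number of distinct values
    with the total number of values.
    """
    return len(set(barcodes)) != len(barcodes)


# Hypothetical scans from a repeat group
print(has_duplicate_scans(["V001", "V002", "V003"]))  # no repeated vial
print(has_duplicate_scans(["V001", "V002", "V001"]))  # V001 scanned twice
```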

I've seen this method of detecting duplicates, but it's fairly complicated and this seems like a much simpler solution.

jniles, Jun 19 '19 09:06

A function that removes duplicates from a string would be very useful, particularly for selecting more than one random number from a set without replacement. This could be used, for example, to sample two or more household members from a household of 8. SurveyCTO's de-duplicate() function is described here and here.
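To make the sampling idea concrete, here is a small Python sketch of the general approach: draw random numbers with replacement, drop repeats, and stop once enough distinct values remain. The function name `sample_members` and its parameters are assumptions for illustration, and a form would express this differently; in Python itself, `random.sample` already does this directly.

```python
import random


def sample_members(household_size: int, k: int, rng: random.Random) -> list[int]:
    """Pick k distinct member positions from 1..household_size.

    Sketch of "draw with replacement, then de-duplicate": repeated draws
    are discarded until k distinct values have been collected.
    """
    if k > household_size:
        raise ValueError("cannot sample more members than the household has")
    draws: list[int] = []
    # Keep drawing until the de-duplicated list holds k distinct values
    while len(dict.fromkeys(draws)) < k:
        draws.append(rng.randint(1, household_size))
    # De-duplicate, keeping first occurrences in draw order
    return list(dict.fromkeys(draws))


# Sample 2 members from a household of 8
print(sample_members(8, 2, random.Random()))
```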

@jniles do you think the method described here could work in the same way as a de-duplicate() function?

SLGiHub, Jun 23 '22 20:06