Indexing with an array of booleans

Open dvd101x opened this issue 2 years ago • 5 comments

It would be nice to be able to index with an array of booleans. I'm including some examples and reasoning.

a = [4, 5, 6]
a[[true, false, true]] # I would like for it to return [4, 6]

From this I get IndexError: Index out of range (0 < 1)

This behavior is available, and useful, in other languages and libraries such as Matlab/Octave and Python-Numpy.

Example in Matlab/Octave

% Let's say I want to filter only even numbers
a = [4, 5, 6]
even = mod(a,2)==0	% returns [1, 0, 1]
a(even)			% returns [4, 6]

Example in Python-Numpy

a = np.array([4, 5, 6])
even = a%2 == 0		# returns array([ True, False,  True])
a[even]			# returns array([4, 6])

Desired example in MathJS

a = [4, 5, 6]
even = a%2 == 0		# returns [true, false, true]
a[even] 		# desired output [4, 6]

I understand this might not be an easy task, and there is also a filter function for similar cases. Nonetheless, I bring this topic up for consideration, as I think it could extend the capabilities of this library even further.

dvd101x avatar Jul 26 '22 19:07 dvd101x

Thanks, that is indeed a useful way to allow you to filter array items.

I think it conflicts with the current behavior of a[indices], which uses indices to pick the values at the specified positions rather than treating them as a mask that decides, for every item in the array, whether to keep it. I'm not sure how we could support both with this same syntax. Maybe we can come up with a different syntax, something like:

a = [4, 5, 6]
even = a%2 == 0		# returns [true, false, true]
filter(a, even)		# desired output [4, 6]
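
The proposed filter(a, even) semantics can be sketched in plain Python rather than in the mathjs expression language. This is only an illustration under stated assumptions: filter_by_mask is a hypothetical helper name, not a mathjs function, and the length check is an assumed design choice.

```python
def filter_by_mask(values, mask):
    """Keep the items of `values` whose matching entry in `mask` is true.

    Hypothetical helper illustrating the proposal; not part of mathjs.
    """
    if len(values) != len(mask):
        # Assumed rule: the mask must describe every item exactly once.
        raise ValueError("values and mask must have the same length")
    return [value for value, keep in zip(values, mask) if keep]


a = [4, 5, 6]
even = [x % 2 == 0 for x in a]    # [True, False, True]
result = filter_by_mask(a, even)  # [4, 6]
print(result)
```

Rejecting a mask of the wrong length keeps the filter from silently dropping trailing items.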

josdejong avatar Jul 27 '22 13:07 josdejong

Hi Jos!

Thank you for looking into this! Yes, filter works very well. I'm pleasantly surprised by the capabilities of this library.

Regarding the use of filter, I did some tests by defining functions and using inline functions. I was surprised that it can take any undefined variable as the input for the inline function. filter([1, 2, 3, 4], myVar>2) #returns [3, 4]

In my (limited) experience, it is more common for filter to take a function as an argument (here is an example in Pandas-filter), and more common for indexing to be used with an array of booleans (here is also an example in Pandas-subset).

So I think indexing might benefit from interpreting an array of booleans in some very particular cases, such as assigning values.

Let's say that we now want to add 0.1 to the even numbers. We could do it by indexing, like so, if we already knew the indices of the even numbers:

a = [4, 5, 6]
indexOfEven = [1, 3]
a[indexOfEven] = a[indexOfEven] + 0.1
a						# returns [4.1, 5, 6.1]

Or we could get that index of even numbers with filter

a = [4, 5, 6]
indexOfEven = filter(1:size(a)[1], a[x]%2 == 0)	# returns [1, 3]
a[indexOfEven] = a[indexOfEven]+0.1
a 						# returns [4.1, 5, 6.1]

If it were possible to index with an array of booleans (of the same size), then we could write

a = [4, 5, 6]
a[a%2 == 0] = a[a%2 == 0] + 0.1
a 						# desired return [4.1, 5, 6.1]
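
The masked assignment above can be sketched in plain Python as a rough illustration of the desired behavior. The name add_where is hypothetical and not part of mathjs, and the sketch returns a new list instead of assigning in place.

```python
def add_where(values, mask, delta):
    """Return a copy of `values` with `delta` added wherever `mask` is true.

    Hypothetical helper; it only mirrors a[mask] = a[mask] + delta.
    """
    if len(values) != len(mask):
        raise ValueError("values and mask must have the same length")
    return [value + delta if keep else value
            for value, keep in zip(values, mask)]


a = [4, 5, 6]
even = [x % 2 == 0 for x in a]     # [True, False, True]
updated = add_where(a, even, 0.1)  # [4.1, 5, 6.1]
print(updated)
```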

Similar to Octave (the += operator is not available in Matlab)

a = [4, 5, 6]
a(mod(a,2)==0) += 0.1			% returns [4.1, 5, 6.1]

I found a few examples

I understand that indexing is already very powerful, as each index can be an integer, a vector of integers, or a range (including the keyword end for the maximum index possible).

So if the index is a vector of the same size and it contains only booleans, then it could apply a conversion to make [true, false, true, true] into [1, 3, 4] and the rest would be the same as it is.
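
That conversion rule can be sketched in plain Python under the assumptions stated in this comment: the index is treated as a mask only when it has the same length as the array and holds nothing but booleans. The name resolve_index is hypothetical and does not reflect how mathjs resolves indices internally.

```python
def resolve_index(index, length):
    """Turn a boolean mask into one-based positions; pass other indices through.

    Hypothetical helper; one-based positions follow the mathjs expression syntax.
    """
    is_mask = len(index) == length and all(isinstance(i, bool) for i in index)
    if not is_mask:
        return list(index)  # already a list of positions, leave unchanged
    return [pos for pos, keep in enumerate(index, start=1) if keep]


print(resolve_index([True, False, True, True], 4))  # [1, 3, 4]
print(resolve_index([1, 3], 3))                     # [1, 3]
```

Once the mask is converted, the existing index handling could stay as it is, which is the point made above.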

So we could write something like

mass = [12, 5, 15, 20]
# Get all the masses that are greater than 10
mass[mass>10] # expected return [12, 15, 20]

force = [5, 10, 20, 15]
# Get all the forces where the masses are less than 15
force[mass<15] # expected return [5, 10]

dvd101x avatar Jul 27 '22 16:07 dvd101x

Thanks for your inputs. The notation filter([1, 2, 3, 4], myVar>2) is just a shorthand for passing an actual function as second argument, like filter([1, 2, 3, 4], f(myVar) = myVar>2).

Interesting idea to extend the index to support functions like force[mass<15]. It feels to me like eye-candy for doing filter(force, mass<15), so I'm not sure it really adds value. It looks nice though :). I would at least love to extend the filter function to support a matrix as second argument, like filter(a, even). I still have to give this a bit more thought.

josdejong avatar Jul 30 '22 09:07 josdejong

Thank you for considering this,

Adding this capability to do filter(force, mass<15) sounds very cool too.

As a summary I found references on Octave, Numpy and Pandas to do stuff similar to force[mass<15] and filter(force, mass<15).

I just found a nice explanation on Numpy, so here it is. Numpy Filter Array

Again thanks for considering this

dvd101x avatar Jul 30 '22 15:07 dvd101x

Seems as though the exact desired API is not quite settled here yet, although it appears that at least allowing filter to take a collection of booleans as its second argument is "officially sanctioned" if someone wants to implement it. Hence I will add the design decision label as well, at least until it's ironed out what else if anything might be implemented in response to this issue.

gwhitney avatar Jul 31 '22 04:07 gwhitney