Extract 2d ndarray from arbitrarily dimensional ndarray using index arrays


This Content is from Stack Overflow. Question asked by deemel

I want to extract parts of a NumPy ndarray based on arrays of index positions for some of the dimensions. Let me show this with an example.

Example data

dummy = np.random.rand(5,2,100)
X = np.array([[0,1],[4,1],[2,0]])

dummy is the original ndarray with shape 5x2x100. The number of dimensions is arbitrary; it could just as well be 5x2x4x100.
X is a matrix of index values: X[:,0] holds the indices into the first dimension of dummy, X[:,1] those into the second dimension. The number of columns in X is always the number of dimensions in dummy minus 1.

Example output

I want to extract an ndarray of the following form for this example:

np.array([dummy[0,1,:], dummy[4,1,:], dummy[2,0,:]])

If the number of dimensions in dummy were fixed, this could be done with dummy[X[:,0],X[:,1],:]. Sadly, the dimensionality can vary, e.g. dummy could be a 5x2x4x6x100 ndarray and X would then correspondingly be 3x4. My attempts at handling this have not yielded the desired result.

  • dummy[X,:] yields a 3x2x2x100 ndarray for this example, the same as dummy[X]
  • Iteratively reducing dummy with something like dummy = dummy[X[:,i],:], with i iterating over the columns of X, also does not reduce the ndarray in the example past 3x2x100
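
For reference, the shapes reported in these two attempts can be reproduced with a short sketch. The names by_X and reduced are introduced here for illustration only.

```python
import numpy as np

dummy = np.random.rand(5, 2, 100)
X = np.array([[0, 1], [4, 1], [2, 0]])

# Attempt 1: X as a single index array only indexes the first axis,
# so the result has shape X.shape + dummy.shape[1:].
by_X = dummy[X]
print(by_X.shape)         # (3, 2, 2, 100)
print(dummy[X, :].shape)  # (3, 2, 2, 100)

# Attempt 2: each step indexes the first axis again, never the second.
reduced = dummy
for i in range(X.shape[1]):
    reduced = reduced[X[:, i], :]
print(reduced.shape)      # (3, 2, 100)
```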

I have a feeling that this should be pretty simple with numpy indexing, but I guess my search for a solution was missing the right terms for this.
Does anyone have a solution to this?


I will try to explain @Michael Szczesny's answer.

First, notice that if you have an np.array with n dimensions and pass m indices where m<n, it is the same as using : for the remaining dimensions. In your case, for example:

dummy[(0, 0)] == dummy[0, 0, :]
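
A minimal sketch to check this, reusing the shape of dummy from the question. It uses np.array_equal rather than ==, because == compares elementwise and returns a boolean array instead of a single True or False. The names partial and explicit are illustrative.

```python
import numpy as np

dummy = np.random.rand(5, 2, 100)

# Two indices for a 3-dimensional array: the last axis is kept whole.
partial = dummy[(0, 0)]
explicit = dummy[0, 0, :]

print(partial.shape)                      # (100,)
print(np.array_equal(partial, explicit))  # True
```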

Given that, note that you can also pass an array (or list) of indices for each dimension. Thus:

dummy[([0, 1], [0, 0])]

It would be the same as:

np.array([dummy[(0,0)], dummy[(1,0)]])

You can validate that using:

dummy[([0, 1], [0, 0])] == np.array([dummy[(0,0)], dummy[(1,0)]])
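
Since that comparison returns an elementwise boolean array, a single yes/no answer needs one more step. A small sketch, assuming the same dummy shape as in the question (fancy and stacked are illustrative names):

```python
import numpy as np

dummy = np.random.rand(5, 2, 100)

# One index list per leading axis: picks dummy[0, 0] and dummy[1, 0].
fancy = dummy[([0, 1], [0, 0])]
stacked = np.array([dummy[(0, 0)], dummy[(1, 0)]])

print(fancy.shape)                     # (2, 100)
print(np.array_equal(fancy, stacked))  # True
```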

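Finally, notice what (*X.T,) evaluates to:

(*X.T,)
# (array([0, 4, 2]), array([1, 1, 0]))

Here you get the indices for each dimension as a separate array, so indexing with this tuple gives:

dummy[(*X.T,)]

Which is the same as:

np.array([dummy[(0,1)], dummy[(4,1)], dummy[(2,0)]])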

Edit: Instead of (*X.T,), you can use tuple(X.T), which to me makes more sense.
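
Putting it together, here is a sketch of how tuple(X.T) handles a varying number of dimensions. The helper take_leading and the 4-dimensional example data (dummy4, X4) are hypothetical and not part of the original question.

```python
import numpy as np

def take_leading(arr, idx):
    """Hypothetical helper: index the leading axes of arr with the rows of idx."""
    idx = np.asarray(idx)
    # idx.T has one row per indexed axis; the tuple passes one index array per axis.
    return arr[tuple(idx.T)]

# The 3-dimensional example from the question.
dummy3 = np.random.rand(5, 2, 100)
X3 = np.array([[0, 1], [4, 1], [2, 0]])
out3 = take_leading(dummy3, X3)

# A made-up 4-dimensional case: X4 now needs three columns.
dummy4 = np.random.rand(5, 2, 4, 100)
X4 = np.array([[0, 1, 3], [4, 1, 0], [2, 0, 2]])
out4 = take_leading(dummy4, X4)

print(out3.shape)  # (3, 100)
print(out4.shape)  # (3, 100)
```

The result always has one row per row of X, followed by whatever trailing axes of the array were not indexed.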

This question was asked on Stack Overflow by deemel and answered by Bruno Mello. It is licensed under the terms of CC BY-SA 2.5, CC BY-SA 3.0, CC BY-SA 4.0.
