Dissecting the NumPy Indexing Mechanism

March 3, 2019 - 9 minute read - Deep learning, Tensor


People typically reason about arrays with the intuition gained from linear algebra, where arrays of numbers are treated as matrix-like objects. However, in this deep learning era, that intuition does not generalize easily to higher dimensions (4 or more), nor does it explain how the numbers are stored. We therefore introduce a graphical notation based on the hierarchy of array elements that helps to:

  • Understand how array elements are stored
  • Explain the three operations, i.e., aggregation, reshape and stack, in NumPy
  • Solve some practical problems

This post is organized as follows: 1) the second section discusses the graphical representation and the indexing mechanism; 2) the third and fourth sections explain the aggregation operation and the reshape and stack operations, respectively; 3) the final section gives an example showing the usefulness of the representation.

A Novel Graphical Representation

In this section, we will build up a graphical notation for NumPy arrays from simple cases.

Fig1: Graphical Representations of Arrays

Let’s begin with some simple cases. The figure above gives examples of the graphical representation. We want to highlight some points:

  • array(1) is a single number and array([1]) is an array. A single number cannot be indexed, whereas an array is indexable: it combines objects and assigns an index to each of them.
  • The objects at each position of an array are called elements. An element can be a single number, as in array([1]) and array([1,2]). An element can also be an array, as in array([[1]]). High-dimensional arrays are built upon this recursive definition.
  • The dimension of an array equals the depth of the recursion. For example, array(1) has 0 dimensions and array([[1]]) is 2-dimensional.
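
The points above can be checked directly in NumPy. The following is a minimal sketch; the variable names are chosen here for illustration only.

```python
import numpy as np

scalar = np.array(1)      # a single number: 0 dimensions
vector = np.array([1])    # an array holding one number: 1 dimension
nested = np.array([[1]])  # an array whose element is an array: 2 dimensions

print(scalar.ndim, vector.ndim, nested.ndim)  # 0 1 2

# An array assigns indices to its elements, so it can be indexed.
print(vector[0])  # 1
print(nested[0])  # [1]  (the element is itself a 1D array)

# A 0-dimensional array has no positions to index into.
try:
    scalar[0]
    scalar_indexable = True
except IndexError:
    scalar_indexable = False
print(scalar_indexable)  # False
```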

Fig2: Simplified Notation

We can simplify the notation. As shown above, we use a box to denote a number and write its index in the bottom-right corner.

Fig3: Simplification Examples 1

The hierarchy, or dimension, is illustrated by the number of square borders and by the indices. The 1-dimensional array([0]) has a single border, while the 2D array([[1]]) has two borders, and the index of its element becomes a 2-tuple. With this simplified representation, it is easier to draw some high-dimensional arrays, e.g., array([[[1]],[[2]]]).
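
As a rough illustration of how each level of nesting contributes one entry to the index tuple, the sketch below lists the index of every number in array([[[1]],[[2]]]). The helper name index_to_value is introduced here only for this example.

```python
import numpy as np

a = np.array([[[1]], [[2]]])
print(a.ndim, a.shape)  # 3 (2, 1, 1)

# Each number sits behind one index per level of nesting.
index_to_value = {idx: int(a[idx]) for idx in np.ndindex(a.shape)}
print(index_to_value)  # {(0, 0, 0): 1, (1, 0, 0): 2}
```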

Fig4: Simplification Examples 2

For arrays whose elements contain more than one sub-element, we can simplify the graph accordingly, e.g., the graphs for array([1,2]) and array([[1,2],[3,4]]) shown above. In the array([[1,2],[3,4]]) case, we arrange the elements like a 2D matrix for easier understanding, but we can also lay the elements out in one row, just as in the array([1,2]) case.
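
A small sketch of the same idea in code: the outer array of array([[1,2],[3,4]]) holds two 1D arrays, and laying all numbers out in one row corresponds to what ravel() returns for this array.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

# The outer array has two elements, each a 1D array.
print(a[0])  # [1 2]
print(a[1])  # [3 4]

# Indexing level by level matches the 2-tuple index.
print(a[1][0] == a[1, 0])  # True

# Laying all numbers out in one row, outer elements first.
row = a.ravel()
print(row)  # [1 2 3 4]
```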

The Aggregation Operation

In this section, we explain the aggregation operation in NumPy. Aggregation operations, like the sum function, produce a summary of the input array. For high-dimensional arrays, however, the aggregation can be performed at various levels, i.e., along different axes in NumPy. This axis system can be confusing under the traditional intuition, but it can be explained clearly with the proposed notation.

We first build the intuition with 1D and 2D arrays and then apply it to 3D arrays.

Aggregation in 1D and 2D Arrays

Let’s begin with the simple cases of 1D and 2D arrays. We will use the sum function as a representative of the various aggregation functions.

Fig5: Aggregation for 1D array

A 1D array has only one axis, so the aggregation can only be performed in one direction, i.e., along the 0 axis in NumPy. As shown in the graph above, the 0-axis aggregation decomposes the array and adds the elements together. Note: the output of np.sum(array([1,2]), axis=0) is a single number rather than an array, because each element is a single number.
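
The 1D case can be verified with a short sketch; np.ndim is used here only to confirm that the result carries no array dimension.

```python
import numpy as np

a = np.array([1, 2])
total = np.sum(a, axis=0)

print(total)           # 3
print(np.ndim(total))  # 0: a single number, not an array
print(total == a[0] + a[1])  # True
```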

Fig6: Aggregation for 2D array, axis=0

Let’s move to 2D arrays. Similarly, the 0-axis aggregation breaks down the outer array and adds its elements together. The addition here is element-wise vector addition, i.e., elements in corresponding positions are added. The result, in this case, is a 1D array.
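
A minimal check of this description, comparing np.sum along axis 0 with adding the two inner arrays by hand:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

by_axis = np.sum(a, axis=0)
by_hand = a[0] + a[1]  # element-wise addition of the two inner arrays

print(by_axis)  # [4 6]
print(np.array_equal(by_axis, by_hand))  # True
```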

Fig7: Aggregation for 2D array, axis=1

If we perform the aggregation in the second direction, i.e., along the 1 axis, we notice something neat. The operation performs 0-axis aggregation on every element of the outer array and combines the results. For example, calculating np.sum(array([[1,2],[3,4]]), axis=1) is equivalent to computing

    np.sum(array([1,2]), axis=0),
    np.sum(array([3,4]), axis=0)

and combining the two results into one array. This trick not only aligns with human intuition in 2D examples but also helps explain complicated cases in higher dimensions.
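
The equivalence can be sketched as follows. The list comprehension is only an illustration of the reduction, not how NumPy computes the result internally.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

by_axis = np.sum(a, axis=1)
# Apply 0-axis aggregation to each element of the outer array,
# then combine the results into a new array.
by_elements = np.array([np.sum(row, axis=0) for row in a])

print(by_axis)  # [3 7]
print(np.array_equal(by_axis, by_elements))  # True
```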

Aggregation in 3D and High-dimensional Arrays

Fig8: Aggregation for 3D array, axis=0 and axis=1

When moving to 3D arrays, we can directly apply the conclusions from the previous section to 0-axis and 1-axis aggregation and get the correct results. But how about 2-axis aggregation? Can we generalize the conclusion above?

Fig9: Aggregation for 3D array, axis=2

The answer is yes. Just as 1-axis aggregation reduces to performing 0-axis aggregation on each element, 2-axis aggregation is equivalent to applying 1-axis aggregation to each element.

This reduction idea clarifies the otherwise confusing 3D aggregation and can be extended to explain the axis system of more complex arrays.
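
The reduction can be checked on a small 3D array. The array below is a hypothetical example of shape (2, 2, 2) chosen for illustration; it is not necessarily the one drawn in the figures.

```python
import numpy as np

# A hypothetical 3D array of shape (2, 2, 2), chosen only for illustration.
a = np.arange(8).reshape(2, 2, 2)

# 0-axis: add the elements of the outermost array (two 2D arrays).
axis0 = np.sum(a, axis=0)
print(np.array_equal(axis0, a[0] + a[1]))  # True

# 1-axis: 0-axis aggregation applied to each element of the outer array.
axis1 = np.sum(a, axis=1)
print(np.array_equal(axis1, np.array([np.sum(e, axis=0) for e in a])))  # True

# 2-axis: 1-axis aggregation applied to each element of the outer array.
axis2 = np.sum(a, axis=2)
print(np.array_equal(axis2, np.array([np.sum(e, axis=1) for e in a])))  # True
```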

The Reshape and Stack Operation

We continue our journey with the reshape and stack operations in NumPy. These operations are hard to understand when performed in higher dimensions, and we will show that our representation explains them clearly.


Fig10: Understanding the Reshape Operation

Let’s begin with the simple 1D array array([1,2,3,4,5,6]). The left-hand subgraphs show how to reshape the array into 2D arrays. The arguments of reshape(2,3) and reshape(3,2) tell how many elements the outer and inner arrays contain; the operation then cuts the original sequence according to the new shape. Therefore, the first element of the result, res[0,...], is a 3-element array for reshape(2,3) and a 2-element array for reshape(3,2).
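
A brief sketch of the two 2D reshapes; the names res_23 and res_32 are introduced here for illustration.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])

res_23 = a.reshape(2, 3)  # outer array of 2 elements, each with 3 numbers
res_32 = a.reshape(3, 2)  # outer array of 3 elements, each with 2 numbers

print(res_23[0, ...])  # [1 2 3]
print(res_32[0, ...])  # [1 2]

# The sequence itself is unchanged; only the cuts differ.
print(np.array_equal(res_23.ravel(), res_32.ravel()))  # True
```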

The right-hand side shows the results of converting the input array into 3D arrays.

  • reshape(1,2,3) is the starting case. It transforms the array into a 3D array with a single element, which is a 2D array. Thus, res[0,...] is the same as the output of reshape(2,3).
  • reshape(3,1,2) transforms the array into a 3-element array whose elements are 2D arrays with shape (1,2). Therefore, the output res is array([[[1,2]],[[3,4]],[[5,6]]]). Similarly, the result of reshape(3,2,1) is array([[[1],[2]],[[3],[4]],[[5],[6]]]).
  • Compared to the result of reshape(3,2), array([[1,2],[3,4],[5,6]]), the elements of the last two results are 2D arrays with shape (1,2) or (2,1) instead of 1D arrays.
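
The three 3D reshapes listed above can be reproduced with the following sketch:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])

res_123 = a.reshape(1, 2, 3)
print(np.array_equal(res_123[0, ...], a.reshape(2, 3)))  # True

res_312 = a.reshape(3, 1, 2)
res_321 = a.reshape(3, 2, 1)
print(res_312.tolist())  # [[[1, 2]], [[3, 4]], [[5, 6]]]
print(res_321.tolist())  # [[[1], [2]], [[3], [4]], [[5], [6]]]

# Each element of the outer array is now a 2D array.
print(res_312[0].shape, res_321[0].shape)  # (1, 2) (2, 1)
```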


Fig11: Understanding the Stack Operation

When stacking arrays, we can also specify the axis, which is sometimes confusing. Two common questions are: along which dimension does it stack, and how does it stack in that dimension? In this section, we will use the three arrays $K_1$, $K_2$, and $K_3$ shown above to illustrate these questions.

Fig12: Understanding the Stack Operation 2

0-axis stacking is the simplest case. The stacking dimension is a new outermost dimension: the operation generates a new array and assigns $K_1$, $K_2$, and $K_3$ as its elements.
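
A minimal sketch of 0-axis stacking with np.stack. The figure that defines $K_1$, $K_2$, and $K_3$ is not reproduced in the text, so the (2, 2) arrays below are assumed values for illustration only.

```python
import numpy as np

# Hypothetical values: the figure defining K_1, K_2, K_3 is not reproduced
# here, so these (2, 2) arrays are assumptions for illustration only.
K1 = np.array([[0, 1], [2, 3]])
K2 = K1 + 1
K3 = K1 + 2

stacked = np.stack([K1, K2, K3], axis=0)
print(stacked.shape)  # (3, 2, 2)

# Each element of the new outer array is one of the inputs.
print(np.array_equal(stacked[0], K1))  # True
print(np.array_equal(stacked[1], K2))  # True
print(np.array_equal(stacked[2], K3))  # True
```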

Fig13: Understanding the Stack Operation 3

1-axis and 2-axis stacking are more complicated. Let’s begin with 1-axis stacking. Following the idea of 0-axis stacking, 1-axis stacking should stack the arrays in the 2nd dimension. Stacking in the 2nd dimension means that we enlarge each 1D 2-element array into a 2D array with shape (3,2). For example, array([0,1]) becomes array([[0,1],[1,2],[2,3]]). The newly added elements come from the same position in the other input arrays. Thus, after answering the two questions, we can construct the array shown above.
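
The 1-axis case can be sketched as below. The input values are assumptions: they are chosen so that the first rows are array([0,1]), array([1,2]), and array([2,3]) as in the example above, while the remaining entries are hypothetical.

```python
import numpy as np

# Assumed inputs: first rows follow the example in the text,
# the second rows are hypothetical.
K1 = np.array([[0, 1], [2, 3]])
K2 = K1 + 1
K3 = K1 + 2

stacked = np.stack([K1, K2, K3], axis=1)
print(stacked.shape)  # (2, 3, 2)

# The 1D element array([0, 1]) is enlarged into a (3, 2) array whose
# rows come from the same position in K1, K2 and K3.
print(stacked[0].tolist())  # [[0, 1], [1, 2], [2, 3]]
print(np.array_equal(stacked[1], np.array([K1[1], K2[1], K3[1]])))  # True
```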

For 2-axis stacking, the strategy is similar. An element in the 3rd dimension is a single number. Therefore, we enlarge it into a 1D array with 3 elements, one from each input array. In this case, we transform array(0) into array([0,1,2]). As before, the newly added elements come from the same position in the other input arrays.
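
A corresponding sketch for 2-axis stacking, again with assumed input values whose top-left entries are 0, 1, and 2 as in the example above:

```python
import numpy as np

# Assumed inputs for illustration; only the top-left entries (0, 1, 2)
# are taken from the example in the text.
K1 = np.array([[0, 1], [2, 3]])
K2 = K1 + 1
K3 = K1 + 2

stacked = np.stack([K1, K2, K3], axis=2)
print(stacked.shape)  # (2, 2, 3)

# The single number at position (0, 0) is enlarged into a 1D array
# holding the numbers at that position in K1, K2 and K3.
print(stacked[0, 0].tolist())  # [0, 1, 2]
print(stacked[1, 1].tolist())  # [3, 4, 5]
```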

One Example: Loading CIFAR-10 Data

CIFAR-10 is an important image dataset in computer vision and deep learning. The image data provided for Python is stored as a 2D array whose shape is given in the dataset description below. Without a clear understanding of NumPy, it is sometimes hard to devise the right way to convert the data to a regular image format. But with the help of the representation above, we can easily find the correct way to load the data. Let’s try and see.

According to the website description, the data is stored as a 10000x3072 NumPy array of uint8s. Each row of the array stores a 32x32 color image as three 1024-entry segments in RGB order: the first 1024 entries contain the red channel values, the next 1024 the green, and the final 1024 the blue. The image is stored in row-major order, so that the first 32 entries of the array are the red channel values of the first row of the image.

Fig14: The CIFAR-10 Dataset Example

Thus, we can quickly build up a representation accordingly. Take one row of the input array as an example. First, we can use reshape(3,-1) to split the row into 3 color channels. Then, we can use reshape(3,32,32) to transform each channel into a 2D array. Finally, we use np.moveaxis() to change the output array into a channel-last representation, which is compatible with matplotlib's input requirements. The code is shown below.

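import pickle

import numpy as np
import matplotlib.pyplot as plt


def unpickle(file):
    # Load one CIFAR-10 batch file; keys of the returned dict are bytes.
    with open(file, 'rb') as fo:
        batch = pickle.load(fo, encoding='bytes')
    return batch


data_dir = "./cifar-10-batches-py/"
train_data_dirs = [data_dir + "data_batch_{}".format(i) for i in range(1, 6)]
raw_data = [unpickle(path) for path in train_data_dirs]
# Stack the five (10000, 3072) batches into one (50000, 3072) array.
train_data = np.vstack([batch[b'data'] for batch in raw_data])
# Split each row into (3, 32, 32), then move the channel axis last.
train_data = np.moveaxis(train_data.reshape(-1, 3, 32, 32), 1, -1)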
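
To follow the three steps on a single row without downloading the dataset, the sketch below uses a synthetic row. This is an assumption for illustration: the row is filled with 1 for red, 2 for green, and 3 for blue, and it is not real CIFAR-10 data.

```python
import numpy as np

# Synthetic stand-in for one CIFAR-10 row (assumption: no real data is loaded).
# Red entries are 1, green entries are 2, blue entries are 3.
row = np.repeat(np.array([1, 2, 3], dtype=np.uint8), 1024)
print(row.shape)  # (3072,)

channels = row.reshape(3, -1)         # split into 3 color channels
planes = channels.reshape(3, 32, 32)  # each channel becomes a 32x32 array
image = np.moveaxis(planes, 0, -1)    # channel-last: (32, 32, 3)

print(channels.shape, planes.shape, image.shape)  # (3, 1024) (3, 32, 32) (32, 32, 3)
print(image[0, 0].tolist())  # [1, 2, 3]: the R, G, B values of one pixel

# The batch version handles all rows at once.
batch = np.stack([row, row])
images = np.moveaxis(batch.reshape(-1, 3, 32, 32), 1, -1)
print(images.shape)  # (2, 32, 32, 3)
print(np.array_equal(images[0], image))  # True
```

An array in this channel-last layout, such as image above, is the kind of input that plt.imshow accepts for color images.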


In this post, we proposed a graphical representation for understanding NumPy arrays. We explained the representation through examples and demonstrated its usefulness on NumPy functions and a concrete data-loading task.