NumPy Iterating Over Array
Overview
This tutorial explains how to iterate over NumPy arrays. Arrays can be iterated with ordinary for and while loops, and the NumPy package also provides nditer, a multidimensional iterator object that can traverse an array efficiently.
By default, nditer treats array values as read-only; we can modify them by passing the optional op_flags (operand flags) parameter. Broadcasting lets a single nditer iterate over several arrays simultaneously. Cython can also be used to speed up iteration over arrays.
Introduction
nditer can be combined with either a for loop or a while loop to visit every element of an array in a single loop. Features such as tracked indices work the same way with both loop forms. Sometimes we also need to control the iteration order; the methods for doing so are discussed in this article.
We will also look at the nditer iterator object, its iteration order, and how to use Cython.
Single Array Iteration
We use for and while loops to iterate over an array in Python. However, we need nested loops to visit every element of a multidimensional array. To avoid nested loops, NumPy provides an iterator called nditer.
nditer visits each array element in a single loop.
Let's look at the following examples where an array arr is created using the np.arange function and then reshaped into three rows and four columns using the reshape function.
Using a for loop to iterate:
Code:
Output:
Explanation: The array arr is traversed using two for loops, one inside the other. The iterator i is for rows, and iterator j is for columns.
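As a minimal sketch of the nested-loop approach described above (the list name visited is an illustrative assumption, not part of the original example):

```python
import numpy as np

# Build the 3x4 array described above.
arr = np.arange(12).reshape(3, 4)

visited = []
for i in range(arr.shape[0]):        # i walks over the rows
    for j in range(arr.shape[1]):    # j walks over the columns
        visited.append(int(arr[i, j]))

print(visited)
```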
Using the nditer function:
Code:
Output:
Explanation: The array arr is traversed using the nditer function in a single for loop.
- Instead of using a standard C or Fortran ordering, this iteration chooses an order that matches the array's memory layout.
- Indexing starts from zero, which means the first element has an index 0.
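The single-loop traversal can be sketched as follows. The transposed view is included only to illustrate that the default order follows the memory layout; the names flat and flat_t are assumptions made for this sketch.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

# One loop visits every element, whatever the number of dimensions.
flat = [int(x) for x in np.nditer(arr)]

# arr.T is a view on the same memory, so the default order visits
# the elements in the same memory order as for arr.
flat_t = [int(x) for x in np.nditer(arr.T)]

print(flat)
print(flat_t)
```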
Controlling Iteration Order
Sometimes we need to visit the elements of an array in a particular order, regardless of how the array's elements are organized in memory. To control this aspect of iteration, the nditer object offers an order parameter. (The order parameter is discussed under the "Nditer Iteration Order" heading in this article.)
How to modify an array's value?
- nditer treats the array's items as read-only by default. A ValueError is raised if the loop tries to modify the values of the array, as shown in the code example below:
Code:
Output:
- To allow modification, pass the optional nditer parameter op_flags (operand flags), for example op_flags=['readwrite']. The iterator can then modify array elements.
Let us look at the example below, which gives the square of the elements of the array as a modified array.
Code:
Output:
Explanation: Here, each array element is squared in place, and the modified array holding the squares of the original values is returned.
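A minimal sketch of both behaviors, assuming the same 3x4 array as before: the first loop tries to write through a default (read-only) iterator, and the second squares the elements in place with op_flags=['readwrite']. The variable error_message is an illustrative name.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

# Default operands are read-only, so the assignment is rejected.
error_message = None
try:
    for x in np.nditer(arr):
        x[...] = x * x
except ValueError as exc:
    error_message = str(exc)
print("read-only attempt failed:", error_message)

# With 'readwrite', x[...] writes back into arr. The with-block
# makes sure any pending writes are flushed when the loop is done.
with np.nditer(arr, op_flags=['readwrite']) as it:
    for x in it:
        x[...] = x * x

print(arr)
```

The x[...] form matters: a plain x = x * x would only rebind the loop variable and leave the array untouched.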
Iteration using an External Loop
The nditer class constructor has a flags parameter, which can take the following values:
Flag | Description |
---|---|
external_loop | Causes the values given to be one-dimensional arrays with multiple values instead of zero-dimensional arrays |
f_index | A Fortran-order index is tracked |
c_index | A C-order index is tracked |
multi_index | A multi-index, i.e., a tuple of indices with one per iteration dimension, is tracked |
The iterator traverses one-dimensional arrays for each column in the example given below:
Code:
Output:
Explanation: With the external loop in order F, the array is split into four one-dimensional chunks, one per column. The first column (0, 4, 8), the second column (1, 5, 9), the third column (2, 6, 10), and the last column (3, 7, 11) are printed as output.
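The column-wise chunks can be reproduced with a sketch like the one below, assuming the 3x4 array from the earlier sections (the name columns is illustrative):

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

# Each step of the loop yields a one-dimensional chunk instead of a
# single element; order='F' makes the chunks follow the columns.
columns = [chunk.tolist()
           for chunk in np.nditer(arr, flags=['external_loop'], order='F')]

for column in columns:
    print(column)
```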
How to track an Index or Multi-Index?
With an index, we can access elements of arrays of different dimensions. Moreover, we can access both the elements and their indices in different orders, such as in-memory order, C order, or Fortran order, so that the index keeps track of the elements.
Depending on what is requested, the iterator object keeps track of the index, which can be accessed via the index or multi_index properties. Let us look at the below example to understand:
Code:
- % is for formatting tuple variables
- %d is for integer placeholder
Output:
Explanation: In the above example, each element is paired with a single flat index, shown inside the "<>" brackets. The index is computed in F order, i.e., it counts column-wise. For better understanding, below is the array with each element's F-order index inside the () brackets.
0(0) | 1(3) | 2(6) | 3(9) |
---|---|---|---|
4(1) | 5(4) | 6(7) | 7(10) |
8(2) | 9(5) | 10(8) | 11(11) |
As shown in the above matrix, 0 has an index of 0, 1 has an index of 3, and so on.
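A sketch that reproduces the element and index pairs from the table, assuming the 3x4 array used so far (pairs is an illustrative name):

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

# 'f_index' asks the iterator to track a flat Fortran-order index.
it = np.nditer(arr, flags=['f_index'])
pairs = []
for x in it:
    pairs.append((int(x), it.index))

# Print in the "value <index>" style described above.
print(" ".join("%d <%d>" % pair for pair in pairs))
```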
Iterator tracking Multi_index
Code1:
Output:
Explanation: In the above example, multi-indexing is done by pairing each element with its index. The index is inside the "<>" bracket in default order, i.e., indexing is done as stored in memory. The array representation with its index inside the brackets is below for better understanding.
0(0,0) | 1(0,1) | 2(0,2) | 3(0,3) |
---|---|---|---|
4(1,0) | 5(1,1) | 6(1,2) | 7(1,3) |
8(2,0) | 9(2,1) | 10(2,2) | 11(2,3) |
As shown above, both rows and columns are used for indexing. For example, element 0 is in row 0 and column 0, element 1 is in row 0 and column 1, and so on.
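The (row, column) pairs in the table can be obtained with a sketch along these lines, again assuming the 3x4 array (pairs is an illustrative name):

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

# 'multi_index' tracks one index per dimension for the current element.
it = np.nditer(arr, flags=['multi_index'])
pairs = [(int(x), it.multi_index) for x in it]

for value, position in pairs:
    print("%d <%s>" % (value, position))
```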
Alternative Looping and Element Access
nditer also supports an alternative looping style in which the iterator object itself is driven explicitly, typically from a while loop, and the current element is accessed by indexing into the iterator. Other features, like tracked indices, remain unchanged.
In the following examples, an array is created by the np.arange function, reshaped into three rows and four columns with the reshape function, and traversed using a while loop.
Code#1:
Output:
Explanation: The 2-D array above is iterated using a while loop. Each element is printed with its index inside the "<>" brackets; the index is a single flat index in F order, i.e., it counts column-wise. The while loop runs until the iterator iterr is exhausted, and is_next is the variable that records whether another element follows the current one. As long as these conditions hold, the loop keeps running and printing output.
Code#2:
Output:
Explanation: The 2-D array above is iterated using a while loop. Each element is printed with its index inside the "<>" brackets; the index is a single flat index in F order, i.e., it counts column-wise. The while loop runs until the iterator iterr is exhausted, and is_next is the variable that records whether another element follows the current one. As long as these conditions hold, the loop keeps running and printing output.
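A sketch of the while-loop form, reusing the names iterr and is_next from the explanation. The exact original code is not shown here, so the loop condition (iterr.finished) and the list seen are assumptions.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

iterr = np.nditer(arr, flags=['f_index'])
seen = []
is_next = True
while not iterr.finished:
    # iterr[0] is the current element of the first (only) operand.
    seen.append((int(iterr[0]), iterr.index))
    # iternext() advances the iterator and reports whether
    # another element is available.
    is_next = iterr.iternext()

print(seen)
```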
Buffering the Array Elements
While forcing an iteration order, we noticed that the external loop option may break the elements of an array into smaller chunks because the elements cannot be visited in the requested order with a constant stride. When developing C code, this is fine. However, in pure Python code, it can significantly slow down performance.
By using the buffering mode, the chunks given to the inner loop can be made larger. This significantly lowers the overhead of the Python interpreter.
Here are some examples to understand:
Code:
Output:
Code:
Output:
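The effect of buffering on chunk size can be sketched as follows, assuming the 3x4 array from earlier (unbuffered and buffered are illustrative names):

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

# Without buffering, forcing order='F' yields one small chunk per column.
unbuffered = [chunk.tolist()
              for chunk in np.nditer(arr, flags=['external_loop'], order='F')]

# With buffering, the elements are gathered into a buffer first,
# so the same traversal arrives as a single larger chunk.
buffered = [chunk.tolist()
            for chunk in np.nditer(arr, flags=['external_loop', 'buffered'],
                                   order='F')]

print(unbuffered)
print(buffered)
```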
Iterating as a Specific Data Type
There are situations when treating an array as a different data type than the one it is stored as is essential. Two methods are used for treating a stored array's datatype as another:
- Temporary copies: The entire array is copied using the new datatype, and iteration is then performed on the copied array. However, temporary copies occupy much memory.
- Buffering mode: Buffering mode creates an extra space called a buffer, so only a small part of the array is converted at a time.
Buffering mode is more cache-friendly than making temporary copies and lessens the memory usage problem. Buffering is preferred over temporary copying in most circumstances.
In the following examples, an integer datatype is treated as a complex datatype so that the square root of negative values can be taken. If the data type does not match exactly, the iterator will throw an error unless copies or buffering modes are enabled.
Code1:
Output:
Code2:
Output:
Explanation: In the above code example, a copy of the original array is made into a complex data type, and then the square root of negative values is found.
Code3:
Output:
Explanation: In the above code example, the buffering mode treats the integer datatype of an array as a complex datatype, and then the square root of negative values is found.
- The iterator checks whether a given conversion is allowed using NumPy's casting rules.
- It enforces safe casting by default.
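The three cases can be sketched together. The array contents (values from -3 to 2) and the names cast_error, roots_copy, and roots_buf are assumptions chosen for illustration.

```python
import numpy as np

arr = np.arange(6).reshape(2, 3) - 3   # contains negative integers

# 1. Requesting another dtype without copying or buffering fails.
cast_error = None
try:
    for x in np.nditer(arr, op_dtypes=['complex128']):
        np.sqrt(x)
except TypeError as exc:
    cast_error = str(exc)
print("no copy/buffer:", cast_error)

# 2. Temporary copy: the whole operand is converted up front.
roots_copy = [complex(np.sqrt(x))
              for x in np.nditer(arr, op_flags=['readonly', 'copy'],
                                 op_dtypes=['complex128'])]

# 3. Buffering: the conversion happens chunk by chunk in a buffer.
roots_buf = [complex(np.sqrt(x))
             for x in np.nditer(arr, flags=['buffered'],
                                op_dtypes=['complex128'])]

print(roots_copy)
print(roots_buf)
```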
Broadcasting Array Iteration
When arrays have different shapes, NumPy tries to make them compatible by stretching the smaller array along its missing or length-1 dimensions to match the larger one. Operations are then performed element by element on arrays of the same shape. This technique is known as broadcasting.
A combined nditer object can iterate over two arrays simultaneously if they are broadcastable. For example, take two arrays: arr1 has a dimension of 2x3, and arr2 has a dimension of 1x3. Array arr2 can therefore be broadcast to the shape of arr1.
The example below uses an iterator of this type.
Code:
Output:
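A minimal sketch of the combined iterator, assuming arr2 holds the illustrative values 10, 20, 30:

```python
import numpy as np

arr1 = np.arange(6).reshape(2, 3)    # shape (2, 3)
arr2 = np.array([[10, 20, 30]])      # shape (1, 3), broadcast to (2, 3)

# Passing a list of operands makes nditer walk them in lockstep.
pairs = [(int(x), int(y)) for x, y in np.nditer([arr1, arr2])]

for x, y in pairs:
    print("%d:%d" % (x, y), end=' ')
print()
```

The single row of arr2 is reused for every row of arr1, without arr2 being copied.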
Iterator-Allocated Output Arrays
A common case in NumPy functions is to allocate the output based on the broadcast shape of the inputs, and additionally to accept an optional parameter called "out" where the result is stored when it is provided.
The nditer object offers a convenient idiom for implementing this mechanism.
For operands passed in as None, nditer uses the "allocate" and "writeonly" flags by default. This means that we only have to give the two operands, and the iterator handles the rest.
Note: In x + y, x and y are operands, and the operator is "+".
We need to state these flags explicitly when we use the "out" parameter because the iterator would default to "readonly" if an array is passed in as "out", and our inner loop would fail.
The default for input arrays is "readonly" to avoid triggering a reduction by accident: if the default were "readwrite", every broadcasting operation would also trigger a reduction, a concept that is discussed later in this article. We also add the "no_broadcast" flag to prevent the output from being broadcast. This matters because we want exactly one input value for each output value.
We will also include the "external_loop" and "buffered" flags for completeness because these are the performance-related flags that should generally be used.
Let us look at the following code example:
Code:
Output:
Explanation: In the above code, sqr is a function defined with the def keyword that computes the square of an array's values. When the "out" parameter is used, the array passed as out (arr2) must have the same shape as the broadcast shape of the operands inside the sqr function.
However, an error will occur if we pass an arr2 with a different shape.
Code:
Output:
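A sketch of such a function, using the names sqr and arr2 from the explanation. The function body and the sample inputs are assumptions built from the flags discussed above.

```python
import numpy as np

def sqr(arr1, out=None):
    """Square arr1 element-wise, optionally writing into out."""
    it = np.nditer([arr1, out],
                   flags=['external_loop', 'buffered'],
                   op_flags=[['readonly'],
                             ['writeonly', 'allocate', 'no_broadcast']])
    with it:
        for x, y in it:
            y[...] = x * x
        # operands[1] is either the array passed as out
        # or the one the iterator allocated.
        return it.operands[1]

allocated = sqr([1, 2, 3])           # output allocated by the iterator

arr2 = np.zeros(3)
sqr([1, 2, 3], out=arr2)             # output written into arr2

# A (2, 3) input cannot be written into a (3,) output because
# 'no_broadcast' forbids broadcasting the output operand.
shape_error = None
try:
    sqr(np.arange(6).reshape(2, 3), out=arr2)
except ValueError as exc:
    shape_error = str(exc)

print(allocated, arr2, shape_error)
```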
Outer Product Iteration
numpy.outer() is a NumPy function that computes the outer product of two vectors.
Any binary operation can be extended to an array operation in an outer-product fashion, as outer does for multiplication. The nditer object makes this easy by explicitly mapping the operands' axes. This can also be done with newaxis indexing, but we will show how to do it directly with the nditer op_axes argument and no intermediate views.
We will make a simple outer product by putting the first operand's dimensions before the second operand's dimensions. The op_axes parameter takes one list of axes for each operand and defines a mapping from the iterator's axes to the operand's axes.
Code:
Output:
Explanation: The first operand is one-dimensional, and the second operand is two-dimensional. The iterator therefore has three dimensions, so op_axes contains two 3-element lists. The first list selects the single axis of the first operand and sets the remaining iterator axes to -1, giving [0, -1, -1]. The second list selects the two axes of the second operand, which must not overlap the axis chosen for the first operand. Its list is [-1, 0, 1].
For the output operand, we can pass None rather than build another list because it maps to the iterator axes in the standard way.
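The mapping described above can be sketched as follows. The operand shapes (3,) and (2, 4) and the names a, b, and result are assumptions chosen to match the explanation.

```python
import numpy as np

a = np.arange(3)                   # one-dimensional operand
b = np.arange(8).reshape(2, 4)     # two-dimensional operand

# Iterator axes: (axis 0 of a, axis 0 of b, axis 1 of b).
# -1 marks an iterator axis that the operand does not have.
it = np.nditer([a, b, None],
               flags=['external_loop'],
               op_axes=[[0, -1, -1], [-1, 0, 1], None])
with it:
    for x, y, z in it:
        z[...] = x * y
    result = it.operands[2]        # output allocated by the iterator

print(result.shape)
print(result)
```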
Reduction Iteration
A reduction occurs when a writeable operand has fewer elements than the whole iteration space.
The nditer object requires that any reduction operand be read-write, and only permits reductions when the iterator flag 'reduce_ok' is given.
Code1:
Output:
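A minimal sketch of a reduction, assuming a 2x3x4 input and a zero-dimensional accumulator named total (both illustrative):

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)
total = np.array(0)    # one element, far fewer than a: a reduction

with np.nditer([a, total],
               flags=['reduce_ok'],
               op_flags=[['readonly'], ['readwrite']]) as it:
    for x, y in it:
        y[...] += x    # every element of a is added into total

print(total)
```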
Things become slightly more complex when reduction and allocated operands are combined. Before iteration starts, every reduction operand must be initialized to its starting value. The example below shows one way to do this while computing sums along the last axis of an array.
Code2:
Output:
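A sketch of a reduction into an iterator-allocated output, summing along the last axis of an assumed 2x3x4 array (result is an illustrative name):

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)

# The output keeps iterator axes 0 and 1 and drops the last one (-1),
# so it is reduced along the last axis of a.
it = np.nditer([a, None],
               flags=['reduce_ok'],
               op_flags=[['readonly'], ['readwrite', 'allocate']],
               op_axes=[None, [0, 1, -1]])
with it:
    it.operands[1][...] = 0        # initialize before iterating
    for x, y in it:
        y[...] += x
    result = it.operands[1]

print(result)
```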
How to use Cython for Iterating Over Array?
Cython is a programming language that is a superset of Python and is written in Python and C. It is designed to provide C-like performance using Python syntax plus optional C-style syntax. Because Cython code is translated to C and then compiled by a C compiler, Cython is a compiled language.
- The .pyx extension is used for Cython code files.
To run Cython code, you first need to install Cython, for example with the following command: pip install cython
For type declarations, Cython uses the cdef syntax, which is familiar from a C/C++ standpoint.
- Declaring a variable in Python: x = 5
- Declaring a variable in Cython: cdef int x = 5
In the following examples, the sum of squares is computed, first using a simple Python program and then a Cython program.
Simple python NumPy program:
Output:
Cython program:
sum_of_squares.pyx is listed as follows:
We import the cy_sqr function from the sum_of_squares.pyx file.
Output:
Explanation:
In the code examples, we created a function that calculates the sum of squares of the elements of an array. To support an "axis" parameter similar to the NumPy sum function, we must build a list for the op_axes option.
To cythonize this function, we replace the inner loop (j[...] += i*i) with Cython code specialized for the float64 dtype.
With the 'external_loop' flag enabled, the arrays supplied to the inner loop are always one-dimensional, so relatively little checking is required.
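The pure-Python side of this comparison can be sketched as below. The function names axis_to_axeslist and sum_of_squares_py are hypothetical; only the inner loop j[...] += i*i is taken from the text. The Cython version would replace that inner loop with typed C-level code and is not reproduced here.

```python
import numpy as np

def axis_to_axeslist(axis, ndim):
    """Build the op_axes list for the output operand (hypothetical helper)."""
    if axis is None:
        return [-1] * ndim              # reduce over every axis
    if not isinstance(axis, tuple):
        axis = (axis,)
    axeslist = [1] * ndim
    for ax in axis:
        axeslist[ax] = -1               # mark the reduced axes
    kept = 0
    for pos in range(ndim):
        if axeslist[pos] != -1:
            axeslist[pos] = kept        # number the surviving axes
            kept += 1
    return axeslist

def sum_of_squares_py(arr, axis=None, out=None):
    axeslist = axis_to_axeslist(axis, arr.ndim)
    it = np.nditer([arr, out],
                   flags=['reduce_ok', 'buffered', 'delay_bufalloc'],
                   op_flags=[['readonly'], ['readwrite', 'allocate']],
                   op_axes=[None, axeslist],
                   op_dtypes=['float64', 'float64'])
    with it:
        it.operands[1][...] = 0         # initialize the accumulator
        it.reset()                      # required after 'delay_bufalloc'
        for i, j in it:
            j[...] += i * i             # the inner loop named in the text
        return it.operands[1]

a = np.arange(6).reshape(2, 3)
total = sum_of_squares_py(a)
per_row = sum_of_squares_py(a, axis=-1)
print(total, per_row)
```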
Examples
Iterating a One-dimensional Array
Here, we are iterating a one-dimensional array using a for loop.
Code:
Output:
Here, we are iterating a one-dimensional array using a while loop.
Code:
Output:
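Both one-dimensional loops can be sketched as follows; the array values and the names from_for and from_while are illustrative assumptions.

```python
import numpy as np

arr = np.array([10, 20, 30, 40])

# for loop: iterate directly over the elements.
from_for = []
for value in arr:
    from_for.append(int(value))

# while loop: advance an explicit position until the end is reached.
from_while = []
pos = 0
while pos < arr.size:
    from_while.append(int(arr[pos]))
    pos += 1

print(from_for)
print(from_while)
```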
Iterating a Two-dimensional Array
Iterating a two-dimensional array using a for loop. Iterating each row in a 2-D array.
Code1:
Output:
Iterating each element in a 2-D array.
Code2:
Output:
Iterating over a two-dimensional array using a while loop.
Code3:
Output:
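A sketch of the row-wise for loop and the while-loop traversal for a 2-D array, assuming an illustrative 2x3 array (rows and from_while are illustrative names):

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)

# A plain for loop over a 2-D array yields one row per step.
rows = []
for row in arr:
    rows.append(row.tolist())

# Nested while loops visit every element with explicit indices.
from_while = []
r = 0
while r < arr.shape[0]:
    c = 0
    while c < arr.shape[1]:
        from_while.append(int(arr[r, c]))
        c += 1
    r += 1

print(rows)
print(from_while)
```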
Iterating Two Arrays Simultaneously
Two arrays can be iterated simultaneously only if they are broadcastable. Let's look at the following code example:
Code:
Output:
Explanation:
We create two arrays: arr1, of dimension 2x3, using the arange() function, and arr2, of dimension 1x3. When both are iterated simultaneously, arr2 is broadcast to the shape of arr1.
Getting Familiar with Nditer Object
numpy.nditer is an iterator object in the NumPy package. It is a multidimensional iterator object that can efficiently traverse an array, and each element of the array is visited through Python's standard iterator interface.
Syntax:
Let's look at the code example below:
Code:
Output:
Explanation: We create a 2x3 array using the arange() function and iterate over it using nditer.
Nditer Iteration Order
Sometimes it is preferable to access the items of an array in a specific order, irrespective of the layout of the elements in memory. By default, nditer uses order='K', which keeps the existing memory order.
The order parameter accepts two more values:
- order='C' traverses the array elements horizontally, row by row (C order). Let us look at the code example below:
Code:
Output:
Explanation: The array is traversed in C order, horizontally along rows: the first row (0, 5), then the second row (10, 15), and then the last row (20, 25).
- order='F' traverses the array elements vertically, column by column (Fortran order). Let's look at the code example below:
Code:
Output:
Explanation: The array is traversed column-wise in F order: the first column (0, 10, 20), then the second column (5, 15, 25).
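Both traversal orders can be sketched on a 3x2 array whose values match the explanations above. The construction with np.arange(0, 30, 5) is an assumption, as are the names c_order and f_order.

```python
import numpy as np

# 3 rows, 2 columns: [[0, 5], [10, 15], [20, 25]]
arr = np.arange(0, 30, 5).reshape(3, 2)

c_order = [int(x) for x in np.nditer(arr, order='C')]   # row by row
f_order = [int(x) for x in np.nditer(arr, order='F')]   # column by column

print(c_order)
print(f_order)
```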
Conclusion
- nditer is a multidimensional iterator object in the NumPy package that can traverse an array efficiently.
- nditer treats the array's items as read-only by default. However, we can modify the array values by using an optional parameter of the nditer object called op_flags (operand flags).
- With the help of broadcasting, a combined nditer object can iterate over two arrays simultaneously.
- Cython is a programming language that is a superset of Python and is written in Python and C. It uses Python syntax and optional C-style syntax to provide C-like performance.
- While and for loops can also be used to iterate an array in Python.