Convert Different Python Data Structures to NumPy Arrays
Can we convert different Python data structures to NumPy arrays? Yes, we certainly can.
When programming in Python, we may need to convert several Python data structures to a NumPy array. The technique we'll employ is entirely dependent on the data structure we wish to convert to a NumPy array. Here are some examples of typical data structures and the methods we'll use to convert them to NumPy arrays.
Some may wonder, "Why NumPy?" NumPy allows for more efficient data storage and processing for mathematical calculations. It stores elements of a single type in a contiguous block of memory, which usually takes far less space than the equivalent Python list, and its vectorized operations can run much faster than explicit Python loops. Furthermore, several of Python's most popular data science packages accept NumPy arrays as inputs and return them as outputs. As a result, it's advisable to become acquainted with them.
Different ways to convert Python data structures to NumPy arrays:
- list/tuple to NumPy array using the np.asarray() or np.array() method
- set to NumPy array using np.array(list(myset)) or the numpy.fromiter() method
- string to NumPy array using the np.fromstring(mystr, dtype=int, sep=' ') method
- Python dictionary to NumPy array using np.array() together with a list conversion
- pandas DataFrame to NumPy array using the df.to_numpy() method
- TensorFlow TensorProto to NumPy array using the tf.make_ndarray(existing_proto_tensor) method
- PyTorch Tensor to NumPy array using the existing_tensor.numpy() method
- SciPy sparse matrix to NumPy array using the existing_sparse_matrix.toarray() method
In this article, we will go through each of these strategies in detail. Without further ado, let's get started with converting lists / tuples to NumPy arrays.
Before we continue, if any of the data structures sound foreign, it is highly advised that you first look over them and get acquainted with them.
List/Tuple to NumPy array
It is possible to create a NumPy array from a list. First, let's compile a list of student heights for our case.
Code :
Let's transform our list to a NumPy array using the np.asarray() or np.array() methods, so we can access all of the functionality that NumPy's ndarray offers. Before we go into the code, let's have a look at the syntax of the numpy.array() and numpy.asarray() methods.
Syntax:
numpy.array(object, dtype=None, **kwargs)
Parameters:
Sr No. | Parameter Name | Parameter Description |
---|---|---|
1 | object | Any object that supports the array interface. If an object is a scalar, it returns a 0-dimensional array containing the object. |
2 | dtype | The array's preferred data type. |
3 | **kwargs | Other keyword arguments, such as copy, order, and ndmin. |
Syntax :
numpy.asarray(object, dtype=None, **kwargs)
Code :
Output :
That's great, isn't it?
Note :
The key distinction between the two functions is that numpy.array() makes a copy of its input by default, whereas numpy.asarray() avoids copying whenever it can. In particular, when the input is already a NumPy array of the requested dtype, numpy.asarray() returns that same array, so modifications made through one name are visible through the other; numpy.array() returns an independent copy instead. When the input is a list or tuple, both functions build a new array, so changes to the array are never reflected in the list from which it was formed.
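The conversion and the copy behaviour described in the note can be sketched as follows. This is a minimal illustration, not the article's original listing: the `heights` values and all variable names are assumptions made up for the example.

```python
import numpy as np

# Hypothetical sample data: student heights in centimetres.
heights = [160, 165, 172, 158]

arr_from_list = np.array(heights)             # builds a new ndarray from the list
arr_from_tuple = np.asarray(tuple(heights))   # tuples are handled the same way

# With a list as input, both functions create a new array,
# so editing the array never changes the list.
arr_from_list[0] = 999
print(heights[0])        # 160

# The difference shows up when the input is already an ndarray.
base = np.array(heights)
copied = np.array(base)      # independent copy (copying is the default)
viewed = np.asarray(base)    # no copy: the very same array object

base[0] = 0
print(copied[0])         # 160
print(viewed[0])         # 0
print(viewed is base)    # True
```

In practice, np.asarray() is a reasonable choice when a function should accept lists and arrays alike without paying for an unnecessary copy.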
Let's look at how to convert a set to a numpy array next.
Set to NumPy Array
Let us take the list created in the previous part and transform it into a set using Python's built-in set() function.
Code :
Now we will use the numpy.array() function to convert the supplied set into a NumPy array.
Code :
Output :
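This was not the outcome we were hoping for: numpy.array() treats the set as a single object and wraps it in a zero-dimensional array of dtype object, instead of building a one-dimensional array of its elements. Let us resolve this by converting the set to a list first.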
Code :
Output :
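As you can see, we are converting the data back and forth between several structures, which is not an optimal approach to the problem. We will use numpy.fromiter(), which is a better solution because it builds the array directly from the iterable.
Syntax :
numpy.fromiter(iter, dtype, count=-1)
The iter argument represents an iterable object that provides data for the array. Unlike in numpy.array(), the dtype argument is required. Because a set is unordered, the order of the elements in the resulting array is not guaranteed.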
Code :
Output :
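The three approaches discussed in this section can be compared in one short sketch. The data and variable names are hypothetical and chosen only for illustration.

```python
import numpy as np

heights = [160, 165, 172, 158, 165]   # hypothetical data with one duplicate
height_set = set(heights)

# Passing the set directly wraps the whole set in a 0-dimensional object array.
wrapped = np.array(height_set)
print(wrapped.shape, wrapped.dtype)   # () object

# Going through a list gives a proper 1-D numeric array.
via_list = np.array(list(height_set))

# np.fromiter consumes the iterable directly; dtype must be supplied.
via_iter = np.fromiter(height_set, dtype=int)

# Sets are unordered, so sort the result if a stable order matters.
print(np.sort(via_iter))              # [158 160 165 172]
```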
String to NumPy Array
The numpy.fromstring() method creates a NumPy array from text data in a string.
Syntax:
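numpy.fromstring(string, dtype=float, count=-1, *, sep)
Here, the string parameter is the string holding the data, and the sep argument is the separator between the values. A non-empty sep, such as ' ' or ',', selects text mode; an empty sep selects the deprecated binary mode, so always pass a separator when parsing text.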
Let us better understand this by using an illustration.
Code :
Output :
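As a minimal sketch of text-mode parsing with numpy.fromstring(), assuming made-up input strings:

```python
import numpy as np

height_text = "160 165 172 158"   # hypothetical space-separated values

# Text mode: a non-empty sep tells NumPy how the values are delimited.
from_spaces = np.fromstring(height_text, dtype=int, sep=" ")
print(from_spaces)                # [160 165 172 158]

# The same idea with comma-separated floating-point values.
from_commas = np.fromstring("1.5,2.5,3.5", dtype=float, sep=",")
print(from_commas)                # [1.5 2.5 3.5]
```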
Dictionaries to NumPy array
To convert data from a Dictionary format to a NumPy array format, we utilize np.array() and a list type conversion. A fantastic one-liner!
Code :
This code turns the contents of totals into a list, List. This List is then converted into a NumPy array, and the result is stored in the variable Info. The contents are finally displayed on the terminal.
Output :
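The dictionary conversion can be sketched as below. The `totals` name is borrowed from the text, but its contents and the other variable names are assumptions made for this example.

```python
import numpy as np

# Hypothetical dictionary mapping a student ID to a height in centimetres.
totals = {101: 160, 102: 165, 103: 172}

# list(totals.items()) yields (key, value) pairs, which become the rows of a 2-D array.
info = np.array(list(totals.items()))
print(info.shape)     # (3, 2)

# Keys or values on their own give 1-D arrays.
ids = np.array(list(totals.keys()))
values = np.array(list(totals.values()))
print(values)         # [160 165 172]
```

If the keys are strings and the values are numbers, NumPy has to pick one common dtype for the whole array, so converting keys and values separately is usually the safer route.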
Convert from Another Library
We'll frequently be utilizing another library and have a data structure that we'd like to transform into a NumPy array for processing. Let's explore how we can accomplish that using pandas.
Pandas Dataframe to NumPy Array
Pandas is a well-known data manipulation library that is built on top of NumPy. Let's create a pandas DataFrame and then convert it to a NumPy array.
Code :
The to_numpy() method is the recommended way to convert a pandas DataFrame or Series to a NumPy array. Before we go any further, let's have a look at the syntax.
Syntax:
DataFrame.to_numpy(dtype=None, **kwargs)
Code :
Output :
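A minimal sketch of the pandas conversion, using a small made-up DataFrame (the column names and values are assumptions):

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame with one integer and one floating-point column.
df = pd.DataFrame({"height_cm": [160, 165, 172],
                   "weight_kg": [55.5, 61.0, 70.2]})

# The whole frame becomes a 2-D array with a common dtype for all columns.
arr = df.to_numpy()
print(arr.shape)   # (3, 2)
print(arr.dtype)   # float64

# A single column (a Series) becomes a 1-D array.
heights = df["height_cm"].to_numpy()
print(heights)     # [160 165 172]
```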
Quite simple! Let's have a look at TensorFlow now.
TensorFlow TensorProto to NumPy Array
TensorFlow is a well-known deep learning framework. It takes a few steps to transform a TensorFlow tensor to a NumPy array.
Let's start by creating a TensorFlow tensor object.
Code :
Output :
Now we'll turn it into a TensorProto using tf.make_tensor_proto().
Code :
Output :
It is now possible to turn it into a NumPy array.
Code :
Output :
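The three steps above can be sketched as follows, assuming TensorFlow 2.x with eager execution; the tensor values are made up for the example.

```python
import numpy as np
import tensorflow as tf

# Step 1: create a TensorFlow tensor (hypothetical values).
tensor = tf.constant([[1, 2], [3, 4]])

# Step 2: convert the tensor to a TensorProto.
proto = tf.make_tensor_proto(tensor)

# Step 3: convert the TensorProto to a NumPy array.
arr = tf.make_ndarray(proto)
print(type(arr))   # <class 'numpy.ndarray'>
print(arr)

# With eager execution, a tensor can also be converted directly.
direct = tensor.numpy()
```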
That multi-step process is one reason why PyTorch is gaining popularity. Now let's have a look at PyTorch.
PyTorch Tensor to NumPy Array
The second major deep learning framework is PyTorch. Compared to TensorFlow, it is a little more Pythonic. Let's construct a PyTorch tensor.
Code :
Output :
And now let's turn it into a NumPy array.
Code :
Output :
That was delightfully uncomplicated.
PyTorch and NumPy work well together. It is important to note that when a tensor lives on the CPU, tensor.numpy() returns an array that shares the tensor's underlying memory, so altering one will affect the other. A tensor on a GPU (graphics processing unit) must first be copied to the CPU with .cpu(), and that copy no longer shares memory with the original GPU tensor.
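The shared-memory behaviour of a CPU tensor can be seen in a short sketch; the tensor values and variable names are assumptions for illustration.

```python
import numpy as np
import torch

tensor = torch.tensor([1.0, 2.0, 3.0])   # created on the CPU by default
arr = tensor.numpy()                      # shares memory with the tensor

# A change made through the tensor is visible in the array ...
tensor[0] = 10.0
print(arr[0])              # 10.0

# ... and a change made through the array is visible in the tensor.
arr[1] = 20.0
print(tensor[1].item())    # 20.0

# A general-purpose pattern that also covers GPU tensors and tensors that
# track gradients; the final .copy() gives an array with its own memory.
independent = tensor.detach().cpu().numpy().copy()
independent[2] = -1.0
print(tensor[2].item())    # 3.0
```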
SciPy Sparse Matrix to NumPy Array
SciPy sparse matrices are excellent for storing data that is primarily made up of zeros, because only the non-zero entries are stored. For instance, a SciPy sparse matrix is the data structure that results by default from one-hot encoding an array in scikit-learn.
You may need to transform a SciPy sparse matrix into a NumPy array in order to examine it or perform a specific function on it.
We have imported OneHotEncoder from scikit-learn, so let's utilize it to generate a sparse matrix.
Code :
Output :
The minimal memory consumption is excellent, but what if I want to view the results? I can just call toarray() to get a NumPy array back.
Code :
Output :
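A minimal sketch of this round trip, assuming scikit-learn and SciPy are installed; the `colors` data is made up for the example.

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import OneHotEncoder

# Hypothetical categorical data; OneHotEncoder expects a 2-D input.
colors = np.array(["red", "green", "red"]).reshape(-1, 1)

# By default the encoder returns a SciPy sparse matrix.
encoded = OneHotEncoder().fit_transform(colors)
print(sparse.issparse(encoded))   # True

# toarray() materialises the sparse matrix as a dense NumPy array.
# Columns follow the sorted categories: "green", then "red".
dense = encoded.toarray()
print(dense)
```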
Note :
to_numpy is used by pandas, whereas toarray is used by SciPy. This is a common source of confusion for many of us.
These are only a few methods for converting a Python data structure to a NumPy array. There are a plethora of alternative techniques that remain unexplored; you may read the official NumPy documentation to get a bird's-eye perspective of all the numerous methods available.
Conclusion:
In this blog, we learned how to convert from:
- a list/tuple to a NumPy array using the np.asarray() or np.array() methods.
- a set to a NumPy array using np.array(list(myset)) or numpy.fromiter().
- a string to a NumPy array using the np.fromstring(mystr, dtype=int, sep=' ') function.
- a Python dictionary to a NumPy array using np.array() together with a list conversion.
- a pandas DataFrame to a NumPy array using the df.to_numpy() method.
- a TensorFlow TensorProto to a NumPy array using tf.make_ndarray(existing_proto_tensor).
- a PyTorch Tensor to a NumPy array using the existing_tensor.numpy() method.
- a SciPy sparse matrix to a NumPy array using the existing_sparse_matrix.toarray() method.