Truncating and Deleting in DataFrames
Overview
The Pandas truncate() function trims a Series or a DataFrame by removing everything before and/or after the index values we specify, and it returns an object of the same type. If we do not want some rows or columns to be present in a DataFrame, we can remove them: the del keyword and the Pandas pop() method delete columns, while the Pandas drop() method deletes rows or columns. By default, drop() returns a new DataFrame; with inplace=True it updates the invoked DataFrame instead.
Introduction
Before learning how truncating and deleting columns work in Pandas, let us first get a brief introduction to the Pandas module.
Pandas is a highly optimized Python library, built on top of NumPy, that provides data structures such as DataFrame and Series along with tools for analyzing large sets of tabular data efficiently.
Truncating means cutting a Series or DataFrame down to a range of index values by discarding the rows (or columns) that fall outside that range. We will first learn how to truncate Series and DataFrames. After that, we will shift our focus to the deletion of columns and rows from a DataFrame.
Let us now see how truncating and deletion work in Pandas.
What is Truncating?
A DataFrame is a two-dimensional table in which each column is a Series. In simple terms, we can think of a DataFrame as a collection of Series that share the same index. If we want to keep only the part of a Series or DataFrame that lies between two index values, we can use the truncate() function provided by the Pandas library. Removing the data outside that range is known as truncating.
The truncate() Method
The Pandas library provides a function named truncate(), which removes the rows (or columns) that come before and/or after the index values of our choice.
Let us first see the syntax of the truncate() function.
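The call form can be sketched as below. It uses the keyword names listed in this article (before, after, axis); the DataFrame `df` and its `value` column are made up for illustration, and copy is left at its default.

```python
import pandas as pd

# Hypothetical data, used only to show the call form.
df = pd.DataFrame({"value": [10, 20, 30, 40, 50]})

# General form: obj.truncate(before=..., after=..., axis=...)
# obj may be a Series or a DataFrame; every argument is optional.
trimmed = df.truncate(before=1, after=3, axis=0)

print(trimmed["value"].tolist())  # [20, 30, 40]
```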
Let us now learn about the various parameters taken by the truncate() function.
- before: The index value before which the rows are removed. Everything stored before this value is dropped, while the row at this value is kept. It is an optional parameter that accepts a number, label, or date. The default value is None, which means nothing is removed from the start.
- after: The index value after which the rows are removed. Everything stored after this value is dropped, while the row at this value is kept. It is an optional parameter that accepts a number, label, or date. The default value is None, which means nothing is removed from the end.
- axis: The axis to truncate. It is an optional parameter that accepts 0 or 'index' for rows and 1 or 'columns' for columns. By default, the rows (index) are truncated.
- copy: Specifies whether a copy of the truncated section is returned. It is an optional boolean parameter, and its default value is True. Newer Pandas releases are phasing this keyword out in favor of Copy-on-Write, so it is usually best left unset.
The Pandas truncate() function returns an object of the same type as the one it is called on: truncating a Series returns a Series, and truncating a DataFrame returns a DataFrame. The index being truncated must be sorted; otherwise, Pandas raises a ValueError.
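As a small sketch of the return type and the axis parameter, the snippet below truncates a Series by label and a DataFrame along its columns. The names `s` and `df` and their contents are hypothetical.

```python
import pandas as pd

# A Series with a sorted label index.
s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])
s_trimmed = s.truncate(before="b", after="d")  # Series in, Series out

# A DataFrame truncated along its columns instead of its rows.
df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6], "D": [7, 8]})
df_trimmed = df.truncate(before="B", after="C", axis="columns")

print(type(s_trimmed).__name__)      # Series
print(s_trimmed.index.tolist())      # ['b', 'c', 'd']
print(type(df_trimmed).__name__)     # DataFrame
print(df_trimmed.columns.tolist())   # ['B', 'C']
```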
Usage & Example
Let us take a sample DataFrame and try to truncate some data using the truncate() function.
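A minimal sketch of such an example follows. The sample data (the `name` and `score` columns and their values) is invented for illustration; only the before=3 and after=5 arguments come from the discussion here.

```python
import pandas as pd

# Hypothetical sample data with the default integer index 0..7.
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Cid", "Dee", "Eli", "Fay", "Gus", "Hal"],
        "score": [81, 67, 92, 74, 58, 88, 79, 95],
    }
)

# Keep only the rows whose index label lies between 3 and 5, inclusive.
result = df.truncate(before=3, after=5)

print(result.index.tolist())    # [3, 4, 5]
print(result["name"].tolist())  # ['Dee', 'Eli', 'Fay']
print(len(df))                  # 8, the original DataFrame is unchanged
```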
In the above example, because we passed 3 as the before value and 5 as the after value, all the data before row 3 and all the data after row 5 has been removed. Rows 3 and 5 themselves are kept, since both bounds are inclusive.
Similarly, we can use the truncate() function to trim a Series, or to trim the columns of a DataFrame by passing axis='columns'.
How to Delete Rows and Columns in a DataFrame?
There are numerous situations in which a DataFrame holds a large set of data and we do not want some of its rows or columns. In such situations, we can use the del keyword or the Pandas pop() method to delete unwanted columns, or the Pandas drop() method to delete unwanted rows or columns.
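Of these options, pop() differs in that it hands back the column it removes. A minimal sketch, assuming a made-up DataFrame with `name`, `age`, and `city` columns:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame(
    {"name": ["Ann", "Bob"], "age": [31, 27], "city": ["Oslo", "Rome"]}
)

# pop() removes the column from df in place and returns it as a Series.
ages = df.pop("age")

print(ages.tolist())        # [31, 27]
print(df.columns.tolist())  # ['name', 'city']
```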
Using the del Keyword
We can use Python's del keyword to delete a column. We index the DataFrame with the name of the column that we want to delete and put del in front of it. Note that del works only on columns; to delete rows, use drop(). Let us see an example for more clarity.
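A short sketch of deleting a column with del; the DataFrame and its column names are hypothetical.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame(
    {"name": ["Ann", "Bob"], "age": [31, 27], "city": ["Oslo", "Rome"]}
)

# Delete the 'age' column in place; nothing is returned.
del df["age"]

print(df.columns.tolist())  # ['name', 'city']
```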
Using drop()
The drop() method removes the rows or columns whose index labels or column names we pass to it.
Let us first see the syntax of the drop() function.
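The call form can be sketched as follows. The keyword names are the ones documented for DataFrame.drop; the DataFrame `df` and its labels are made up for illustration.

```python
import pandas as pd

# Hypothetical data with string row labels.
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=["x", "y", "z"])

# General form:
#   df.drop(labels=None, axis=0, index=None, columns=None,
#           level=None, inplace=False, errors="raise")

# Two equivalent ways to drop the row labelled "y".
by_axis = df.drop(labels="y", axis=0)
by_index = df.drop(index="y")

# Two equivalent ways to drop the column "B".
col_by_axis = df.drop(labels="B", axis=1)
col_by_name = df.drop(columns="B")

print(by_axis.equals(by_index))          # True
print(col_by_axis.equals(col_by_name))   # True
```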
Let us now learn about the various parameters taken by the drop() function.
- labels: A single label or a list of labels naming the rows or columns to be deleted. Whether they refer to rows or columns is decided by the axis parameter.
- axis: Specifies whether labels refers to rows or columns. The value 0 (or 'index') means rows, and the value 1 (or 'columns') means columns. The default value is 0.
- index or columns: Each of these parameters takes a single label or a list. They are an alternative to the labels and axis pair: index=labels is equivalent to labels, axis=0, and columns=labels is equivalent to labels, axis=1. They should not be combined with labels.
- level: The level parameter is used to specify the level from which the labels are removed in case a DataFrame has a multilevel index.
- inplace: It is a boolean parameter whose default value is False. If its value is True, the changes are made to the original DataFrame and the method returns None. In case of False, a new DataFrame is returned and the original is left unchanged.
- errors: This parameter accepts 'raise' (the default) or 'ignore'. With 'ignore', labels that do not exist in the DataFrame are skipped silently and only the existing labels are dropped; with 'raise', a missing label raises a KeyError.
The function returns a new DataFrame with the specified labels removed. If inplace=True, the invoked DataFrame is modified instead and None is returned.
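The effect of inplace and errors on the return value can be sketched like this. The DataFrame and the missing `salary` column are hypothetical.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame(
    {"name": ["Ann", "Bob"], "age": [31, 27], "city": ["Oslo", "Rome"]}
)

# Default (inplace=False): a new DataFrame comes back, df is untouched.
without_age = df.drop(columns="age")
print(df.columns.tolist())           # ['name', 'age', 'city']
print(without_age.columns.tolist())  # ['name', 'city']

# inplace=True: df itself is modified and the call returns None.
returned = df.drop(columns="city", inplace=True)
print(returned)                      # None
print(df.columns.tolist())           # ['name', 'age']

# errors="ignore" silently skips labels that do not exist.
same = df.drop(columns="salary", errors="ignore")

# The default errors="raise" raises KeyError for a missing label.
try:
    df.drop(columns="salary")
    raised = False
except KeyError:
    raised = True
print(raised)                        # True
```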
Let us see some examples for more clarity.
Deleting Columns
Let us remove a column from the DataFrame.
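A minimal sketch of column deletion with drop(), using invented data. It shows a single column, several columns, and the equivalent labels-plus-axis spelling.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Cid"],
        "age": [31, 27, 45],
        "city": ["Oslo", "Rome", "Lima"],
        "score": [81, 67, 92],
    }
)

# Drop one column by name.
one_less = df.drop(columns="age")

# Drop several columns at once by passing a list.
two_less = df.drop(columns=["age", "city"])

# The same result written with labels and axis=1.
same_result = df.drop(["age", "city"], axis=1)

print(one_less.columns.tolist())   # ['name', 'city', 'score']
print(two_less.columns.tolist())   # ['name', 'score']
print(two_less.equals(same_result))  # True
```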
Deleting Rows
Let us remove a row from the DataFrame.
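A minimal sketch of row deletion with drop(), again on invented data. Rows are identified by their index labels; to drop by position, the label can be looked up through df.index first.

```python
import pandas as pd

# Hypothetical data with string row labels.
df = pd.DataFrame(
    {"name": ["Ann", "Bob", "Cid", "Dee"], "score": [81, 67, 92, 74]},
    index=["a", "b", "c", "d"],
)

# Drop a single row by its index label.
no_b = df.drop(index="b")

# Drop several rows at once.
no_a_d = df.drop(index=["a", "d"])

# Drop the first row by position, via its label.
no_first = df.drop(index=df.index[0])

print(no_b.index.tolist())      # ['a', 'c', 'd']
print(no_a_d.index.tolist())    # ['b', 'c']
print(no_first.index.tolist())  # ['b', 'c', 'd']
```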
Conclusion
- Pandas is an open-source library that provides us with highly optimized data structures and data analysis tools, which makes it useful for working productively with large sets of tabular data.
- If we want to trim a Series or a DataFrame down to a range of index values, we can use the truncate() function provided by the Pandas library. Removing the data before and/or after the given index values is known as truncating. The Pandas truncate() function returns a Series or a DataFrame, matching the type it was called on.
- If we do not want some rows or columns to be present in a DataFrame, we can use the del keyword or the Pandas pop() method to delete unwanted columns, or the Pandas drop() method to delete unwanted rows or columns.
- The Pandas drop() function returns a new DataFrame by default. With inplace=True, it updates the invoked DataFrame and returns None.
- The parameters of the truncate() function are before, after, axis, and copy.
- The parameters of the drop() function are labels, axis, index, columns, level, inplace, and errors.