What is the Pandas melt() Function?
Before learning about the Pandas melt function, let us first get a brief introduction to the Pandas module.
An Introduction to the Pandas Library
Pandas is an open-source (free to use) library built on top of another very useful Python library, NumPy. It is widely used in data science, machine learning, and data analytics because it simplifies importing and analyzing data. The main reasons for its popularity are how easily it imports data and how easily it lets us analyze that data. Pandas is also fast, which makes it very handy for productive work.
Pandas melt() Function
Now a question comes to mind: what is the Pandas melt() function?
Well, the Pandas melt() function reshapes tabular data so that it is easier to view and analyze. If we want to un-pivot a DataFrame from the wide format to the long format, we can use the Pandas melt() function.
The Pandas melt() function massages the DataFrame into a format where one or more columns are treated as identifier variables. The rest of the columns are considered measured variables. The measured variables are un-pivoted to the row axis, leaving just two non-identifier columns, which are named variable and value by default.
In simpler terms, we can say that the Pandas melt() function reshapes our DataFrame into a long table. The table contains one row for each combination of an original row and a measured column. Hence, a one-line definition of the Pandas melt function can be: it is a function that creates a DataFrame in which one or more columns work as identifiers and the rest of the columns are un-pivoted into variable and value pairs.
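To make the wide-to-long idea concrete, here is a minimal sketch. The frame and its column names (City, Jan, Feb) are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical wide table: one row per city, one column per month
wide = pd.DataFrame({
    "City": ["Pune", "Oslo"],
    "Jan": [24, -3],
    "Feb": [26, -2],
})

# City is the identifier; Jan and Feb become the measured variables
long = pd.melt(wide, id_vars=["City"])

# Each original row contributes one output row per measured column
n_measured = len(wide.columns) - 1
print(len(long) == len(wide) * n_measured)
print(list(long.columns))
```

With two rows and two measured columns, the long table has four rows, and the month names move into the variable column.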
Now, what can we do to get back the original DataFrame? Well, we can use the pivot() function of the Pandas module to reverse the reshaping, as long as each combination of identifier and variable appears only once in the melted DataFrame.
Syntax
The syntax of the Pandas melt function is quite simple.
The syntax is:
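Going by the parameters and defaults described in the next section, the signature can be sketched as below. The small frame used in the sample call is hypothetical.

```python
import pandas as pd

# Signature, with the defaults listed in the Parameters section:
# pandas.melt(frame, id_vars=None, value_vars=None, var_name=None,
#             value_name='value', col_level=None, ignore_index=True)

# Hypothetical frame, used only to show a call with keyword arguments
frame = pd.DataFrame({"id": [1, 2], "x": [10, 20], "y": [30, 40]})

long_frame = pd.melt(
    frame,
    id_vars=["id"],          # column(s) kept as identifiers
    value_vars=["x", "y"],   # column(s) to un-pivot
)
print(long_frame)
```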
Let us now learn about the arguments of the Pandas melt function in detail in the next section.
Parameters
Let us discuss the various parameters involved.
- frame: It is a required parameter, and it denotes the DataFrame to reshape.
- id_vars: It is an optional parameter that can be a tuple, list, or ndarray, and it names the column(s) to use as identifier variables. Its default value is None.
- value_vars: It is an optional parameter that can be a tuple, list, or ndarray, and it names the column(s) to un-pivot. If we do not specify the value_vars parameter, then all columns that are not set as id_vars are used. Its default value is None.
- var_name: It is an optional parameter, and it is used to specify the name of the variable column. If its value is None, then frame.columns.name is used if it is set, and 'variable' otherwise. It is usually a string.
- value_name: It is an optional parameter. The value_name parameter is used to specify the name of the value column. Its default value is 'value'.
- col_level: It is an optional parameter that can be an integer or a string, and it is used when the columns are multi-indexed. In the case of multi-indexed columns, the specified level is used to melt.
- ignore_index: It is also an optional parameter that can be True or False, and it specifies whether Pandas ignores the original index. If it is True, the result gets a fresh index; if it is False, the original index is retained and its labels repeat. Its default value is True.
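As a minimal sketch of how the optional parameters change the result, the example below sets var_name, value_name, and ignore_index. The frame and the names Attribute and Reading are hypothetical.

```python
import pandas as pd

# Hypothetical student data used for illustration
df = pd.DataFrame({
    "Name": ["Asha", "Ben", "Chen"],
    "ID": [101, 102, 103],
    "Section": ["A", "B", "A"],
})

custom = pd.melt(
    df,
    id_vars=["Name"],
    value_vars=["ID", "Section"],
    var_name="Attribute",    # replaces the default 'variable' column name
    value_name="Reading",    # replaces the default 'value' column name
    ignore_index=False,      # keep the original row labels (pandas 1.1 or later)
)

print(list(custom.columns))
print(custom.index.tolist())
```

Because ignore_index is False, the original labels 0, 1, 2 appear once for each un-pivoted column instead of being replaced by a fresh index.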
Return Value
The Pandas melt function returns a new, reshaped DataFrame built from the provided DataFrame. One thing we should keep in mind is that the Pandas melt function does not modify the original DataFrame.
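A quick way to check this behaviour is to compare the input before and after the call. This is a small sketch on hypothetical data.

```python
import pandas as pd

# Hypothetical student data used for illustration
df = pd.DataFrame({
    "Name": ["Asha", "Ben", "Chen"],
    "ID": [101, 102, 103],
    "Section": ["A", "B", "A"],
})

snapshot = df.copy()                     # copy taken before melting
melted = pd.melt(df, id_vars=["Name"])   # returns a new DataFrame

print(df.equals(snapshot))   # the input frame still matches the copy
print(melted is df)          # the result is a separate object
```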
Examples
Let us take some examples to understand the working of the Pandas melt function for more clarity.
Create a Simple DataFrame
Let us first create a simple DataFrame.
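As a minimal sketch, assume a small table of students with Name, ID, and Section columns. The ID column and all of the values are hypothetical and serve only as sample data for the examples in this section.

```python
import pandas as pd

# Hypothetical student data; the values are made up for illustration
df = pd.DataFrame({
    "Name": ["Asha", "Ben", "Chen"],
    "ID": [101, 102, 103],
    "Section": ["A", "B", "A"],
})

print(df)
```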
Output:
Let us now set the Name as id_vars and Section as value_vars.
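A sketch of this call, using a hypothetical frame with Name, ID, and Section columns (the sample values are made up):

```python
import pandas as pd

# Hypothetical student data used for illustration
df = pd.DataFrame({
    "Name": ["Asha", "Ben", "Chen"],
    "ID": [101, 102, 103],
    "Section": ["A", "B", "A"],
})

# Name identifies each row; only Section is un-pivoted
melted = pd.melt(df, id_vars=["Name"], value_vars=["Section"])

print(melted)
```

The result has the columns Name, variable, and value, with one row per student because only one column was un-pivoted.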
Output:
Multiple Unpivot Columns
In the above example, we set a single unpivot column. Let us now pass multiple unpivot columns.
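A sketch with two columns in value_vars, again on hypothetical data:

```python
import pandas as pd

# Hypothetical student data used for illustration
df = pd.DataFrame({
    "Name": ["Asha", "Ben", "Chen"],
    "ID": [101, 102, 103],
    "Section": ["A", "B", "A"],
})

# Both ID and Section are un-pivoted, so each student appears twice
melted = pd.melt(df, id_vars=["Name"], value_vars=["ID", "Section"])

print(melted)
```

The rows for ID come first, followed by the rows for Section, which gives six rows in total.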
Output:
Skipping Columns in melt() Function
We can even skip a column. When value_vars is given, any column that is listed in neither id_vars nor value_vars is left out of the result. Let us see how.
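In the sketch below, which uses hypothetical data, ID is the identifier and Section is un-pivoted, so the Name column is skipped:

```python
import pandas as pd

# Hypothetical student data used for illustration
df = pd.DataFrame({
    "Name": ["Asha", "Ben", "Chen"],
    "ID": [101, 102, 103],
    "Section": ["A", "B", "A"],
})

# Name is in neither id_vars nor value_vars, so it does not appear in the result
melted = pd.melt(df, id_vars=["ID"], value_vars=["Section"])

print(melted)
```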
Output:
Unmelting DataFrame Using pivot() Function
Let us now use the pivot() function and un-melt the DataFrame. In the pivot() function, the index parameter value is the same as the id_vars value. We also need to pass the variable column's name as the columns parameter and, if needed, the value column's name as the values parameter.
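A sketch of the round trip on hypothetical data. The reset_index() call and the clearing of the column-axis name are extra tidy-up steps added here so that the result has the same layout as the starting frame.

```python
import pandas as pd

# Hypothetical student data used for illustration
df = pd.DataFrame({
    "Name": ["Asha", "Ben", "Chen"],
    "ID": [101, 102, 103],
    "Section": ["A", "B", "A"],
})

melted = pd.melt(df, id_vars=["Name"], value_vars=["ID", "Section"])

# index matches id_vars; columns and values name the two melted columns
wide = melted.pivot(index="Name", columns="variable", values="value")

# Turn Name back into a regular column and drop the 'variable' axis label
restored = wide.reset_index()
restored.columns.name = None

# ID and Section shared one value column, so they come back as object dtype
print(restored)
```

Note that pivot() sorts the index and the column labels, so the row and column order matches the original here only because the sample data is already sorted.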
Output:
Conclusion
- The Pandas melt() function reshapes tabular data from the wide format to the long format so that it can be easily viewed and analyzed.
- The Pandas melt() function massages the DataFrame into a format where one or more columns are treated as identifier variables and the rest are un-pivoted into variable and value columns.
- The Pandas melt() function reshapes our DataFrame into a long table. The table contains one row for each combination of an original row and a measured column.
- The Pandas melt function returns a new, reshaped DataFrame built from the provided DataFrame. One thing we should keep in mind is that the Pandas melt function does not modify the original DataFrame.
- The pivot() function is used to un-melt the DataFrame. In the pivot() function, the index parameter value is the same as the id_vars value. We also need to pass the variable column's name as the columns parameter.