How to Calculate Correlation in NumPy?


Python's NumPy package lets us perform complex, computationally intensive calculations efficiently. It provides many functions for creating and manipulating arrays. One such function is the NumPy corrcoef() function.

The NumPy corrcoef() function is used to calculate the Pearson correlation coefficient between two one-dimensional arrays. The correlation coefficient is a numerical value between -1 and 1 that indicates the strength and direction of the linear relationship between the specified features of the dataset.
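For reference, the Pearson product-moment coefficient that corrcoef() reports can be written as follows. The symbols are our own notation, not taken from the article: x_i and y_i are the paired observations, n is the number of pairs, and the barred symbols are the sample means.

```latex
r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
              {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,
               \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad -1 \le r_{xy} \le 1
```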

When two features are positively correlated, they tend to move in the same direction: higher values of one are associated with higher values of the other. A negative correlation is also possible, indicating that the two features are inversely related, so that higher values of one are associated with lower values of the other. Correlation describes association and does not by itself show that one feature causes the other.
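As a small illustration of the two cases, the sketch below builds one array that rises with x and one that falls with x. The data and the names y_up and y_down are made up for this example.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_up = 2.0 * x + 1.0      # grows as x grows
y_down = 10.0 - 3.0 * x   # shrinks as x grows

# corrcoef() returns a 2x2 matrix; entry [0, 1] is the coefficient
# between the first and the second input.
r_up = np.corrcoef(x, y_up)[0, 1]
r_down = np.corrcoef(x, y_down)[0, 1]

print("rising pair: ", r_up)
print("falling pair:", r_down)
```

Because both relationships are exactly linear, the first coefficient comes out at (or within rounding error of) 1 and the second at -1.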

Syntax

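The syntax for the NumPy corrcoef() function to find the correlation coefficient is:

numpy.corrcoef(x, y=None, rowvar=True, dtype=None)

Parameters

The parameters that NumPy corrcoef() takes in are:

  • x: This mandatory parameter is a 1-D or 2-D array containing the variables and their observations. By default, each row represents a variable and each column represents a single observation.
  • y: This optional parameter is an additional set of variables and observations, given in the same form as x.
  • rowvar: This optional parameter controls the orientation of the input. If True (the default), each row represents a variable; if False, each column represents a variable.
  • dtype: This optional, keyword-only parameter sets the data type of the result.

The function also accepts bias and ddof, but both are deprecated and have no effect. The mode parameter (valid, same, or full, with valid as the default) belongs to NumPy correlate(), not to corrcoef().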
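corrcoef() also accepts a single 2-D array, and its rowvar argument decides whether rows or columns are treated as variables. The following is a minimal sketch with made-up data, where data holds three variables with five observations each.

```python
import numpy as np

# Three variables (rows), five observations each (columns).
data = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 5.0, 4.0, 5.0],
    [5.0, 4.0, 3.0, 2.0, 1.0],
])

# Default rowvar=True: each row is a variable.
by_rows = np.corrcoef(data)

# Same data transposed, so each column is a variable.
by_cols = np.corrcoef(data.T, rowvar=False)

print(by_rows.shape)   # one row and one column per variable
print(np.allclose(by_rows, by_cols))
```

Both calls describe the same three variables, so they yield the same 3×3 matrix.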

Return Type

The NumPy corrcoef() function returns a NumPy array containing the correlation coefficient matrix of the input variables. For two one-dimensional inputs, this is a 2×2 matrix whose off-diagonal entries hold the correlation between the two arrays.
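To make the shape of the return value concrete, here is a short sketch with two made-up arrays. The name matrix is ours.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

matrix = np.corrcoef(x, y)

print(matrix.shape)   # (2, 2)
print(matrix)

# The diagonal holds each array's correlation with itself (1.0);
# the two off-diagonal entries are equal and hold the x-y coefficient.
r = matrix[0, 1]
```

For this data the off-diagonal value is about 0.77, so in practice the single number of interest is usually read with matrix[0, 1].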

Examples

  • Use of NumPy corrcoef()

Three commonly used correlation coefficients are Pearson, Kendall, and Spearman. The NumPy corrcoef() function computes only Pearson's correlation. In this example, we will compute the correlation coefficient of two arrays using the NumPy corrcoef() function:
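The article's original listing is not available here, so the following is a stand-in sketch with made-up data. It also computes Pearson's coefficient by hand (covariance divided by the product of the standard deviations) to check that corrcoef() gives the same number.

```python
import numpy as np

x = np.array([2.1, 2.5, 3.6, 4.0, 5.2])
y = np.array([8.0, 10.0, 12.0, 14.0, 17.0])

# Coefficient from NumPy.
r_numpy = np.corrcoef(x, y)[0, 1]

# Pearson's r computed directly from its definition.
dx = x - x.mean()
dy = y - y.mean()
r_manual = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

print("corrcoef:", r_numpy)
print("manual:  ", r_manual)
```

The two printed values agree up to floating-point rounding.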

Output

  • Computing correlation using NumPy correlate()

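Another correlation-related function is NumPy correlate(), which computes the cross-correlation of two one-dimensional sequences. Unlike corrcoef(), its result is not normalized to the range -1 to 1. An optional mode parameter (valid, same, or full, with valid as the default) controls the length of the output. Cross-correlation is widely used in signal processing.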
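A minimal sketch of correlate() with two short made-up sequences, run in each of its three modes:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
v = np.array([0.0, 1.0, 0.5])

# 'valid' (default): only positions where the sequences fully overlap.
valid = np.correlate(a, v)
# 'same': output has the same length as the longer input.
same = np.correlate(a, v, mode="same")
# 'full': every possible overlap, length len(a) + len(v) - 1.
full = np.correlate(a, v, mode="full")

print(valid)  # [3.5]
print(same)   # [2.  3.5 3. ]
print(full)   # [0.5 2.  3.5 3.  0. ]
```

These values are raw sums of products rather than coefficients between -1 and 1, which is why correlate() and corrcoef() are not interchangeable.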

Output

  • Using Correlation with Matplotlib and Making Correlation Graphs

Correlation is easier to understand when we visualize it. Two arrays have a positive correlation when their values tend to increase together, and a negative correlation when one tends to decrease as the other increases.

Using the matplotlib library, we'll visualize the correlation and find out whether it is positive or negative.
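The original plotting code is not available here, so the sketch below is a stand-in. The synthetic data, the fixed random seed, the fitted line, and the output file name correlation.png are all our own assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data: y rises with x, plus some random noise.
rng = np.random.default_rng(seed=0)
x = np.arange(50, dtype=float)
y = 2.0 * x + rng.normal(loc=0.0, scale=10.0, size=x.size)

# Pearson coefficient and a least-squares line through the points.
r = np.corrcoef(x, y)[0, 1]
slope, intercept = np.polyfit(x, y, deg=1)

fig, ax = plt.subplots()
ax.scatter(x, y, label="data points")
ax.plot(x, slope * x + intercept, color="red", label="least-squares line")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title(f"Pearson r = {r:.2f}")
ax.legend()
fig.savefig("correlation.png")
```

A scatter plot is used because the sign of the correlation shows up directly as the direction in which the cloud of points slopes.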

Output

Figure: Using Correlation with Matplotlib and Making Correlation Graphs

As we can see, the graph has an upward trajectory. This means that we have a positive correlation between our data points.

Conclusion

  • In this article, we learned about NumPy corrcoef(), a function used to calculate the correlation coefficient between two sets of one-dimensional data.
  • Three commonly used types of correlation are Pearson, Spearman, and Kendall. NumPy corrcoef() computes the Pearson correlation.
  • We also looked at NumPy correlate(), a separate function that gives us the cross-correlation of two one-dimensional arrays.
  • To further cement our knowledge of NumPy corrcoef(), we looked at several examples that use it.