In pandas, the DataFrame `corrwith()` method computes the pairwise correlation between the rows or columns of a DataFrame and the matching rows or columns of another DataFrame or Series. It is particularly useful when you want to compare two datasets by measuring the correlation of their corresponding rows or columns.

In this article, I will explain the Pandas DataFrame `corrwith()` method, covering its syntax, parameters, and usage, and how it returns a Series containing the correlation coefficients. This makes it easy to interpret the degree of correlation between corresponding columns or rows of the input DataFrames.

**Key Points –**

- The `corrwith()` method computes the pairwise correlation between matching rows or columns of two DataFrame objects, returning a Series of correlation coefficients.
- The `drop` parameter can be set to `True` to leave out of the result any labels that are not present in both objects. With the default `False`, those labels appear in the result as `NaN`.
- `corrwith()` can also compute the correlation of each column or row in a DataFrame with a given Series, offering flexibility in comparing datasets.
- When working with large datasets, be mindful of performance, as computing correlations can be computationally intensive, especially with the `kendall` and `spearman` methods.

## Pandas DataFrame corrwith() Introduction

Following is the syntax of the Pandas DataFrame `corrwith()` method.

```
# Syntax of Pandas DataFrame corrwith()
DataFrame.corrwith(other, axis=0, drop=False, method='pearson', numeric_only=False)
```

### Parameters of the DataFrame corrwith()

Following are the parameters of the DataFrame `corrwith()` method.

- `other` – DataFrame or Series. The object to compute the correlation with.
- `axis` – {0 or 'index', 1 or 'columns'}, default 0.
  - If `0` or `'index'`, compute the correlation column-wise (one coefficient per column).
  - If `1` or `'columns'`, compute the correlation row-wise (one coefficient per row).
- `drop` – bool, default False. If `True`, labels that are not present in both objects are dropped from the result; otherwise they appear with a value of `NaN`.
- `method` – {'pearson', 'kendall', 'spearman'} or a callable, default 'pearson'.
  - `pearson` – Standard correlation coefficient.
  - `kendall` – Kendall Tau correlation coefficient.
  - `spearman` – Spearman rank correlation.
- `numeric_only` – bool, default False. If `True`, include only float, int, or boolean data (available in recent pandas versions).

### Return Value

It returns a Series of correlation coefficients, indexed by column labels when `axis=0` and by row labels when `axis=1`.
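A minimal sketch of what the returned Series looks like, using small made-up DataFrames (`left` and `right` are hypothetical names and values, not taken from this article):

```python
import pandas as pd

# Hypothetical sample data with matching row and column labels
left = pd.DataFrame({"x": [1, 2, 3, 4], "y": [4, 3, 2, 1], "z": [1, 3, 2, 5]})
right = pd.DataFrame({"x": [2, 4, 6, 8], "y": [1, 2, 3, 4], "z": [5, 1, 4, 2]})

by_column = left.corrwith(right)        # one coefficient per column label
by_row = left.corrwith(right, axis=1)   # one coefficient per row label

print(type(by_column).__name__)  # Series
print(list(by_column.index))     # ['x', 'y', 'z']
print(list(by_row.index))        # [0, 1, 2, 3]
```

The index of the result tells you which axis was compared: column labels for the default `axis=0`, row labels for `axis=1`.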

## Usage of Pandas DataFrame corrwith() Method

The `pandas.DataFrame.corrwith()` method computes pairwise correlation between rows or columns of two DataFrame objects, or between a DataFrame and a Series. It is useful in scenarios such as data analysis, feature selection, and anomaly detection.

To run some examples of the pandas DataFrame `corrwith()` method, let's create two Pandas DataFrames from Python dictionaries, each with columns `A`, `B`, and `C`.

```
# Create DataFrame
import pandas as pd
import numpy as np
df = pd.DataFrame({'A':[5, 10, 15, 20], 'B': [2, 4, 6, 8], 'C': [3, 5, 7, 9]})
print("Create first DataFrame:\n",df)
df1 = pd.DataFrame({'A':[2, 4, 6, 8], 'B': [5, 7, 9, 11], 'C': [15, 3, 12, 8]})
print("Create Second DataFrame:\n",df1)
```

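Both DataFrames have four rows, the default integer index, and the same column labels, so every row and column in `df` has a counterpart in `df1`.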

To compute the correlation between corresponding columns of two DataFrames, you can use the `corrwith()` method.

```
# Correlation between corresponding columns
column_correlation = df.corrwith(df1)
print("Column-wise correlation:\n", column_correlation)
```

In the above example, the `corrwith()` method returns a Series with one correlation coefficient for each corresponding column in `df` and `df1`. Columns `A` and `B` are perfectly correlated (1.0), while column `C` has a weak negative correlation of about -0.298.
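One way to sanity-check the column-wise result is to correlate each pair of columns individually with `Series.corr()`. This is only a sketch for verification; the name `manual` is introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [5, 10, 15, 20], 'B': [2, 4, 6, 8], 'C': [3, 5, 7, 9]})
df1 = pd.DataFrame({'A': [2, 4, 6, 8], 'B': [5, 7, 9, 11], 'C': [15, 3, 12, 8]})

# Correlate each shared column one at a time (Pearson by default)
manual = pd.Series({col: df[col].corr(df1[col]) for col in df.columns})

print(manual.round(6))
print(df.corrwith(df1).round(6))
```

Both printed Series should agree, which shows that `corrwith()` is a shortcut for pairing up same-named columns and correlating them.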

## Using Row-wise Correlation

Alternatively, to calculate the row-wise correlation between two DataFrames in pandas, you can use the `corrwith()` method with `axis=1`. This approach computes the correlation between corresponding rows across the DataFrames.

```
# Compute row-wise correlation
row_correlation = df.corrwith(df1, axis=1)
print("Row-wise correlation:\n", row_correlation)
# Output:
# Row-wise correlation:
# 0 -0.400732
# 1 -0.423415
# 2 -0.810885
# 3 -0.563621
# dtype: float64
```

Here,

- `df.corrwith(df1, axis=1)` calculates the correlation between corresponding rows of `df` and `df1`.
- The result, `row_correlation`, is a Series where each value represents the correlation coefficient between the corresponding rows of `df` and `df1`.
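Row-wise correlation can also be read as column-wise correlation on the transposed data. The sketch below compares the two approaches; `via_transpose` is a name introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [5, 10, 15, 20], 'B': [2, 4, 6, 8], 'C': [3, 5, 7, 9]})
df1 = pd.DataFrame({'A': [2, 4, 6, 8], 'B': [5, 7, 9, 11], 'C': [15, 3, 12, 8]})

row_wise = df.corrwith(df1, axis=1)
# Transposing turns rows into columns, so the default axis=0 applies
via_transpose = df.T.corrwith(df1.T)

print(row_wise.round(6))
print(via_transpose.round(6))
```

Note that each row here has only three values, so the coefficients are based on very little data and should be interpreted with care.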

## Using Kendall Tau Correlation

To compute the Kendall Tau correlation between corresponding rows or columns of two DataFrames in pandas, you can pass `method='kendall'` to the `corrwith()` method. Kendall Tau correlation is a measure of ordinal association between two measured quantities.

```
# Compute row-wise Kendall Tau correlation
kendall_correlation = df.corrwith(df1, axis=1, method='kendall')
print("Row-wise Kendall Tau correlation:\n", kendall_correlation)
# Output:
# Row-wise Kendall Tau correlation:
# 0 -0.333333
# 1 -0.333333
# 2 -0.333333
# 3 -0.816497
# dtype: float64
```

Here,

- `df.corrwith(df1, axis=1, method='kendall')` calculates the Kendall Tau correlation between corresponding rows of `df` and `df1`.
- The `method='kendall'` parameter specifies that Kendall Tau correlation should be used.
- The result, `kendall_correlation`, is a Series where each value represents the Kendall Tau correlation coefficient between the corresponding rows of `df` and `df1`.
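To see where a value such as -0.333333 comes from, here is a plain-Python sketch that counts concordant and discordant pairs for row 0 of the two DataFrames. It implements the simple, untied form of Kendall's Tau; the variable names are introduced here for illustration.

```python
from itertools import combinations

# Row 0 of df and df1 from the examples above
x = [5, 2, 3]
y = [2, 5, 15]

concordant = discordant = 0
for i, j in combinations(range(len(x)), 2):
    # Positive product: both values move in the same direction
    direction = (x[i] - x[j]) * (y[i] - y[j])
    if direction > 0:
        concordant += 1
    elif direction < 0:
        discordant += 1

n_pairs = len(x) * (len(x) - 1) // 2
tau = (concordant - discordant) / n_pairs
print(concordant, discordant, round(tau, 6))  # 1 2 -0.333333
```

This simple formula only matches when there are no ties. Row 3 of `df1` contains the value 8 twice, and its coefficient of -0.816497 is consistent with the tie-adjusted variant (tau-b).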

## Spearman Rank Correlation

To calculate the Spearman rank correlation between corresponding rows or columns of two DataFrames in pandas, you can pass `method='spearman'` to the `corrwith()` method. Spearman correlation evaluates the monotonic relationship between two variables and is based on the ranks of the data rather than the raw values.

```
# Compute column-wise Spearman rank correlation
spearman_correlation = df.corrwith(df1, method='spearman')
print("Column-wise Spearman rank correlation:\n", spearman_correlation)
# Output:
# Column-wise Spearman rank correlation:
# A 1.0
# B 1.0
# C -0.4
# dtype: float64
```

Here,

- `df.corrwith(df1, method='spearman')` calculates the Spearman rank correlation between corresponding columns of `df` and `df1`.
- The `method='spearman'` parameter specifies that Spearman rank correlation should be used.
- The result, `spearman_correlation`, is a Series where each value represents the Spearman rank correlation coefficient between the corresponding columns of `df` and `df1`.
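Because Spearman correlation is Pearson correlation applied to ranks, the same numbers can be reproduced by ranking both DataFrames first. The sketch below assumes only pandas is available; the built-in `'spearman'` and `'kendall'` options may additionally rely on SciPy being installed. The name `rank_based` is introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [5, 10, 15, 20], 'B': [2, 4, 6, 8], 'C': [3, 5, 7, 9]})
df1 = pd.DataFrame({'A': [2, 4, 6, 8], 'B': [5, 7, 9, 11], 'C': [15, 3, 12, 8]})

# Rank each column, then apply the default Pearson correlation
rank_based = df.rank().corrwith(df1.rank())
print(rank_based.round(6))
```

For column `C` the ranks are `[1, 2, 3, 4]` and `[4, 1, 3, 2]`, which gives the -0.4 shown in the output above.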

## Correlation with a Series

Similarly, to compute the correlation between each column of a DataFrame and a Series in pandas, you can use the `corrwith()` method. This allows you to assess how each column in the DataFrame relates to the values in the Series.

```
# Create DataFrame
import pandas as pd
import numpy as np
df = pd.DataFrame({'A':[5, 10, 15, 20], 'B': [2, 4, 6, 8], 'C': [3, 5, 7, 9]})
# Create a Series
ser = pd.Series([3, 7, 12, 8])
# Compute correlation with Series
series_correlation = df.corrwith(ser, axis=0)
print("Correlation with Series:\n", series_correlation)
# Output:
# Correlation with Series:
# A 0.69843
# B 0.69843
# C 0.69843
# dtype: float64
```

Here,

- `df.corrwith(ser, axis=0)` computes the correlation between each column of the DataFrame `df` and the Series `ser`.
- The `axis=0` parameter specifies that the correlation should be computed column-wise.
- `ser` is a Series with values `[3, 7, 12, 8]` and the default integer index, which matches the index of `df`.
- The result, `series_correlation`, is a Series where each value represents the correlation coefficient between the corresponding column of `df` and the Series `ser`. All three values are identical because columns `A`, `B`, and `C` are exact linear functions of one another.
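The values are paired by index label, not by position, so a Series whose index shares no labels with the DataFrame produces `NaN`. The following sketch illustrates this with a hypothetical Series `ser_shifted` whose index was chosen on purpose not to overlap with `df`:

```python
import pandas as pd

df = pd.DataFrame({'A': [5, 10, 15, 20], 'B': [2, 4, 6, 8], 'C': [3, 5, 7, 9]})

# Same values as before, but the index labels do not overlap with df (0..3)
ser_shifted = pd.Series([3, 7, 12, 8], index=[10, 11, 12, 13])
# Re-label the values so they line up with df's index
ser_aligned = pd.Series(ser_shifted.to_numpy(), index=df.index)

print(df.corrwith(ser_shifted))          # every entry is NaN
print(df.corrwith(ser_aligned).round(5))
```

If the Series comes from another source, checking that its index matches the DataFrame's index is a cheap way to avoid an unexpected all-`NaN` result.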

## Dropping Labels with Missing Data

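Finally, `corrwith()` skips missing values (NaN) pair by pair when it computes each coefficient, so DataFrames with gaps can be compared directly. The separate `drop` parameter controls what happens to labels that are not present in both objects: with `drop=True` they are removed from the result, and with the default `drop=False` they are kept with a value of `NaN`.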

```
import pandas as pd
import numpy as np
# Create a DataFrame with missing values
df = pd.DataFrame({'A': [5, 10, np.nan, 20], 'B': [2, 4, 6, 8], 'C': [3, np.nan, 7, 9]})
# Create another DataFrame
df1 = pd.DataFrame({'A': [2, 4, 6, 8], 'B': [5, 7, 9, 11], 'C': [15, 3, 12, 8]})
# Compute column-wise correlation with drop=True
# (NaN values are skipped pair by pair regardless of drop)
df2 = df.corrwith(df1, drop=True)
print("Column-wise correlation with drop=True:\n", df2)
# Output:
# Column-wise correlation with drop=True:
# A 1.000000
# B 1.000000
# C -0.963123
# dtype: float64
```

Here,

- `df` is a DataFrame with missing values (`np.nan`).
- `df1` is another DataFrame without missing values.
- `df.corrwith(df1, drop=True)` computes the column-wise correlation between `df` and `df1`. Rows where either value is NaN are skipped for that column only, so `A` and `C` are each computed from three pairs of values.
- The `drop=True` parameter removes labels that are not shared by both `df` and `df1` from the result. Both DataFrames here have the same columns, so nothing is dropped.
- The result, `df2`, is a Series where each value represents the correlation coefficient between the corresponding columns of `df` and `df1`.
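As far as the pandas documentation describes it, `drop` only affects labels that are missing from one of the two objects. A minimal sketch with made-up DataFrames (`left`, `right`, and the columns `X` and `Y` are hypothetical) shows the difference:

```python
import pandas as pd

# Columns A and B are shared; X exists only in left, Y only in right
left = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [2, 4, 6, 8], 'X': [9, 7, 5, 3]})
right = pd.DataFrame({'A': [2, 4, 6, 8], 'B': [8, 6, 4, 2], 'Y': [1, 3, 5, 7]})

kept = left.corrwith(right)                # default drop=False
dropped = left.corrwith(right, drop=True)

print(kept)     # A and B, plus NaN entries for X and Y
print(dropped)  # only A and B
```

Use `drop=True` when you only care about labels both objects have in common, and keep the default when the `NaN` entries are a useful signal that something did not match.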

## Frequently Asked Questions on Pandas DataFrame corrwith() Method

**What does the corrwith() method do in Pandas?**

The `corrwith()` method computes the correlation coefficients between corresponding columns or rows of two DataFrames or between a DataFrame and a Series.

**How do you use corrwith() in Pandas?**

You can use `corrwith()` by calling it on a DataFrame and passing another DataFrame or Series as an argument. It calculates correlations either column-wise (`axis=0`) or row-wise (`axis=1`) based on your choice.

**How does drop=True work in corrwith()?**

Setting `drop=True` in `corrwith()` removes from the result any labels (rows or columns) that are not present in both objects. With the default `drop=False`, those labels are kept in the result with a value of `NaN`.

**Can corrwith() handle missing data?**

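Yes. `corrwith()` skips missing values (NaN) pair by pair when calculating each correlation, so no extra argument is needed. The `drop=True` option is separate: it only removes labels that the two objects do not share from the result.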
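A short sketch of the NaN handling, assuming the single-column data below: the coefficient from `corrwith()` should match the one computed after removing the incomplete row by hand. The names `with_nan`, `mask`, and `manual` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'C': [3, np.nan, 7, 9]})
df1 = pd.DataFrame({'C': [15, 3, 12, 8]})

# corrwith() ignores the pair that contains NaN
with_nan = df.corrwith(df1)['C']

# Same calculation after removing the incomplete row manually
mask = df['C'].notna()
manual = df.loc[mask, 'C'].corr(df1.loc[mask, 'C'])

print(round(with_nan, 6), round(manual, 6))  # both are about -0.963123
```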

**What does the return value of corrwith() represent?**

The return value of `corrwith()` is a Series containing the correlation coefficients between corresponding columns or rows of the input DataFrames or between a DataFrame and a Series.

## Conclusion

In this article, I have explained the Pandas DataFrame `corrwith()` method, covering its syntax, parameters, and usage, and how it returns a Series. This Series contains the correlation coefficients between the corresponding columns or rows of the input DataFrames or between the DataFrame and the Series.

Happy Learning!!
