In Pandas, the dot() method performs matrix multiplication (the dot product) between a DataFrame or Series and another DataFrame, Series, or array-like object. It is commonly used for vector and matrix arithmetic.
In this article, I will explain the Pandas DataFrame dot() method, including its syntax, parameters, and usage for performing matrix multiplication with various objects, such as a Series, another DataFrame, or a NumPy array. This method computes the matrix product and returns either a Series or a DataFrame, depending on the inputs provided.
Key Points –
- The dot() method is used to perform matrix multiplication between two DataFrames or between a DataFrame and another array-like object.
- It can compute the dot product between a DataFrame and a Series or between two Series, often used for vector-based operations.
- For matrix multiplication between DataFrames, the number of columns in the first DataFrame must equal the number of rows in the second DataFrame, and the column labels of the first must match the index labels of the second.
- dot() is used for efficient linear algebra operations, especially for tasks like machine learning and solving systems of equations.
- Unlike arithmetic operators such as *, dot() performs matrix multiplication, not element-wise multiplication.
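To make the last point concrete, here is a minimal sketch contrasting the * operator with dot(). The DataFrames a and b are illustrative values chosen for this comparison and are not used elsewhere in the article.

```python
import pandas as pd

# Two small 2x2 DataFrames with default integer labels (illustrative values)
a = pd.DataFrame([[1, 2], [3, 4]])
b = pd.DataFrame([[5, 6], [7, 8]])

# Element-wise product: each cell is multiplied by the cell in the same position
elementwise = a * b

# Matrix product: each row of `a` is combined with each column of `b`
matrix = a.dot(b)

print("Element-wise:\n", elementwise)
print("Matrix product:\n", matrix)
```

The element-wise result holds 5, 12, 21, 32, while the matrix product holds 19, 22, 43, 50, so the two operations are not interchangeable.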
Pandas DataFrame dot() Introduction
Following is the syntax of the Pandas DataFrame dot() method.
Syntax of Pandas DataFrame dot() method
DataFrame.dot(other)
Parameters of the DataFrame dot()
It allows only one parameter.
other – The other object (DataFrame, Series, or array-like) to perform the matrix multiplication or dot product with.
Return Value
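It returns a Series or a DataFrame, depending on the dimensions of the other object: a DataFrame or 2-D array produces a DataFrame, while a Series or 1-D array produces a Series. The related Series.dot() method returns a scalar when both operands are Series. The output represents the result of the dot product or matrix multiplication.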
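The following sketch checks the return type for each kind of input. The variable names (by_frame, by_series, and so on) are made up for this illustration, and the values are the ones used in the examples later in this article.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[3, 5], [4, 7]])

# DataFrame or 2-D array as `other` -> DataFrame
by_frame = df.dot(pd.DataFrame([[6, 4], [8, 2]]))
by_array_2d = df.dot(np.array([[6, 4], [8, 2]]))

# Series or 1-D array as `other` -> Series
by_series = df.dot(pd.Series([2, 1]))
by_array_1d = df.dot(np.array([2, 1]))

# Series.dot() with another Series -> scalar
scalar = pd.Series([1, 2, 3]).dot(pd.Series([4, 5, 6]))

for name, value in [("by_frame", by_frame), ("by_array_2d", by_array_2d),
                    ("by_series", by_series), ("by_array_1d", by_array_1d),
                    ("scalar", scalar)]:
    print(name, "->", type(value).__name__)
```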
Usage of Pandas DataFrame dot() Method
The dot() method in Pandas executes matrix multiplication (dot product) between DataFrames, Series, or arrays, which makes it useful for linear algebra operations. Each element of the result is the sum of the products of a row of the calling object and the corresponding column (or vector) of the other object.
To run some examples of the Pandas DataFrame dot() method, let’s create two Pandas DataFrames from nested lists.
# Create DataFrame
import pandas as pd
import numpy as np
df = pd.DataFrame([[3, 5], [4, 7]])
print("Create first DataFrame:\n",df)
df1 = pd.DataFrame([[6, 4], [8, 2]])
print("Create Second DataFrame:\n",df1)
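Yields below output.
# Output:
# Create first DataFrame:
#    0  1
# 0  3  5
# 1  4  7
# Create Second DataFrame:
#    0  1
# 0  6  4
# 1  8  2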
Matrix Multiplication Between Two DataFrames
You can use the dot() method to perform matrix multiplication between two DataFrames. For this operation, verify that the number of columns in the first DataFrame matches the number of rows in the second DataFrame. Applying the dot() method to both DataFrames produces a new DataFrame where each element is the sum of the products of corresponding elements from a row of the first DataFrame and a column of the second DataFrame.
# Perform matrix multiplication
result = df.dot(df1)
print("Matrix multiplication result:\n", result)
# Output:
# Matrix multiplication result:
#     0   1
# 0  58  22
# 1  80  30
Here,
- The element at position (0, 0) is 3 × 6 + 5 × 8 = 18 + 40 = 58
- The element at position (0, 1) is 3 × 4 + 5 × 2 = 12 + 10 = 22
- The element at position (1, 0) is 4 × 6 + 7 × 8 = 24 + 56 = 80
- The element at position (1, 1) is 4 × 4 + 7 × 2 = 16 + 14 = 30
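As a cross-check of the arithmetic above, the same product can be computed with the @ operator and with plain NumPy arrays. This is only a sketch; the names via_operator and via_numpy are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[3, 5], [4, 7]])
df1 = pd.DataFrame([[6, 4], [8, 2]])

# Product via the dot() method
result = df.dot(df1)

# Same product via the @ operator on the DataFrames
via_operator = df @ df1

# Same product on the underlying NumPy arrays (labels are dropped)
via_numpy = df.to_numpy() @ df1.to_numpy()

print(result.equals(via_operator))
print(np.array_equal(result.to_numpy(), via_numpy))
```

Both comparisons print True here because the default integer labels of the two DataFrames already line up.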
Dot Product Between a DataFrame and a Series
To compute the dot product between a DataFrame and a Series, call the dot() method on the DataFrame. Each row of the DataFrame is multiplied element by element with the Series and the products are summed, producing a new Series with one value per row.
import pandas as pd
import numpy as np
# Create the DataFrame
df = pd.DataFrame([[3, 5], [4, 7]])
# Create the Series
series = pd.Series([2, 1])
# Perform dot product
result = df.dot(series)
print("Dot product result:\n", result)
# Output:
# Dot product result:
# 0    11
# 1    15
# dtype: int64
Here, the dot product is calculated as follows:
- For each row in the DataFrame, compute the dot product with the Series.
- For row [3, 5]: 3×2 + 5×1 = 6 + 5 = 11
- For row [4, 7]: 4×2 + 7×1 = 8 + 7 = 15
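The row-by-row calculation above can also be written out explicitly as a multiply followed by a row sum. The sketch below uses mul() and sum() only to show the equivalence; it is not how dot() is implemented internally.

```python
import pandas as pd

df = pd.DataFrame([[3, 5], [4, 7]])
series = pd.Series([2, 1])

# Dot product via the dot() method
result = df.dot(series)

# Manual equivalent: scale each column by the matching Series element,
# then add the products across each row
manual = df.mul(series, axis="columns").sum(axis="columns")

print(result.tolist())
print(manual.tolist())
```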
Dot Product between Two Series
To compute the dot product between two Series in Pandas, you use the dot() method. This operation calculates the sum of the products of corresponding elements in the Series.
import pandas as pd
# Create two Series
ser = pd.Series([1, 2, 3])
ser1 = pd.Series([4, 5, 6])
# Perform dot product
result = ser.dot(ser1)
print("Dot product result:", result)
# Output:
# Dot product result: 32
Here,
- The dot product is calculated as (1×4) + (2×5) + (3×6) = 4 + 10 + 18 = 32.
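The same sum of products can be spelled out with element-wise multiplication, which is a quick way to sanity-check a result. The name manual is illustrative.

```python
import pandas as pd

ser = pd.Series([1, 2, 3])
ser1 = pd.Series([4, 5, 6])

# Dot product via the dot() method
result = ser.dot(ser1)

# Manual equivalent: multiply matching elements, then sum the products
manual = (ser * ser1).sum()

print(result, manual)
```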
Dot Product in a Larger Matrix Multiplication
Similarly, the dot() method can multiply larger, non-square matrices such as df1 and df2. For matrix multiplication to be valid, the number of columns in the first matrix (df1) must equal the number of rows in the second matrix (df2). In this case, df1 has 3 columns and df2 has 3 rows, making the multiplication possible. The result has as many rows as df1 and as many columns as df2, so it is a 2×2 DataFrame.
import pandas as pd
# Create two DataFrames
df1 = pd.DataFrame([[1, 2, 3], [4, 5, 6]])
df2 = pd.DataFrame([[7, 8], [9, 10], [11, 12]])
# Perform matrix multiplication
result = df1.dot(df2)
print("Matrix multiplication result:\n", result)
# Output:
# Matrix multiplication result:
#      0    1
# 0   58   64
# 1  139  154
Here,
- The element at position (0, 0) is calculated as (1×7) + (2×9) + (3×11) = 7 + 18 + 33 = 58.
- The element at position (0, 1) is calculated as (1×8) + (2×10) + (3×12) = 8 + 20 + 36 = 64.
- The element at position (1, 0) is calculated as (4×7) + (5×9) + (6×11) = 28 + 45 + 66 = 139.
- The element at position (1, 1) is calculated as (4×8) + (5×10) + (6×12) = 32 + 50 + 72 = 154.
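Because the shapes differ, the order of the operands matters. The sketch below swaps the two DataFrames to show that the result changes shape; the names forward and reverse are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2, 3], [4, 5, 6]])        # 2 rows x 3 columns
df2 = pd.DataFrame([[7, 8], [9, 10], [11, 12]])   # 3 rows x 2 columns

# (2x3) . (3x2) gives a 2x2 result
forward = df1.dot(df2)

# (3x2) . (2x3) gives a 3x3 result, so matrix multiplication is not commutative
reverse = df2.dot(df1)

print(forward.shape)
print(reverse.shape)
```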
Matrix Multiplication with Named Columns and Rows
Finally, when the DataFrames have named columns and rows, the dot() method automatically aligns the column labels of the first DataFrame with the index labels of the second, rather than relying on their positions.
import pandas as pd
# Create the first DataFrame with named columns and rows
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]}, index=['X', 'Y'])
print("First DataFrame:\n", df)
# Create the second DataFrame with corresponding named rows and columns
df1 = pd.DataFrame({'C': [5, 6], 'D': [7, 8]}, index=['A', 'B'])
print("Second DataFrame:\n", df1)
# Perform matrix multiplication using the dot() method
result = df.dot(df1)
print("Matrix multiplication result:\n", result)
# Output:
# First DataFrame:
#    A  B
# X  1  3
# Y  2  4
# Second DataFrame:
#    C  D
# A  5  7
# B  6  8
# Matrix multiplication result:
#     C   D
# X  23  31
# Y  34  46
Here,
- The result for row X is calculated as:
  - (1×5) + (3×6) = 5 + 18 = 23 for column C
  - (1×7) + (3×8) = 7 + 24 = 31 for column D
- The result for row Y is calculated as:
  - (2×5) + (4×6) = 10 + 24 = 34 for column C
  - (2×7) + (4×8) = 14 + 32 = 46 for column D
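To see the label alignment at work, the sketch below reorders the rows of the second DataFrame and compares dot() with a purely positional NumPy product. The names shuffled, aligned, and positional are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]}, index=['X', 'Y'])
df1 = pd.DataFrame({'C': [5, 6], 'D': [7, 8]}, index=['A', 'B'])

# Same data as df1, but with the rows in the order B, A
shuffled = df1.loc[['B', 'A']]

# dot() matches column labels of df to index labels of shuffled
aligned = df.dot(shuffled)

# A positional product on the raw arrays ignores the labels
positional = df.to_numpy() @ shuffled.to_numpy()

print("Aligned by label:\n", aligned)
print("Positional:\n", positional)
```

The label-aligned result matches the output shown above (23, 31, 34, 46), whereas the positional product gives 21, 29, 32, 44 because it pairs column A with row B.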
Frequently Asked Questions on Pandas DataFrame dot() Method
The pandas DataFrame.dot() method is primarily used for matrix multiplication, which has a variety of applications across data science, machine learning, and linear algebra tasks.
Two DataFrames of different shapes can be multiplied as long as the number of columns in the first DataFrame matches the number of rows in the second DataFrame. The dot() method requires these dimensions to agree for the matrix multiplication to be valid.
When performing matrix multiplication on DataFrames with labeled columns and rows, the dot() method automatically aligns them based on their labels rather than their positional order. This ensures that the correct values are multiplied, even if the labels are in a different order.
If the number of columns in the DataFrame does not match the number of elements in the Series (or rows in another DataFrame), Pandas will raise a ValueError indicating that the operands are not aligned for matrix multiplication.
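The mismatch case can be reproduced with a short sketch. Here a two-column DataFrame is combined with a three-element Series and a three-element NumPy array; the list errors is an illustrative helper that collects the messages, whose exact wording may vary between pandas versions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[3, 5], [4, 7]])   # 2 columns

# Both operands have 3 elements, which does not match the 2 columns of df
candidates = [pd.Series([2, 1, 4]), np.array([2, 1, 4])]

errors = []
for other in candidates:
    try:
        df.dot(other)
    except ValueError as exc:
        # Record the message instead of letting the script stop
        errors.append(str(exc))

for message in errors:
    print("ValueError:", message)
```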
Conclusion
In conclusion, the pandas DataFrame.dot() method is a powerful and efficient tool for performing matrix multiplication, a key operation in many mathematical, scientific, and machine-learning applications. It works with DataFrames, Series, and NumPy arrays.
Happy Learning!!