How do you convert a pandas DataFrame column to int? You can use DataFrame.astype(int) or the apply() method to convert a column from float or string to an integer dtype (int64 or int32). A float column can hold fractional values that an integer column cannot, so converting it to int discards everything after the decimal point.
Note that converting a float to int does no rounding or flooring; it simply truncates the fractional part. In this article, I will explain different ways to convert a column to an integer in a DataFrame.
Related: In Pandas, you can also convert column to string type.
Quick Examples of Convert Column to Int in DataFrame
Below are quick examples of converting the column to integer dtype in DataFrame.
# Quick examples of convert column to int in dataframe
# Example 1: Convert "Fee" from String to int
df = df.astype({'Fee':'int'})
# Example 2: Convert all columns to int dtype
# This returns error in our DataFrame
df = df.astype('int')
# Example 3: Convert single column to int dtype
df['Fee'] = df['Fee'].astype('int')
# Example 4: Convert "Discount" from Float to int
df = df.astype({'Discount':'int'})
# Example 5: Converting multiple columns to int
df = pd.DataFrame(technologies)
df = df.astype({"Fee":"int","Discount":"int"})
# Example 6: Convert "Fee" from float
# To int and replace NaN values
df['Fee'] = df['Fee'].fillna(0).astype(int)
print(df)
print(df.dtypes)
To run the examples of converting a column to integer dtype, let’s create a pandas DataFrame from a dictionary.
# Create DataFrame
import pandas as pd
import numpy as np
technologies= {
'Courses':["Spark","PySpark","Hadoop","Python","Pandas"],
'Fee' :["22000","25000","23000","24000","26000"],
'Duration':['30days','50days','35days', '40days','55days'],
'Discount':[1000.10,2300.15,1000.5,1200.22,2500.20]
}
df = pd.DataFrame(technologies)
print("Create DataFrame:\n", df)
print("------------------------")
print("Get type of the columns:\n", df.dtypes)
In the resulting DataFrame, the Fee column holds strings and the Discount column holds floats (float64).
Convert Column to Int (Integer)
You can use the pandas DataFrame.astype() function to convert a column to int (integer). You can apply this to a specific column or to an entire DataFrame. To cast to a 64-bit signed integer, use numpy.int64 or 'int64'. Plain int and numpy.int_ also give int64 on most platforms, but on some (for example, Windows with NumPy older than 2.0) they map to int32. To cast to a 32-bit signed integer, use numpy.int32 or 'int32'.
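As a minimal sketch of pinning the integer width, the snippet below names the dtype explicitly. The `fees` Series is hypothetical sample data modeled on the Fee column used in this article.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, mirroring the "Fee" column used in this article
fees = pd.Series(["22000", "25000", "23000"], name="Fee")

# Naming the width explicitly avoids depending on what plain int maps to
fee_64 = fees.astype("int64")
fee_32 = fees.astype(np.int32)

print(fee_64.dtype)  # int64
print(fee_32.dtype)  # int32
```

Spelling out the width is a reasonable choice when the same script has to produce identical dtypes on different machines.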
The below example converts the Fee column from string dtype to int64.
# Convert "Fee" from String to int
df = df.astype({'Fee':'int'})
print("After converting 'Fee' column into int:\n", df.dtypes)
After the conversion, the Fee column has an integer dtype.
If every column in a DataFrame holds strings of integer values, you can convert the whole DataFrame to int dtype in one call, as shown below. If any column contains alphanumeric values, the call raises an error. Our DataFrame has such columns (Courses and Duration), so running this on it raises a ValueError.
# Convert all columns to int dtype.
df = df.astype('int')
print(df)
# Output:
# ValueError
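One possible way to work around this error is to cast only the columns that can actually be converted. The sketch below assumes a small hypothetical frame (`df_demo`) and uses `str.isdigit()` as the test, which is a simplification: it rejects negative numbers and values with a decimal point.

```python
import pandas as pd

# Hypothetical frame: "Fee" holds digit-only strings, "Duration" does not
df_demo = pd.DataFrame({
    "Fee": ["22000", "25000"],
    "Duration": ["30days", "50days"],
})

# Casting the whole frame fails because "30days" is not a valid integer
try:
    df_demo.astype("int64")
    whole_frame_ok = True
except ValueError:
    whole_frame_ok = False

# Keep only the columns whose values are all digit strings, then cast those
digit_cols = [c for c in df_demo.columns if df_demo[c].str.isdigit().all()]
df_demo = df_demo.astype({c: "int64" for c in digit_cols})

print(whole_frame_ok)
print(digit_cols)
print(df_demo.dtypes)
```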
Alternatively, to convert a single column to the integer data type, call astype() on that column. Every column in a DataFrame is a pandas Series, and both df.Fee and df['Fee'] return that Series object. Calling astype() on it converts the data type of just that column.
# Convert single column to int dtype.
df['Fee'] = df['Fee'].astype('int')
Convert Float to Int dtype
Now, using the same astype() approach, let’s convert a float column to int (integer) type in a pandas DataFrame. As noted earlier, the conversion does not round or floor; it truncates the fractional part.
The below example converts the Discount column, which holds float values, to int using the DataFrame.astype() function.
# Convert "Discount" from Float to int
df = df.astype({'Discount':'int'})
print(df.dtypes)
Yields below output
# Output:
Courses object
Fee int64
Duration object
Discount int64
dtype: object
Similarly, you can also cast all columns or a single column. Refer to examples in the above section for details.
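To make the truncation behavior concrete, here is a short sketch comparing a plain cast with rounding and flooring first. The values in `s` are hypothetical and chosen to show the difference on both sides of zero.

```python
import numpy as np
import pandas as pd

# Hypothetical values chosen to show the difference on both sides of zero
s = pd.Series([1000.10, 2300.95, -2.7])

truncated = s.astype("int64")          # drops the fraction: 1000, 2300, -2
rounded = s.round().astype("int64")    # nearest integer:    1000, 2301, -3
floored = np.floor(s).astype("int64")  # rounds down:        1000, 2300, -3

print(truncated.tolist())
print(rounded.tolist())
print(floored.tolist())
```

If rounded or floored results are what you need, apply that step explicitly before the cast, since astype() alone only truncates.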
Casting Multiple Columns to Int (Integer)
Passing a dictionary that maps column names to their target data types is another efficient way to convert multiple columns to integers with the astype() method in pandas. The following example converts the Fee column from string to integer and the Discount column from float to integer.
# Converting multiple columns to int
df = pd.DataFrame(technologies)
df = df.astype({"Fee":"int","Discount":"int"})
print(df.dtypes)
# Converting multiple columns to integer data type
# using a dictionary
dtypes_dict = {"Fee": int, "Discount": int}
df = df.astype(dtypes_dict)
print(df.dtypes)
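Yields below output. The dtype shows as int32 here rather than int64, most likely because plain int maps to a 32-bit integer on some platforms (for example, Windows with NumPy older than 2.0).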
# Output:
Courses object
Fee int32
Duration object
Discount int32
dtype: object
Using apply(np.int64) to Cast to Integer
You can also call the apply() method on a column (Series.apply()) to convert the Fee column from string to integer in pandas. This example passes numpy.int64 as the function to apply, so the numpy module must be imported first.
import numpy as np
# Convert "Fee" from string to int using Series.apply(np.int64)
df["Fee"] = df["Fee"].apply(np.int64)
print(df.dtypes)
Yields below output.
# Output:
Courses object
Fee int64
Duration object
Discount float64
dtype: object
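As a rough comparison of the two approaches, the sketch below runs apply(np.int64) and astype("int64") on the same column and checks that they agree. The `discount` Series is hypothetical data modeled on the Discount column.

```python
import numpy as np
import pandas as pd

# Hypothetical float column, similar to "Discount" in this article
discount = pd.Series([1000.10, 2300.15, 2500.20], name="Discount")

via_apply = discount.apply(np.int64)   # calls np.int64 on each value in turn
via_astype = discount.astype("int64")  # casts the whole column in one step

print(via_apply.tolist())
print(via_astype.tolist())
```

Both give the same values here; astype() is usually the simpler choice for a plain cast, while apply() is more useful when each value needs custom handling.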
Convert Column Containing NaNs to astype(int)
To demonstrate NaN/null values, let’s create a DataFrame that contains NaN values. To convert a column that mixes float and NaN values to int, first replace the NaN values with zero and then use astype() to convert.
import pandas as pd
import numpy as np
technologies= {
'Fee' :[22000.30,25000.40,np.nan,24000.50,26000.10,np.nan]
}
df = pd.DataFrame(technologies)
print(df)
Use fillna() to replace the NaN values with the integer zero before casting.
# Convert "Fee" from float to int
# and replace NaN values with zeros
df['Fee'] = df['Fee'].fillna(0).astype(int)
print(df)
print(df.dtypes)
Yields below output.
# Output:
Fee
0 22000
1 25000
2 0
3 24000
4 26000
5 0
Fee int32
dtype: object
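The sketch below shows why the fillna() step is needed, and one alternative that keeps missing values instead of turning them into zero. The alternative uses the pandas nullable "Int64" dtype, which is an assumption on my part and not the method used in this article.

```python
import numpy as np
import pandas as pd

fee = pd.Series([22000.30, np.nan, 24000.50], name="Fee")

# Casting directly fails: NaN has no integer representation
try:
    fee.astype("int64")
    direct_cast_ok = True
except ValueError:
    direct_cast_ok = False

# Alternative (an assumption, not the article's method): drop the fraction,
# then use the nullable "Int64" dtype so missing values stay missing
fee_nullable = np.trunc(fee).astype("Int64")

print(direct_cast_ok)
print(fee_nullable)
```

Filling with zero is simpler, but it makes a missing fee indistinguishable from a fee of zero; the nullable dtype avoids that at the cost of a less common dtype.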
Frequently Asked Questions of Pandas Convert Column to Int
You can convert a column to an integer in a pandas DataFrame using the astype() method or apply().
You should ensure that the column contains only values that can be safely converted to integers. If it has non-integer values, you’ll need to clean or preprocess the data before using astype(int).
Yes, you can specify the exact integer data type when using astype(). For example, df['column_name'].astype('int32') converts the column to int32.
To convert multiple columns to integers, you can apply the conversion to each column individually or pass astype() a dictionary that maps column names to dtypes.
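For the cleaning step mentioned above, one common option is pd.to_numeric() with errors="coerce". This is a sketch under assumptions: the `raw_fee` data is hypothetical, and replacing bad entries with 0 is only a placeholder choice.

```python
import pandas as pd

# Hypothetical messy column: one entry is not a number, one has a fraction
raw_fee = pd.Series(["22000", "25000", "n/a", "24000.75"], name="Fee")

# errors="coerce" turns unparseable entries into NaN instead of raising
numeric_fee = pd.to_numeric(raw_fee, errors="coerce")

# Replace NaN with 0 (a placeholder choice), then cast; fractions are truncated
clean_fee = numeric_fee.fillna(0).astype("int64")

print(clean_fee.tolist())  # [22000, 25000, 0, 24000]
```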
Conclusion
In this article, I have explained how to convert a column from string to int and from float to int using the DataFrame.astype() and apply() methods. You have also learned how to convert float and string values to integers when a column contains NaN/null values.
Happy Learning !!