How do you convert a pandas column to int in a DataFrame? You can use the DataFrame.astype(int) or DataFrame.apply() method to convert a column to an integer data type (for example, float or string to int64 or int32). Because a float can hold a fractional part and an int cannot, converting a float column to int discards everything after the decimal point.
Note that the conversion does not round or floor the values; it simply truncates the fractional part, so 2.9 becomes 2 and -2.9 becomes -2. In this article, I will explain different ways to convert a column to an integer in a DataFrame.
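The truncation described above can be checked with a few values. This is a minimal sketch; the sample numbers are hypothetical and chosen to include negatives, where truncating and flooring differ.

```python
import pandas as pd

# Hypothetical sample values, including negatives
s = pd.Series([2.9, -2.9, 1000.5, -0.4])

# astype drops the fractional part, moving each value toward zero
truncated = s.astype("int64")
print(truncated.tolist())
```

A floor operation would turn -2.9 into -3, while the cast yields -2.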
Key Points –
- astype(int) is the most common way to convert a column to integer; you can also specify the type per column.
- Ensure the column contains only numeric values or values that can be converted to integers before conversion.
- pd.to_numeric() offers more flexibility than astype(), handling errors more gracefully with the errors parameter.
- After converting a column, verify the conversion by checking the column's data type with dtypes or info().
- To reduce memory usage in large datasets, you can use the downcast parameter in pd.to_numeric() to convert the column to a smaller integer type like int8 or int16.
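As a rough illustration of the pd.to_numeric() points above, the sketch below uses two small hypothetical Series (not the article's DataFrame): one with an unparseable value to show errors="coerce", and one clean column to show downcast="integer".

```python
import pandas as pd

# Hypothetical data with one value that cannot be parsed as a number
raw = pd.Series(["10", "20", "abc", "40"])

# errors="coerce" replaces unparseable entries with NaN instead of raising
coerced = pd.to_numeric(raw, errors="coerce")
print(coerced.isna().sum())

# A clean column can be downcast to the smallest integer type that fits
clean = pd.Series(["10", "20", "30", "40"])
small = pd.to_numeric(clean, downcast="integer")
print(small.dtype)
```

Because the coerced result contains NaN, it stays a float column until the missing values are handled.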
Related: In Pandas, you can also convert column to string type.
Quick Examples of Convert Column to Int in DataFrame
Below are quick examples of converting the column to integer dtype in DataFrame.
# Quick examples of convert column to int in dataframe
# Example 1: Convert "Fee" from String to int
df = df.astype({'Fee':'int'})
# Example 2: Convert all columns to int dtype
# This returns error in our DataFrame
df = df.astype('int')
# Example 3: Convert single column to int dtype
df['Fee'] = df['Fee'].astype('int')
# Example 4: Convert "Discount" from Float to int
df = df.astype({'Discount':'int'})
# Example 5: Converting multiple columns to int
df = pd.DataFrame(technologies)
df = df.astype({"Fee":"int","Discount":"int"})
# Example 6: Convert "Fee" from float
# To int and replace NaN values
df['Fee'] = df['Fee'].fillna(0).astype(int)
print(df.dtypes)
To run the examples of converting a column to integer dtype, let's first create a pandas DataFrame from a dictionary.
# Create DataFrame
import pandas as pd
import numpy as np
technologies= {
'Courses':["Spark","PySpark","Hadoop","Python","Pandas"],
'Fee' :["22000","25000","23000","24000","26000"],
'Duration':['30days','50days','35days', '40days','55days'],
'Discount':[1000.10,2300.15,1000.5,1200.22,2500.20]
}
df = pd.DataFrame(technologies)
print("Create DataFrame:\n", df)
print("------------------------")
print("Get type of the columns:\n", df.dtypes)
This prints the DataFrame and the dtype of each column. Note that Fee holds strings and Discount holds floats.
Convert Column to Int (Integer)
You can use the pandas DataFrame.astype() function to convert a column to int (integer). You can apply it to a specific column or to the entire DataFrame. To cast to a 64-bit signed integer, pass numpy.int64 or 'int64'; to cast to a 32-bit signed integer, pass numpy.int32 or 'int32'. Python's int and numpy.int_ map to the platform's default integer, which is int64 on most systems but can be int32 on Windows with NumPy versions before 2.0.
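To make the choice of integer width concrete, here is a small sketch that casts the same strings to both widths. The Series is a hypothetical stand-in for the Fee column.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Fee column
fees = pd.Series(["22000", "25000", "23000"])

as_int64 = fees.astype("int64")   # string alias
as_int32 = fees.astype(np.int32)  # NumPy type object

print(as_int64.dtype, as_int32.dtype)
# nbytes reports the size of the underlying values
print(as_int64.nbytes, as_int32.nbytes)
```

Naming the width explicitly avoids depending on the platform default.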
The example below converts the Fee column from string dtype to int64.
# Convert "Fee" from String to int
df = df.astype({'Fee':'int'})
print("After converting 'Fee' column into int:\n", df.dtypes)
After this, df.dtypes reports Fee as an integer column (int64 on most platforms).
If you have a DataFrame in which every column holds integer values stored as strings, you can convert all of them to int at once as shown below. If any column holds non-numeric values, astype() raises a ValueError. Running this on our DataFrame fails because the Courses and Duration columns contain text.
# Convert all columns to int dtype.
df = df.astype('int')
print(df)
# Output:
# ValueError
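One way to handle this failure is to catch the ValueError and convert only the columns known to be numeric. The sketch below uses a cut-down, hypothetical version of the DataFrame with just two columns.

```python
import pandas as pd

# Cut-down, hypothetical version of the article's DataFrame
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": ["22000", "25000"],
})

try:
    df.astype("int64")  # fails on the text in Courses
    failed = False
except ValueError as exc:
    failed = True
    print("Conversion failed:", exc)

# Convert only the column that holds integer strings
converted = df.astype({"Fee": "int64"})
print(converted.dtypes)
```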
Alternatively, to convert a single column to an integer data type, select the column and call astype() on it. Every column in a DataFrame is a pandas Series, so both df.Fee and df['Fee'] return a Series object, and Series.astype() converts its data type. Assign the result back to the column to update the DataFrame.
# Convert single column to int dtype.
df['Fee'] = df['Fee'].astype('int')
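After a conversion like this, it can help to confirm the result. The following sketch checks the dtype of a hypothetical Fee column; is_integer_dtype comes from pandas.api.types.

```python
import pandas as pd
from pandas.api.types import is_integer_dtype

# Hypothetical Fee column stored as strings
df = pd.DataFrame({"Fee": ["22000", "25000", "23000"]})
df["Fee"] = df["Fee"].astype("int64")

print(df["Fee"].dtype)
print(is_integer_dtype(df["Fee"]))
```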
Convert Float to Int dtype
Now let's use the same astype() approach to convert a float column to int (integer) in a pandas DataFrame. As with any float to int cast, the values are not rounded or floored; the fractional part is truncated.
The example below converts the Discount column, which holds float values, to int using the DataFrame.astype() function.
# Convert "Discount" from Float to int
df = df.astype({'Discount':'int'})
print(df.dtypes)
Yields below output
# Output:
Courses object
Fee int64
Duration object
Discount int64
dtype: object
Similarly, you can also cast all columns or a single column. Refer to examples in the above section for details.
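If truncation is not the behavior you want, you can round or take the ceiling before casting. This is a sketch with hypothetical discount values, not a recommendation from the original examples.

```python
import numpy as np
import pandas as pd

# Hypothetical discount values
discount = pd.Series([1000.10, 2300.75, 1200.49])

truncated = discount.astype("int64")         # drop the fraction
rounded = discount.round().astype("int64")   # round to nearest first
ceiled = np.ceil(discount).astype("int64")   # always round up first

print(truncated.tolist())
print(rounded.tolist())
print(ceiled.tolist())
```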
Casting Multiple Columns to Int (Integer)
Passing astype() a dictionary that maps column names to data types is an efficient way to convert multiple columns at once. The following example converts the Fee column from string to integer and the Discount column from float to integer.
# Converting multiple columns to int
df = pd.DataFrame(technologies)
df = df.astype({"Fee":"int","Discount":"int"})
print(df.dtypes)
# Converting multiple columns to integer data type
# using a dictionary
dtypes_dict = {"Fee": int, "Discount": int}
df = df.astype(dtypes_dict)
print(df.dtypes)
Yields below output.
# Output:
Courses object
Fee int32
Duration object
Discount int32
dtype: object
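When every target column should get the same dtype, selecting them with a list is a possible alternative to the dictionary. The sketch below rebuilds a smaller, hypothetical version of the DataFrame and uses 'int64' explicitly so the result does not depend on the platform default.

```python
import pandas as pd

# Smaller, hypothetical version of the article's DataFrame
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop"],
    "Fee": ["22000", "25000", "23000"],
    "Discount": [1000.10, 2300.15, 1000.5],
})

cols = ["Fee", "Discount"]
# Cast the selected columns together and assign them back
df[cols] = df[cols].astype("int64")
print(df.dtypes)
```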
Using apply(np.int64) to Cast to Integer
You can also use the apply() method to convert the Fee column from string to integer in pandas. This example passes numpy.int64 to Series.apply(), which calls it on each value. Import the numpy module before using NumPy functions.
import numpy as np
# Convert "Fee" from string to int
# Using Series.apply(np.int64)
df["Fee"] = df["Fee"].apply(np.int64)
print(df.dtypes)
Yields below output.
# Output:
Courses object
Fee int64
Duration object
Discount float64
dtype: object
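For comparison, the sketch below runs both apply(np.int64) and astype('int64') on the same hypothetical Series of integer strings. The first calls np.int64 once per element, while the second is a single cast of the whole column.

```python
import numpy as np
import pandas as pd

# Hypothetical Fee values stored as strings
fee = pd.Series(["22000", "25000", "23000"])

via_apply = fee.apply(np.int64)    # element-by-element conversion
via_astype = fee.astype("int64")   # whole-column cast

print(via_apply.tolist())
print(via_astype.tolist())
```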
Convert Column Containing NaNs to astype(int)
To demonstrate NaN/null values, let's create a DataFrame that contains NaN values. A NumPy integer column cannot hold NaN, so to convert a column with a mixture of float and NaN values to int, first replace the NaN values with zero and then use astype() to convert.
import pandas as pd
import numpy as np
technologies= {
'Fee' :[22000.30,25000.40,np.nan,24000.50,26000.10,np.nan]
}
df = pd.DataFrame(technologies)
print(df)
Use fillna() to replace the NaN values with the integer value zero, then cast the column with astype(int).
# Convert "Fee" from float to int
# and replace NaN values with zeros
df['Fee'] = df['Fee'].fillna(0).astype(int)
print(df)
print(df.dtypes)
Yields below output.
# Output:
Fee
0 22000
1 25000
2 0
3 24000
4 26000
5 0
Fee int32
dtype: object
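If replacing missing values with zero would distort the data, pandas' nullable Int64 dtype (capital I) is an alternative that keeps them as missing. This is a sketch under the assumption that truncation is still the desired behavior; np.trunc removes the fraction first, because casting non-whole floats directly to Int64 is rejected.

```python
import numpy as np
import pandas as pd

# Hypothetical Fee values with one missing entry
fee = pd.Series([22000.30, 25000.40, np.nan, 24000.50])

# Truncate the fraction, then cast; NaN becomes <NA>
nullable = np.trunc(fee).astype("Int64")
print(nullable)
print(nullable.isna().sum())
```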
Frequently Asked Questions on Pandas Convert Column to Int
You can convert a column to an integer in a pandas DataFrame using the astype() method or apply().
Make sure the column contains only values that can be safely converted to integers. If it has non-integer values, clean or preprocess the data before using astype(int).
You can specify the exact integer data type when using astype(). For example, df['column_name'].astype('int32') converts the column to int32.
To convert multiple columns to integers, apply the conversion to each column individually or pass astype() a dictionary that maps column names to dtypes.
Conclusion
In this article, I have explained how to convert a column from string to int and from float to int using the DataFrame.astype() and DataFrame.apply() methods. You have also learned how to convert a column to integer when it contains NaN/null values.
Happy Learning !!