  • Post last modified:May 22, 2024
Pandas Convert Column to Int in DataFrame

How do you convert a pandas column to int in a DataFrame? You can use DataFrame.astype(int) or the apply() method to cast a column from float or string to an integer dtype (int64 or int32). A float column can hold fractional values that an integer column cannot, so converting float to int discards everything after the decimal point.


Note that converting a float to int does not round or floor the value; it simply truncates the fractional part (toward zero). In this article, I will explain different ways to convert a column to an integer in a DataFrame.

Related: In Pandas, you can also convert a column to string type.

Quick Examples of Convert Column to Int in DataFrame

Below are quick examples of converting the column to integer dtype in DataFrame.


# Quick examples of convert column to int in dataframe

# Example 1: Convert "Fee" from String to int
df = df.astype({'Fee':'int'})

# Example 2: Convert all columns to int dtype
# This returns error in our DataFrame
df = df.astype('int')

# Example 3: Convert single column to int dtype
df['Fee'] = df['Fee'].astype('int')

# Example 4: Convert "Discount" from Float to int
df = df.astype({'Discount':'int'})

# Example 5: Converting multiple columns to int
df = pd.DataFrame(technologies)
df = df.astype({"Fee":"int","Discount":"int"})

# Example 6: Convert "Fee" from float 
# To int and replace NaN values
df['Fee'] = df['Fee'].fillna(0).astype(int)
print(df)
print(df.dtypes)

To run some examples of converting the column to integer dtype in Pandas DataFrame, let’s create Pandas DataFrame using data from a dictionary.


# Create DataFrame
import pandas as pd
import numpy as np
technologies= {
    'Courses':["Spark","PySpark","Hadoop","Python","Pandas"],
    'Fee' :["22000","25000","23000","24000","26000"],
    'Duration':['30days','50days','35days', '40days','55days'],
    'Discount':[1000.10,2300.15,1000.5,1200.22,2500.20]
          }
df = pd.DataFrame(technologies)
print("Create DataFrame:\n", df)
print("------------------------")
print("Get type of the columns:\n", df.dtypes)

Yields the DataFrame along with its column dtypes: Courses, Fee, and Duration are object (string) columns, and Discount is float64.

Convert Column to Int (Integer)

You can use the pandas DataFrame.astype() function to convert a column to int (integer). You can apply it to a specific column or to the entire DataFrame. To cast to a 64-bit signed integer, pass numpy.int64 or 'int64'. To cast to a 32-bit signed integer, pass numpy.int32 or 'int32'. Passing int, 'int', or numpy.int_ uses NumPy's default integer type, which is 64-bit on most platforms but can be 32-bit on some (for example, Windows with NumPy 1.x).

The below example converts Fee column from string dtype to int64.


# Convert "Fee" from String to int
df = df.astype({'Fee':'int'})
print("After converting 'Fee' column into int:\n", df.dtypes)

Yields the updated dtypes: Fee is now int64, while the other columns keep their original dtypes.
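The 64-bit and 32-bit targets mentioned in this section can be requested explicitly rather than relying on the default for int. The snippet below is a minimal sketch; the small fees DataFrame and the names fee_64 and fee_32 are made up for illustration and only mirror the article's Fee column.

```python
import numpy as np
import pandas as pd

# Illustrative data: strings that hold integer values, like the "Fee" column
fees = pd.DataFrame({"Fee": ["22000", "25000", "23000"]})

# Ask for an explicit width instead of the platform default for `int`
fee_64 = fees["Fee"].astype("int64")
fee_32 = fees["Fee"].astype(np.int32)

print(fee_64.dtype)  # int64
print(fee_32.dtype)  # int32
```

Naming the width is a reasonable choice when the same script has to produce identical dtypes on different machines.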

If every column in your DataFrame holds strings that represent integers, you can convert the whole DataFrame to int dtype in one call, as shown below. If any column contains alphanumeric values, this raises an error. Our DataFrame has such columns (Courses and Duration), so running this on it fails.


# Convert all columns to int dtype.
df = df.astype('int')
print(df)

# Output:
# ValueError
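When a column mixes numeric strings with values that cannot be parsed, one possible way around the ValueError is pd.to_numeric() with errors="coerce", followed by a fill and a cast. This is a sketch on made-up data (the raw DataFrame and the "n/a" entry are assumptions, not part of the article's dataset), and replacing bad values with 0 is only one of several reasonable policies.

```python
import pandas as pd

# Illustrative data: one entry cannot be parsed as a number
raw = pd.DataFrame({"Fee": ["22000", "25000", "n/a", "24000"]})

# errors="coerce" turns unparseable strings into NaN instead of raising
numeric_fee = pd.to_numeric(raw["Fee"], errors="coerce")

# NaN cannot be stored in a plain int column, so fill it before casting
raw["Fee"] = numeric_fee.fillna(0).astype("int64")

print(raw["Fee"].tolist())  # [22000, 25000, 0, 24000]
print(raw["Fee"].dtype)     # int64
```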

Alternatively, you can convert a single column by calling astype() on the column itself. Every column in a DataFrame is a pandas Series, so both df.Fee and df['Fee'] return a Series object, and Series.astype() changes its data type. Assign the result back to the column to keep the change.


# Convert single column to int dtype.
df['Fee'] = df['Fee'].astype('int')

Convert Float to Int dtype

Now let's use the same astype() approach to convert a float column to int (integer) type in a pandas DataFrame. As noted earlier, the conversion does not round or floor the values; it truncates the fractional part.

The below example converts the column Discount holding float values to int using DataFrame.astype() function.


# Convert "Discount" from Float to int
df = df.astype({'Discount':'int'})
print(df.dtypes)

Yields below output.


# Output:
Courses     object
Fee          int64
Duration    object
Discount     int64
dtype: object

Similarly, you can also cast all columns or a single column. Refer to examples in the above section for details.
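To make the truncation behavior concrete, the sketch below compares a plain cast with rounding or flooring before the cast. The discount Series holds made-up values (including a negative one, where truncation and flooring differ), and explicit int64 is used so the result does not depend on the platform.

```python
import numpy as np
import pandas as pd

# Illustrative values; the negative one shows truncation vs. flooring
discount = pd.Series([1000.10, 2300.95, -1200.75])

truncated = discount.astype("int64")          # drops the fraction (toward zero)
rounded = discount.round().astype("int64")    # round to nearest, then cast
floored = np.floor(discount).astype("int64")  # always round down, then cast

print(truncated.tolist())  # [1000, 2300, -1200]
print(rounded.tolist())    # [1000, 2301, -1201]
print(floored.tolist())    # [1000, 2300, -1201]
```

If you need rounded rather than truncated integers, call round() before astype().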

Casting Multiple Columns to Int (Integer)

Using a dictionary with column names mapped to their respective data types is another efficient way to convert multiple columns to integers using the astype() method in pandas. The following example converts the Fee column from string to integer and the Discount column from float to integer data types.


# Converting multiple columns to int
df = pd.DataFrame(technologies)
df = df.astype({"Fee":"int","Discount":"int"})
print(df.dtypes)

# Equivalent: keep the column-to-dtype mapping in a
# dictionary variable and pass it to astype()
dtypes_dict = {"Fee": int, "Discount": int}
df = df.astype(dtypes_dict)
print(df.dtypes)

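Yields below output. The int32 shown here is what you get where NumPy's default integer is 32-bit (for example, Windows with NumPy 1.x); on most other setups these columns are int64.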


# Output:
Courses     object
Fee          int32
Duration    object
Discount     int32
dtype: object
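When the list of columns is long or only known at run time, the dictionary passed to astype() can be built programmatically. The following is a small sketch under that assumption; df_multi, int_columns, and dtype_map are illustrative names, and int64 is spelled out so the result is the same on every platform.

```python
import pandas as pd

# Illustrative data shaped like the article's DataFrame
df_multi = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": ["22000", "25000"],
    "Discount": [1000.10, 2300.15],
})

# Hypothetical list of columns to cast
int_columns = ["Fee", "Discount"]

# Build the column-to-dtype mapping and cast in one call
dtype_map = {col: "int64" for col in int_columns}
df_multi = df_multi.astype(dtype_map)

print(df_multi.dtypes)
```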

Using apply(np.int64) to Cast to Integer

You can also use the apply() method to convert the Fee column from string to integer in pandas. Because df["Fee"] is a Series, this calls Series.apply(), which passes each value to numpy.int64. Before using NumPy functions, you need to import the numpy module.


import numpy as np
# Recreate the DataFrame so "Fee" holds strings again
df = pd.DataFrame(technologies)
# Convert "Fee" from string to int using Series.apply(np.int64)
df["Fee"] = df["Fee"].apply(np.int64)
print(df.dtypes)

Yields below output.


# Output:
Courses      object
Fee           int64
Duration     object
Discount    float64
dtype: object

Convert Column Containing NaNs to astype(int)

To demonstrate NaN/null handling, let's create a DataFrame that contains NaN values. A column that mixes float and NaN values cannot be cast to int directly, because NaN has no integer representation. First replace the NaN values with zero, then use astype() to convert.


import pandas as pd
import numpy as np
technologies= {
    'Fee' :[22000.30,25000.40,np.nan,24000.50,26000.10,np.nan]
          }
df = pd.DataFrame(technologies)
print(df)

Use fillna() to replace the NaN values with the integer zero, then cast the column with astype(int).


# Convert "Fee" from float to int 
# and replace NaN values with zeros
df['Fee'] = df['Fee'].fillna(0).astype(int)
print(df)
print(df.dtypes)

Yields below output.


# Output:
     Fee
0  22000
1  25000
2      0
3  24000
4  26000
5      0
Fee    int32
dtype: object
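If replacing missing values with zero would distort the data, one alternative worth considering is pandas' nullable integer dtype, Int64 (with a capital "I"), which keeps missing entries as <NA>. The sketch below uses an illustrative fees_nan DataFrame and np.trunc() to mirror the truncation that astype(int) performs; treat it as one possible approach rather than the article's method.

```python
import numpy as np
import pandas as pd

# Illustrative data: floats with a missing value
fees_nan = pd.DataFrame({"Fee": [22000.30, 25000.40, np.nan, 24000.50]})

# The nullable Int64 dtype needs whole numbers, so drop the fractions first.
# np.trunc() leaves NaN untouched; the cast then stores it as <NA>.
fees_nan["Fee"] = np.trunc(fees_nan["Fee"]).astype("Int64")

print(fees_nan)
print(fees_nan["Fee"].dtype)         # Int64
print(fees_nan["Fee"].isna().sum())  # 1
```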

Frequently Asked Questions of Pandas Convert Column to Int

How to convert a column to an integer in a Pandas DataFrame?

You can convert a column to an integer in a Pandas DataFrame using the astype() method or the apply() method, for example df['Fee'].astype(int).

How do I handle non-integer values when using astype(int)?

You should ensure that the column only contains values that can be safely converted to integers. If you have non-integer values, you’ll need to clean or preprocess the data before using astype(int).

Can I specify the integer data type (e.g., int32 or int64) when using astype(int)?

Yes, you can specify the exact integer data type when using astype(). For example, df['column_name'].astype('int32') will convert to int32.

How can I convert multiple columns to integers in a Pandas DataFrame?

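To convert multiple columns to integers, pass astype() a dictionary that maps each column name to the target type, for example df.astype({"Fee": "int", "Discount": "int"}). You can also convert each column individually.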

Conclusion

In this article, I have explained how to convert a column from string to int and from float to int using the astype() and apply() methods. You have also learned how to convert a column to integers when it contains NaN/null values.

Happy Learning !!
