  • Post last modified:March 27, 2024
Pandas Convert Column to Int in DataFrame

How do you convert a pandas column to int in a DataFrame? You can use DataFrame.astype(int) or Series.apply() to convert a column to an integer dtype (float or string to int64 or int32). Keep in mind that a float can hold a fractional part and an int cannot, so converting a float column to int discards everything after the decimal point.

Note that converting a float to int does not round or floor the value. It truncates the fractional part, which moves the value toward zero (for example, 2.9 becomes 2 and -2.9 becomes -2). In this article, I will explain different ways to convert a column to an integer in a DataFrame.
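To make the difference between truncating, rounding, and flooring concrete, here is a minimal sketch. The Series values are made up for illustration and are not part of the article's DataFrame.

```python
import numpy as np
import pandas as pd

# Illustrative values only (not from the article's DataFrame)
values = pd.Series([1.9, -1.9, 2.5])

truncated = values.astype(int)           # drops the fraction, moving toward zero
rounded = values.round().astype(int)     # rounds to nearest first (halves go to the even neighbor)
floored = np.floor(values).astype(int)   # always rounds down

print(truncated.tolist())  # [1, -1, 2]
print(rounded.tolist())    # [2, -2, 2]
print(floored.tolist())    # [1, -2, 2]
```

If you need rounding rather than truncation, calling round() before astype(int) is one way to get it.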

Related: In Pandas, you can also convert column to string type.

1. Quick Examples of Converting Pandas Column to Int

If you are in a hurry, below are quick examples of converting the column to integer dtype in DataFrame.


# Below are the quick examples

# Example 1: Convert "Fee" from String to int
df = df.astype({'Fee':'int'})

# Example 2: Convert all columns to int dtype.
# This raises a ValueError on our DataFrame
df = df.astype('int')

# Example 3: Convert single column to int dtype.
df['Fee'] = df['Fee'].astype('int')

# Example 4: Convert "Discount" from Float to int
df = df.astype({'Discount':'int'})

# Example 5: Converting Multiple columns to int
df = pd.DataFrame(technologies)
df = df.astype({"Fee":"int","Discount":"int"})

# Example 6: Convert "Fee" from float to int and replace NaN values
df['Fee'] = df['Fee'].fillna(0).astype(int)
print(df)
print(df.dtypes)

Now, let’s create a DataFrame with a few rows and columns, execute some examples, and validate the results. Our DataFrame contains column names Courses, Fee, Duration and Discount.


# Create DataFrame
import pandas as pd
import numpy as np
technologies= {
    'Courses':["Spark","PySpark","Hadoop","Python","Pandas"],
    'Fee' :["22000","25000","23000","24000","26000"],
    'Duration':['30days','50days','35days', '40days','55days'],
    'Discount':[1000.10,2300.15,1000.5,1200.22,2500.20]
          }
df = pd.DataFrame(technologies)
print("Create DataFrame:\n", df)
print("------------------------")
print("Get type of the columns:\n", df.dtypes)

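The output (shown as an image in the original post) lists the DataFrame followed by its column types: Courses, Fee, and Duration hold strings (object dtype) and Discount is float64.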

2. Pandas Convert Column to Int (Integer)

You can use the pandas DataFrame.astype() function to convert a column to int (integer). You can apply it to a specific column or to an entire DataFrame. To cast to a 64-bit signed integer, use numpy.int64 or 'int64' as the param. To cast to a 32-bit signed integer, use numpy.int32 or 'int32'. Passing int or numpy.int_ picks the platform's default integer, which is int64 on most systems but can be int32 on Windows with NumPy versions before 2.0.
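As a small sketch of choosing the integer width explicitly, the example below casts a string Series to both sizes. The `fees` Series is a stand-in for the Fee column and is only an illustration.

```python
import numpy as np
import pandas as pd

# Stand-in for the "Fee" column (illustrative data)
fees = pd.Series(["22000", "25000", "23000"])

as_int64 = fees.astype("int64")    # explicit 64-bit integer
as_int32 = fees.astype(np.int32)   # explicit 32-bit integer

print(as_int64.dtype)  # int64
print(as_int32.dtype)  # int32
```

Naming the width explicitly keeps the resulting dtype the same regardless of the platform the code runs on.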

The below example converts Fee column from string dtype to int64.


# Convert "Fee" from String to int
df = df.astype({'Fee':'int'})
print("After converting 'Fee' column into int:\n", df.dtypes)

The output (shown as an image in the original post) lists the column types again, with Fee now reported as an integer dtype.

If every column in your DataFrame holds strings that represent integers, you can convert the whole DataFrame to int dtype in one call, as shown below. If any column contains alphanumeric values, the call raises an error. Our DataFrame has such columns (Courses and Duration), so running this on it fails.


# Convert all columns to int dtype.
df = df.astype('int')
print(df)

# Output:
# ValueError
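The sketch below shows one way to deal with this, assuming you only want the integer-like columns converted: catch the ValueError to confirm the failure, then cast just the column that can be converted. The two-column DataFrame is a trimmed-down illustration, not the article's full data.

```python
import pandas as pd

# Trimmed-down illustration of the article's DataFrame
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": ["22000", "25000"],
})

failed = False
try:
    df.astype("int")          # "Spark" cannot be parsed as an integer
except ValueError as err:
    failed = True
    print("Conversion failed:", err)

# Cast only the column that holds integer-like strings
df = df.astype({"Fee": "int64"})
print(df.dtypes)
```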

Alternatively, you can use the Series.astype() function to convert a specific column to an int type. Since each column of a DataFrame is a pandas Series, you can select the column as a Series and call astype() on it. In the below example, df.Fee or df['Fee'] returns a Series object.


# Convert single column to int dtype.
df['Fee'] = df['Fee'].astype('int')

3. Convert Float to Int dtype

Now, using the same astype() approach, let's convert a float column to int (integer) type in a pandas DataFrame. As noted earlier, the conversion does not round or floor the value. It simply truncates the fractional part.

The below example converts the column Discount holding float values to int using DataFrame.astype() function.


# Convert "Discount" from Float to int
df = df.astype({'Discount':'int'})
print(df.dtypes)

Yields below output.


# Output:
Courses     object
Fee          int64
Duration    object
Discount     int64
dtype: object

Similarly, you can also cast all columns or a single column. Refer to examples in the above section for details.

4. Casting Multiple Columns to Integer

You can also convert multiple columns to integers by passing a dict of column name -> data type to the astype() method. The below example converts the Fee column from string to int and the Discount column from float to int.


# Converting Multiple columns to int
df = pd.DataFrame(technologies)
df = df.astype({"Fee":"int","Discount":"int"})
print(df.dtypes)

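Yields below output. The int32 shown here appears on platforms where int maps to a 32-bit integer (for example, Windows with NumPy versions before 2.0). On Linux and macOS you will typically see int64.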


# Output:
Courses     object
Fee          int32
Duration    object
Discount     int32
dtype: object
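When the list of columns is long, you can build the dict programmatically. This is a minimal sketch. The `int_columns` list is a hypothetical name introduced for illustration, and the DataFrame is a shortened version of the article's data.

```python
import pandas as pd

# Shortened version of the article's data
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": ["22000", "25000"],
    "Discount": [1000.10, 2300.15],
})

int_columns = ["Fee", "Discount"]   # hypothetical list of columns to cast

# Build the {column: dtype} mapping and cast in one call
df = df.astype({col: "int64" for col in int_columns})
print(df.dtypes)
```

Using 'int64' instead of 'int' here is a deliberate choice so that the result does not depend on the platform's default integer size.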

5. Using apply(np.int64) to Cast to Integer

You can also use the apply() method to convert the Fee column from string to integer in pandas. Because df["Fee"] is a Series, this calls Series.apply(), which runs numpy.int64 on each value. Before using numpy functions, you need to import the numpy module.


import numpy as np
# Convert "Fee" from string to int using Series.apply(np.int64)
df["Fee"] = df["Fee"].apply(np.int64)
print(df.dtypes)

Yields below output.


# Output:
Courses      object
Fee           int64
Duration     object
Discount    float64
dtype: object
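For comparison, here is a short sketch showing that apply(np.int64) and astype('int64') produce the same values. The `fees` Series is illustrative data standing in for the Fee column.

```python
import numpy as np
import pandas as pd

# Illustrative stand-in for the "Fee" column
fees = pd.Series(["22000", "25000", "23000"])

via_apply = fees.apply(np.int64)     # calls np.int64 on each element
via_astype = fees.astype("int64")    # converts the whole column in one call

print(via_apply.tolist())
print(via_astype.tolist())
```

Since apply() works element by element, astype() is usually the simpler choice for a plain type conversion. apply() is more useful when each value needs custom handling.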

6. Convert Column Containing NaNs to astype(int)

To demonstrate NaN/null handling, let's create a DataFrame that contains NaN values. A float column with NaN values cannot be cast directly with astype(int), so first replace the NaN values with zero and then use astype() to convert.


import pandas as pd
import numpy as np
technologies= {
    'Fee' :[22000.30,25000.40,np.nan,24000.50,26000.10,np.nan]
          }
df = pd.DataFrame(technologies)
print(df)

Use fillna() on the column (Series.fillna() here) to replace the NaN values with the integer value zero.


# Convert "Fee" from float to int 
# and replace NaN values with zeros
df['Fee'] = df['Fee'].fillna(0).astype(int)
print(df)
print(df.dtypes)

Yields below output. The dtype is reported as int32 or int64 depending on the platform's default integer size.


# Output:
     Fee
0  22000
1  25000
2      0
3  24000
4  26000
5      0
Fee    int32
dtype: object
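If replacing missing values with zero is not appropriate for your data, one alternative (not covered in the article itself) is pandas' nullable integer dtype, written "Int64" with a capital I, which can store integers alongside missing values. The sketch below assumes truncation is the behavior you want for the fractional part.

```python
import numpy as np
import pandas as pd

# Illustrative float column with a missing value
fee = pd.Series([22000.30, np.nan, 24000.50])

# Drop the fraction first: casting non-whole floats straight to "Int64" raises an error
fee_nullable = np.trunc(fee).astype("Int64")

print(fee_nullable.dtype)            # Int64
print(fee_nullable.isna().tolist())  # [False, True, False]
```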

Frequently Asked Questions on Pandas Convert Column to Int

How to convert a column to an integer in a Pandas DataFrame?

You can convert a column to an integer in a Pandas DataFrame using the astype() method or by calling apply() on the column.

How do I handle non-integer values when using astype(int)?

You should ensure that the column only contains values that can be safely converted to integers. If you have non-integer values, you’ll need to clean or preprocess the data before using astype(int).
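As one possible way to do that cleanup, the sketch below uses pd.to_numeric() with errors="coerce" to turn unparseable entries into NaN, then fills them before casting. The `raw` Series and the choice of 0 as the fill value are assumptions made for illustration.

```python
import pandas as pd

# Illustrative column with one value that is not a number
raw = pd.Series(["22000", "25000", "n/a", "24000"])

numeric = pd.to_numeric(raw, errors="coerce")   # "n/a" becomes NaN
cleaned = numeric.fillna(0).astype("int64")     # 0 is a placeholder choice

print(cleaned.tolist())  # [22000, 25000, 0, 24000]
```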

Can I specify the integer data type (e.g., int32 or int64) when using astype(int)?

Yes, you can specify the exact integer data type when using astype(). For example, df['column_name'].astype('int32') will convert to int32.

How can I convert multiple columns to integers in a Pandas DataFrame?

To convert multiple columns to integers, pass a dict of column name -> data type to astype(), for example df.astype({"Fee":"int","Discount":"int"}). You can also convert each column individually.

Conclusion

In this article, you have learned how to convert a column from string to int and from float to int using the DataFrame.astype() and Series.apply() methods. You have also learned how to convert a float column to integers when it contains NaN/null values.

Happy Learning !!


Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them.