Pandas – Convert Float to Integer in DataFrame

Use the pandas.DataFrame.astype(int) and pandas.DataFrame.apply(np.int64) methods to convert (cast) a float column to an integer (int/int64) type. A float can hold fractional values that an int cannot, so the downcast itself is easy, but the catch is that you lose everything after the decimal point. Note that casting does not round or floor the value; it simply truncates the fractional part (anything after the "."), so 2.7 becomes 2 and -2.7 becomes -2. In this article, I will explain different ways to convert columns with float values to integer values.
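To make the truncation behaviour concrete, here is a minimal sketch that compares a plain cast with rounding and flooring first. The sample values are hypothetical and chosen only to show where the three approaches differ; "int64" is passed explicitly so the result does not depend on the platform.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values, picked to expose the differences
s = pd.Series([2.7, -2.7, 2.5, 3.5])

truncated = s.astype("int64")          # drops the fraction (moves toward zero)
rounded = s.round().astype("int64")    # rounds first; exact halves go to the even neighbour
floored = np.floor(s).astype("int64")  # always rounds down

print(truncated.tolist())  # [2, -2, 2, 3]
print(rounded.tolist())    # [3, -3, 2, 4]
print(floored.tolist())    # [2, -3, 2, 3]
```

If you need rounded values rather than truncated ones, call round() before casting.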

1. Quick Examples of Converting Float to Integer in pandas

If you are in a hurry, below are some of the quick examples of how to convert float to integer type in Pandas DataFrame.


# Below are quick examples
# converting "Fee" from float to int using DataFrame.astype()
df["Fee"]=df["Fee"].astype(int)
print(df.dtypes)

# converting "Fee" and "Discount" from float to int using DataFrame.astype()
df = df.astype({"Fee":"int","Discount":"int"})
print(df.dtypes)

# convert "Fee" from float to int using DataFrame.apply(np.int64)
df["Fee"] = df["Fee"].apply(np.int64)
print(df.dtypes)

# converting "Fee" and "Discount" from float to int using DataFrame.apply(np.int64)
df["Fee"] = df["Fee"].apply(np.int64)
df["Discount"] = df["Discount"].apply(np.int64)

# convert "Fee" from float to int and replace NaN values
df['Fee'] = df['Fee'].fillna(0).astype(int)
print(df)
print(df.dtypes)

Now, let’s create a DataFrame with a few rows and columns and execute some examples and validate the results. Our DataFrame contains column names Courses, Fee, Duration and Discount.


import pandas as pd
import numpy as np
technologies= {
    'Courses':["Spark","PySpark","Hadoop","Python","Pandas"],
    'Fee' :[22000.30,25000.40,23000.20,24000.50,26000.10],
    'Duration':['30days','50days','35days', '40days','55days'],
    'Discount':[1000.10,2300.15,1000.5,1200.22,2500.20]
          }
df = pd.DataFrame(technologies)
print(df)
print(df.dtypes)

Yields below output.


   Courses      Fee Duration  Discount
0    Spark  22000.3   30days   1000.10
1  PySpark  25000.4   50days   2300.15
2   Hadoop  23000.2   35days   1000.50
3   Python  24000.5   40days   1200.22
4   Pandas  26000.1   55days   2500.20

Courses      object
Fee         float64
Duration     object
Discount    float64
dtype: object

2. Using pandas astype(int) to Convert Float to Integer (Int)

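To convert a float column to an integer column, use the DataFrame.astype() method; you can also apply it to a specific column. The example below converts the Fee column from float64 to an integer type. The output shows int32, but the exact width of int depends on your platform and NumPy version (int64 is common on Linux and macOS), so pass an explicit type such as "int64" if the width matters. You can also pass a numpy.dtype as a param to this method.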


# convert "Fee" from float to int
df["Fee"]=df["Fee"].astype(int)
print(df)
print(df.dtypes)

Yields below output.


   Courses    Fee Duration  Discount
0    Spark  22000   30days   1000.10
1  PySpark  25000   50days   2300.15
2   Hadoop  23000   35days   1000.50
3   Python  24000   40days   1200.22
4   Pandas  26000   55days   2500.20

Courses      object
Fee           int32
Duration     object
Discount    float64
dtype: object
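Because the width you get from astype(int) can differ between machines, it may help to request it explicitly. The sketch below uses a shortened, illustrative version of the Fee column; the variable names fee_64 and fee_32 are only for this example.

```python
import numpy as np
import pandas as pd

# Shortened sample data for illustration
df = pd.DataFrame({"Fee": [22000.30, 25000.40, 23000.20]})

fee_64 = df["Fee"].astype("int64")   # always 64-bit, regardless of platform
fee_32 = df["Fee"].astype(np.int32)  # always 32-bit; values must fit in the int32 range

print(fee_64.dtype)  # int64
print(fee_32.dtype)  # int32
```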

3. Casting Multiple Columns From Float to Integer

Similarly, you can convert multiple columns from float to integer by passing a dict of column name -> data type to the astype() method. The example below converts both the Fee and Discount columns to int.


# converting "Fee" and "Discount" from float to int
df = df.astype({"Fee":"int","Discount":"int"})
print(df.dtypes)

Yields below output.


Courses     object
Fee          int32
Duration    object
Discount     int32
dtype: object
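When there are many float columns, listing each one by hand gets tedious. As a possible variation (not part of the examples above), the sketch below builds the dict automatically with select_dtypes(); it assumes every float64 column should become int64, which may not be what you want for all data.

```python
import pandas as pd

# Shortened sample data for illustration
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [22000.30, 25000.40],
    "Discount": [1000.10, 2300.15],
})

# Collect the names of all float64 columns, then cast them in one call
float_cols = df.select_dtypes(include="float64").columns
df = df.astype({col: "int64" for col in float_cols})

print(df.dtypes)
```

Non-float columns such as Courses are left untouched because they never appear in the dict.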

4. Using pandas apply(np.int64) to Cast From Float to Integer

You can also use the apply() method to convert the Fee column from float to integer in pandas. In this example we pass the NumPy type np.int64, which is called on each value in the column.


import numpy as np
# convert "Fee" from float to int using DataFrame.apply(np.int64)
df["Fee"] = df["Fee"].apply(np.int64)
print(df.dtypes)

Yields below output.


Courses      object
Fee           int64
Duration     object
Discount    float64
dtype: object
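For a single column, apply(np.int64) and astype(np.int64) should produce the same result; the difference is that apply() calls np.int64 once per element, while astype() casts the whole column in one step and is typically the faster choice on large data. A small check on illustrative data:

```python
import numpy as np
import pandas as pd

# Shortened sample data for illustration
s = pd.Series([22000.30, 25000.40, 23000.20])

via_apply = s.apply(np.int64)    # np.int64 is called once per element
via_astype = s.astype(np.int64)  # the whole column is cast in one vectorized step

# equals() compares both the values and the dtype
print(via_apply.equals(via_astype))  # True
```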

5. Convert Pandas Column Containing NaNs to astype(int)

To demonstrate NaN/null handling, let’s create a DataFrame that contains NaN values. Calling astype(int) directly on such a column raises an error, because NaN cannot be represented as an integer. To convert a column that includes a mixture of float and NaN values to int, first replace the NaN values with zero and then use astype() to convert.


import pandas as pd
import numpy as np
technologies= {
    'Fee' :[22000.30,25000.40,np.nan,24000.50,26000.10,np.nan]
          }
df = pd.DataFrame(technologies)
print(df)
print(df.dtypes)

Use .fillna() to replace the NaN values with the integer value zero, for example df['Fee'] = df['Fee'].fillna(0).astype(int).


# convert "Fee" from float to int and replace NaN values
df['Fee'] = df['Fee'].fillna(0).astype(int)
print(df)
print(df.dtypes)

Yields below output.


     Fee
0  22000
1  25000
2      0
3  24000
4  26000
5      0
Fee    int32
dtype: object
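Replacing NaN with zero changes the meaning of the data, since a missing fee is not the same as a fee of 0. If you would rather keep missing values, one alternative (an addition to the approach above, not a replacement for it) is the pandas nullable integer dtype "Int64", written with a capital I. The sketch below assumes you still want truncation, so it calls np.trunc() first; casting fractional floats straight to "Int64" is rejected by pandas.

```python
import numpy as np
import pandas as pd

# Shortened sample data for illustration
df = pd.DataFrame({"Fee": [22000.30, np.nan, 24000.50]})

# A direct cast fails because NaN has no integer representation
try:
    df["Fee"].astype("int64")
    direct_cast_failed = False
except ValueError:
    direct_cast_failed = True

# Truncate first, then cast to the nullable "Int64" dtype; NaN becomes <NA>
fee_nullable = np.trunc(df["Fee"]).astype("Int64")

print(direct_cast_failed)
print(fee_nullable)
```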

Conclusion

In this article, you have learned how to convert float to integer in a pandas DataFrame using the DataFrame.astype(int) and DataFrame.apply(np.int64) methods. You have also learned how to convert floats to integers when a column contains NaN/null values.

Happy Learning !!

NNK
