  • Post category:Pandas
  • Post last modified:October 13, 2024
Pandas Convert Column to Float in DataFrame

By using pandas DataFrame.astype() and pandas.to_numeric() methods you can convert a column from string/int type to float. In this article, I will explain how to convert one or multiple string columns to float type using examples.


Key Points –

  • Use pd.to_numeric() to convert a column to numeric type.
  • Use astype(float) for straightforward conversion if data is clean.
  • Handle string formatting issues like commas or currency symbols beforehand.
  • Specify errors='coerce' to force non-convertible values to NaN.
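  • Use Series.str.replace() with regex=True to strip non-numeric characters before calling pd.to_numeric(), since pd.to_numeric() itself has no regex option.
  • Use pd.to_numeric() with downcast='float' to downcast to the smallest float dtype that fits the values (float32 at minimum) for memory efficiency.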

Quick Examples of Converting String to Float in DataFrame

If you are in a hurry, below are some quick examples of how to convert string to float. You can apply these to convert from any type in Pandas.


# Quick examples of converting string to float

# Example 1: Convert "Fee" from string to float
df['Fee'] = df['Fee'].astype(float)
print(df.dtypes)

# Example 2: Convert multiple columns
df = df.astype({'Fee':'float','Discount':'float'})

# Example 3: Convert all columns to floats
df = df.astype(float)
print(df.dtypes)

# Example 4: Convert a string column to float using pd.to_numeric()
df['Discount'] = pd.to_numeric(df['Discount'])
print(df.dtypes)

# Example 5: Convert a string column to float and downcast to the smallest float dtype
df["Discount"] = pd.to_numeric(df["Discount"], downcast="float")
print(df.dtypes)

# Example 6: Convert to float, turning non-numeric values into NaN
df['Discount'] = pd.to_numeric(df['Discount'], errors='coerce')
print(df.dtypes)

# Example 7: Using df.replace() to replace NaN values with 0 after conversion
df['Discount'] = pd.to_numeric(df['Discount'], errors='coerce')
df = df.replace(np.nan, 0)
print(df.dtypes)

# Example 8: Replace empty string ('') with np.nan before conversion
df['Discount']=df.Discount.replace('',np.nan).astype(float)
print(df.dtypes)

Now, let’s create a DataFrame with a few rows and columns, execute the above examples, and validate the results. Our DataFrame contains column names Fee and Discount.


# Create DataFrame
import pandas as pd
import numpy as np
technologies= ({
         'Fee' :['22000.30','25000.40','23000.20','24000.50','26000.10'],
         'Discount':['1000.10',np.nan,'1000.5',np.nan,'2500.20']
             })
df = pd.DataFrame(technologies)
print("Create DataFrame:\n", df)
print("-----------------------------------")
print("Type of the columns:\n", df.dtypes)

You can identify the data type of each column by using dtypes. For instance, print(df.dtypes) reports both Fee and Discount as object. Here object means a string type.

Pandas Convert String to Float

You can use the Pandas DataFrame.astype() function to convert a column from string/int to float. You can apply it to a specific column or to an entire DataFrame. To cast to a 64-bit float, pass numpy.float64, float, or 'float64' as the param. To cast to a 32-bit float, use numpy.float32 or 'float32'.

The below example converts the Fee column from string dtype to float64.


# Convert "Fee" from string to float
df = df.astype({'Fee':'float'})
print("Convert specific column to float type:\n", df)
print("-----------------------------------")
print("Type of the columns:\n", df.dtypes)

After this conversion, df.dtypes reports Fee as float64, while Discount is unchanged.

You can also use Series.astype() to convert a specific column. Since each column of a DataFrame is a pandas Series, you can select the column as a Series and call astype() on it. In the below example, df.Fee or df['Fee'] returns a Series object.


# Convert "Fee" from string to float
df['Fee'] = df['Fee'].astype(float)
print(df.dtypes)

Yields the same output as above.
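The 32-bit option mentioned earlier can be sketched in the same way. This is a minimal illustration that uses a shortened copy of the Fee column; the variable names fee64 and fee32 are only for this example.

```python
import numpy as np
import pandas as pd

# Shortened copy of the article's Fee column (values stored as strings)
df = pd.DataFrame({"Fee": ["22000.30", "25000.40", "23000.20"]})

# Cast the same string column to 64-bit and 32-bit floats
fee64 = df["Fee"].astype("float64")
fee32 = df["Fee"].astype(np.float32)

print(fee64.dtype)   # float64
print(fee32.dtype)   # float32

# float32 uses half the bytes per value, at the cost of precision
print(fee64.nbytes, fee32.nbytes)
```

float32 keeps roughly 7 significant digits, so it is best suited to columns where that precision is acceptable.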

Convert Multiple Columns to Float

You can also convert multiple columns to float by passing a dict that maps column names to data types to the astype() method. The below example converts the Fee and Discount columns to float dtype.


# Convert multiple columns 
df = df.astype({'Fee':'float','Discount':'float'})
print("Convert multiple columns to float type:")
print("Type of the columns:\n", df.dtypes)

# Output:
# Convert multiple columns to float type:
# Type of the columns:
#  Fee         float64
# Discount    float64
# dtype: object

Convert All Columns to Float Type

When you pass a single type, DataFrame.astype() converts all columns to that type. The below example converts all DataFrame columns to float type. If any column contains alpha-numeric values, astype() raises a ValueError.


# Convert entire DataFrame string to float
df = df.astype(float)
print("Convert all columns to float type:")
print("Type of the columns:\n", df.dtypes)

Yields below output.


# Output:
# Convert all columns to float type:
# Type of the columns:
Fee         float64
Discount    float64
dtype: object
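The error case mentioned above can be sketched as follows. The Courses column is a hypothetical text column added only to trigger the failure; it is not part of the article's DataFrame.

```python
import pandas as pd

# "Courses" is a hypothetical text column used to provoke the error
df = pd.DataFrame({
    "Fee": ["22000.30", "25000.40"],
    "Courses": ["Spark", "PySpark"],
})

try:
    df = df.astype(float)
    converted = True
except ValueError as exc:
    # astype(float) cannot parse 'Spark', so the whole call fails
    converted = False
    print("Conversion failed:", exc)

print(df.dtypes)  # df is left unchanged when the conversion fails
```

Because the call fails as a whole, no column is converted. Converting only the numeric columns with a dict or a column subset avoids this.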

Using pandas.to_numeric()

Alternatively, you can convert a string column to float type using pandas.to_numeric(), which works on one column (Series) at a time. For example, use df['Discount'] = pd.to_numeric(df['Discount']) to convert the ‘Discount’ column to float.


# Convert a string column to float using pd.to_numeric()
df['Discount'] = pd.to_numeric(df['Discount'])
print("Type of the columns:\n", df.dtypes)

# Convert a string column to float and downcast to the smallest float dtype
df["Discount"] = pd.to_numeric(df["Discount"], downcast="float")
print("Type of the columns:\n", df.dtypes)

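The first print yields the below output. After the downcast="float" call, Discount is typically reported as float32 instead, provided the values fit.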


# Output:
# Type of the columns:
Fee          object
Discount    float64
dtype: object
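The effect of downcast="float" can be checked with a small sketch. The sample values here are chosen for illustration because they are exactly representable in float32; the names as64 and as32 are only for this example.

```python
import pandas as pd

# Illustrative string values that fit exactly in float32
s = pd.Series(["1.5", "2.25", "3.0"])

as64 = pd.to_numeric(s)                    # default result dtype
as32 = pd.to_numeric(s, downcast="float")  # smallest float dtype that fits

print(as64.dtype)  # float64
print(as32.dtype)  # float32
print(as64.nbytes, as32.nbytes)
```

The downcast is applied only when the values survive the conversion within pandas' tolerance, so checking the resulting dtype is safer than assuming it.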

Handling Non-numeric Values

When some cells in the column you want to convert contain character values, the conversion raises a ValueError. To suppress the error and convert those values to NaN, use the errors='coerce' parameter.


# Convert to float, turning non-numeric values into NaN
df['Discount'] = pd.to_numeric(df['Discount'], errors='coerce')
print("Type of the columns:\n", df.dtypes)

This yields the same output as above.
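The sample Discount column contains no character values, so the effect of errors='coerce' is easier to see on data that does. Below is a minimal sketch with made-up values ('N/A' and 'unknown' are assumptions for illustration).

```python
import pandas as pd

# Made-up column with two values that cannot be parsed as numbers
s = pd.Series(["1000.10", "N/A", "2500.20", "unknown"])

# Without errors='coerce', the conversion raises a ValueError
try:
    pd.to_numeric(s)
    raised = False
except ValueError:
    raised = True

# With errors='coerce', unparseable values become NaN
coerced = pd.to_numeric(s, errors="coerce")

print(raised)
print(coerced)
print(coerced.isna().sum())  # number of values that were coerced to NaN
```

Counting the NaN values before and after the conversion is a simple way to see how many cells were silently coerced.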

Replace the ‘NaN’ Values with Zeros

Use df = df.replace(np.nan, 0) to replace the ‘NaN’ values with ‘0’ values. The regex=True argument is not needed here because np.nan is not a string pattern.


# Using df.replace() to replace NaN values with 0
df['Discount'] = pd.to_numeric(df['Discount'], errors='coerce')
df = df.replace(np.nan, 0)
print("After replacing NaN with zeros:\n", df)
print("Type of the columns:\n", df.dtypes)

Yields below output.


# Output:
# After replacing NaN with zeros:
        Fee  Discount
0  22000.30    1000.1
1  25000.40       0.0
2  23000.20    1000.5
3  24000.50       0.0
4  26000.10    2500.2

# Type of the columns:
Fee          object
Discount    float64
dtype: object
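As an alternative to replace(), fillna() is a common way to fill missing values. The sketch below uses a shortened copy of the Discount column and fills only that column, leaving the rest of the DataFrame untouched.

```python
import numpy as np
import pandas as pd

# Shortened copy of the article's Discount column
df = pd.DataFrame({"Discount": ["1000.10", np.nan, "2500.20"]})

# Convert to float, then fill the missing values with 0
df["Discount"] = pd.to_numeric(df["Discount"], errors="coerce").fillna(0)

print(df)
print(df.dtypes)
```

Filling with 0 changes sums and means, so it is worth confirming that 0 is a meaningful default for the column.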

Replace Empty String before Convert

If a string column contains empty values, replace the empty string ('') with np.nan before converting it to float.


import pandas as pd
import numpy as np
technologies= ({
         'Fee' :['22000.30','25000.40','23000.20','24000.50','26000.10','21000'],
         'Discount':['1000.10',np.nan,"",np.nan,'2500.20',""]
             })
df = pd.DataFrame(technologies)
# Replace empty string ('') with np.nan
df['Discount']=df.Discount.replace('',np.nan).astype(float)
print("After replacing an empty string with NaN:\n", df)
print("Type of the columns:\n", df.dtypes)

Yields below output.


# Output:
# After replacing an empty string with NaN:
        Fee  Discount
0  22000.30    1000.1
1  25000.40       NaN
2  23000.20       NaN
3  24000.50       NaN
4  26000.10    2500.2
5     21000       NaN

# Type of the columns:
Fee          object
Discount    float64
dtype: object
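Real data often contains cells with only spaces rather than truly empty strings. The following sketch assumes such whitespace-only cells (not present in the sample data above) and strips them before the replacement.

```python
import numpy as np
import pandas as pd

# Assumed data: one whitespace-only cell and one empty string
s = pd.Series(["1000.10", np.nan, "  ", "", "2500.20"])

# Strip surrounding whitespace so "  " becomes "", then replace "" with NaN
cleaned = s.str.strip().replace("", np.nan)
as_float = cleaned.astype(float)

print(as_float)
print(as_float.dtype)  # float64
```

The .str accessor leaves existing NaN values untouched, so they pass through to the final float column as NaN.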

Frequently Asked Questions on Convert Pandas Column to Float

How do I convert a specific string column to a float in Pandas?

You can use the astype() method to convert a specific string column to float. For example, df['specified_column'] = df['specified_column'].astype(float)

How can I convert multiple string columns to float at once?

You can convert multiple columns simultaneously using the Pandas astype() method. For example, df[['Column1', 'Column2']] = df[['Column1', 'Column2']].astype(float)

What if my string column contains non-numeric values?

If the string column contains non-numeric values, the conversion will raise a ValueError. You may need to handle or clean the non-numeric values before converting.

How do I handle missing values (NaN) during the conversion?

NaN values in a column are preserved as NaN when you convert it to float. If the column also contains strings that cannot be parsed, use the pd.to_numeric function with the errors='coerce' parameter to turn them into NaN as well. For example, df['ColumnName'] = pd.to_numeric(df['ColumnName'], errors='coerce').astype(float)

How can I handle commas or other non-numeric characters in my string column?

If the string column contains non-numeric characters like commas, you may need to remove them before conversion. For example, df['ColumnName'] = df['ColumnName'].str.replace(',', '').astype(float)
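When a column mixes thousands separators with currency symbols, a regular expression can remove both in one pass. This is a minimal sketch with made-up values; the pattern [$,] only covers dollar signs and commas and would need extending for other symbols.

```python
import pandas as pd

# Made-up values with a currency symbol and thousands separators
s = pd.Series(["$1,000.50", "2,500.25", "300"])

# Remove every "$" and "," (inside [...] the "$" is a literal character)
cleaned = s.str.replace(r"[$,]", "", regex=True)
as_float = pd.to_numeric(cleaned)

print(cleaned.tolist())  # ['1000.50', '2500.25', '300']
print(as_float.dtype)    # float64
```

This approach assumes the comma is a thousands separator. Data that uses a comma as the decimal mark needs a different replacement.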

How can I convert all string columns in the DataFrame to float in one go?

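You can convert all string columns in a DataFrame to float by selecting them with the select_dtypes method and then calling astype(). For example, cols = df.select_dtypes(include='object').columns followed by df[cols] = df[cols].astype(float). Avoid df.astype(float, errors='ignore') for this: if any column fails to convert, it returns the original DataFrame unchanged.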
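If some object columns hold real text, one possible approach is to try each column and keep the ones that do not parse. The helper convert_numeric_strings and the Courses column below are hypothetical names used only for this sketch, and the frame is built with dtype=object so the example behaves the same across pandas versions.

```python
import pandas as pd

def convert_numeric_strings(frame):
    """Return a copy where object columns that fully parse become float."""
    out = frame.copy()
    for col in out.select_dtypes(include="object").columns:
        try:
            out[col] = pd.to_numeric(out[col]).astype(float)
        except (ValueError, TypeError):
            # Leave columns with non-numeric text as they are
            pass
    return out

# "Courses" is a hypothetical text column that should stay untouched
df = pd.DataFrame(
    {"Fee": ["22000.30", "25000.40"], "Courses": ["Spark", "PySpark"]},
    dtype=object,
)

result = convert_numeric_strings(df)
print(result.dtypes)
```

Working on a copy keeps the original DataFrame intact, which makes it easy to compare dtypes before and after.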

Conclusion

In this article, you have learned how to convert single, multiple, and all columns from string type to float in Pandas DataFrame using DataFrame.astype(float) and pandas.to_numeric() function with multiple examples.

Happy Learning !!


This Post Has 3 Comments

  1. Kelly Cox

    This is *almost* what I want to do! I want to convert a column from string to float–but my column header, call it my field, is text that I want to remain as text. Somehow I need to slice that column during the conversion.

  2. NNK

    Marco, Glad it was helpful.

  3. Marco

    Thank you very much for your help !