Pandas Convert String to Integer

To convert a string column to an integer (int) in a Pandas DataFrame or Series, use the Series.astype(int) or pandas.to_numeric() functions. In this article, I will explain how to convert one or multiple string columns to integer type with examples.

1. Quick Examples of Converting String to Integer

If you are in a hurry, below are some quick examples of how to convert or cast string to integer dtype.


# Below are quick examples

# Example 1: Convert a string column to an integer
df["Fee"] = df["Fee"].astype(int)
print(df.dtypes)

# Example 2: Change a specific column type
df.Fee = df['Fee'].astype('int')
print(df.dtypes)

# Example 3: Multiple columns integer conversion
df[['Fee', 'Discount']] = df[['Fee', 'Discount']].astype(int)
print(df.dtypes)

# Example 4: Convert the strings to integers using to_numeric()
df['Fee'] = pd.to_numeric(df['Fee'])
print(df.dtypes)

# Example 5: Convert the strings to integers using str.replace() & astype()
# (astype(str) keeps the .str accessor usable if Fee is already numeric)
df['Fee'] = df['Fee'].astype(str).str.replace('[^0-9]', '', regex=True).astype('int64')
print(df.dtypes)

2. Series.astype() Syntax

Following is the syntax of Series.astype(). This function takes the dtype, copy, and errors parameters.


# astype() Syntax
Series.astype(dtype, copy=True, errors='raise')

2.1 Parameters of astype()

Following are the parameters of the astype() function.

  • dtype – Accepts a numpy.dtype or Python type to cast the entire pandas object to the same type.
  • copy – Default True. Returns a copy when copy=True.
  • errors – Default 'raise'.
    • Use 'raise' to generate an exception when the cast fails because the data is invalid for the target type.
    • Use 'ignore' to suppress exceptions. On error, the original object is returned.
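The two errors settings described above can be sketched as follows. This is a minimal illustration, not part of the original examples; the sample Series s and the names raised and kept are hypothetical.

```python
import pandas as pd

# Hypothetical sample data: the second value cannot be parsed as an integer
s = pd.Series(['22000', '30days'])

# errors='raise' (the default) raises ValueError on the invalid value
try:
    s.astype(int)
    raised = False
except ValueError as exc:
    raised = True
    print("Conversion failed:", exc)

# errors='ignore' suppresses the exception and returns the original Series
kept = s.astype(int, errors='ignore')
print(kept.tolist())
```

With errors='ignore' the column silently stays as strings, so it is worth checking the dtype afterwards.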

2.2 Return value of astype()

It returns a Series with the changed data type.
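As a small sketch of what "returns a Series" means in practice: astype() does not modify the Series it is called on, so the result has to be assigned back. The variable names s and out below are illustrative only.

```python
import pandas as pd

# Hypothetical sample Series of numeric strings
s = pd.Series(['1', '2', '3'])

# astype() returns a new Series; s itself still holds strings
out = s.astype('int64')

print(s.tolist())    # the original string values
print(out.tolist())  # the converted integer values
print(out.dtype)
```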

Now, let's create a Pandas DataFrame using data from a Python dictionary, where the columns are Courses, Fee, Duration, and Discount.


import pandas as pd
import numpy as np

# All values are stored as strings, so every column starts out as object dtype
technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Pandas"],
    'Fee': ['22000', '25000', '24000', '26000'],
    'Duration': ['30days', '50days', '40days', '60days'],
    'Discount': ['1000', '2300', '2500', '1400']
}
df = pd.DataFrame(technologies)

print(df.dtypes)

Yields below output.


# Output:
Courses     object
Fee         object
Duration    object
Discount    object
dtype: object

2.3 Convert String Column to Integer Using astype()

We can use Pandas Series.astype() to convert or cast a string to an integer in a specific DataFrame column or Series. Since each column of a DataFrame is a pandas Series, I will get the column from the DataFrame as a Series and call astype() on it. In the example below, both df.Fee and df['Fee'] return a Series object.

To cast one or more DataFrame columns at once, pass DataFrame.astype() a dict of the form {col: dtype, …}, where col is a column label and dtype is a numpy.dtype or Python type.


# Convert string to an integer
df["Fee"] = df["Fee"].astype(int)
print(df.dtypes)

# Change specific column type
df.Fee = df['Fee'].astype('int')
print(df.dtypes)

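Yields below output. The integer width depends on the environment: astype(int) uses NumPy's default integer, which is int32 on Windows with NumPy versions before 2.0 and int64 on Linux and macOS.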


# Output:
Courses     object
Fee          int32
Duration    object
Discount    object
dtype: object
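Because astype(int) follows NumPy's default integer width (int32 in the output above), naming the width explicitly is one way to get the same dtype everywhere. The snippet below is a minimal sketch using a cut-down copy of the Fee column.

```python
import pandas as pd

df = pd.DataFrame({'Fee': ['22000', '25000', '24000', '26000']})

# Request a 64-bit integer explicitly instead of relying on the platform default
df['Fee'] = df['Fee'].astype('int64')

print(df['Fee'].dtype)  # int64
print(df['Fee'].sum())  # 97000
```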

3. Convert Multiple String Columns to Integer

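We can also convert multiple string columns to integers by selecting them with a list of column names and calling astype() on the result; alternatively, pass a dict that maps each column name to its data type to DataFrame.astype(). The below example converts the columns 'Fee' and 'Discount' from string to integer dtype.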


# Multiple columns integer conversion
df[['Fee', 'Discount']] = df[['Fee','Discount']].astype(int)
print(df.dtypes)

Yields below output.


# Output:
Courses     object
Fee          int32
Duration    object
Discount     int32
dtype: object
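The dict form of DataFrame.astype() mentioned earlier might look like the sketch below. The shortened DataFrame and the choice of int64 and int32 as targets are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ['Spark', 'PySpark'],
    'Fee': ['22000', '25000'],
    'Discount': ['1000', '2300'],
})

# Map each column label to its target dtype; columns not listed stay unchanged
converted = df.astype({'Fee': 'int64', 'Discount': 'int32'})
print(converted.dtypes)
```

The dict form is handy when different columns need different integer widths, which the list-selection approach cannot express in one call.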

4. Using pandas.to_numeric()

Alternatively, you can convert a string column to integer type in pandas using to_numeric(), which takes a Series (or other one-dimensional input) and infers a suitable numeric dtype. For example, use df['Fee'] = pd.to_numeric(df['Fee']) to convert the 'Fee' column to int.


# Using pandas.to_numeric()
df['Fee'] = pd.to_numeric(df['Fee'])
print(df.dtypes)

Yields below output.


# Output:
Courses     object
Fee          int64
Duration    object
Discount    object
dtype: object
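Since to_numeric() handles one Series at a time, converting several columns needs an extra step. The sketch below uses DataFrame.apply() for that and also shows errors='coerce', which replaces unparseable strings with NaN; the shortened DataFrame and the names cols and duration are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Fee': ['22000', '25000'],
    'Discount': ['1000', '2300'],
    'Duration': ['30days', '50days'],
})

# Apply to_numeric() column by column to convert several columns at once
cols = ['Fee', 'Discount']
df[cols] = df[cols].apply(pd.to_numeric)

# errors='coerce' turns strings that cannot be parsed into NaN instead of raising
duration = pd.to_numeric(df['Duration'], errors='coerce')

print(df.dtypes)
print(duration.isna().all())  # True: every Duration value contains letters
```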

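If a column holds values with letters in them, such as '30days', astype(int) and to_numeric() raise an error on those values. If you don't want to lose them, use str.replace() with a regex pattern to drop the non-digit characters before casting.


# Convert the strings to integers using str.replace() & astype()
# astype(str) keeps the .str accessor usable if Fee was already converted above
df['Fee'] = df['Fee'].astype(str).str.replace('[^0-9]', '', regex=True).astype('int64')
print(df.dtypes)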

Yields the same output as above.
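The same pattern is more useful on a column that really contains letters. As a hedged sketch, the snippet below applies it to the Duration values from this article; the new column name Days is a hypothetical choice.

```python
import pandas as pd

df = pd.DataFrame({'Duration': ['30days', '50days', '40days', '60days']})

# Remove every non-digit character, then cast the remaining digits to int64
df['Days'] = df['Duration'].str.replace(r'[^0-9]', '', regex=True).astype('int64')

print(df['Days'].tolist())  # [30, 50, 40, 60]
```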

5. Complete Example of Converting String to Integer


import pandas as pd
import numpy as np

technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Pandas"],
    'Fee': ['22000', '25000', '24000', '26000'],
    'Duration': ['30days', '50days', '40days', '60days'],
    'Discount': ['1000', '2300', '2500', '1400']
}
df = pd.DataFrame(technologies)
print(df)
print(df.dtypes)

# Convert string to an integer
df["Fee"] = df["Fee"].astype(int)
print(df.dtypes)

# Change specific column type
df.Fee = df['Fee'].astype('int')
print(df.dtypes)

# Multiple columns integer conversion
df[['Fee', 'Discount']] = df[['Fee', 'Discount']].astype(int)
print(df.dtypes)

# Convert the strings to integers using to_numeric()
df['Fee'] = pd.to_numeric(df['Fee'])
print(df.dtypes)

# Convert the strings to integers using str.replace() & astype()
# Fee is already numeric at this point, so cast to str before using .str
df['Fee'] = df['Fee'].astype(str).str.replace('[^0-9]', '', regex=True).astype('int64')
print(df.dtypes)

6. Conclusion

In this article, I have explained how to convert a single column and multiple columns from string to integer type in a Pandas DataFrame using the Series.astype(int) and pandas.to_numeric() functions.

Happy Learning !!

Malli

I am Mallikarjuna, an experienced technical writer with a passion for translating complex Python concepts into clear, concise, and user-friendly documentation. Over the years, I have written hundreds of articles on Pandas, NumPy, and Python, and I take pride in my ability to bridge the gap between technical experts and end-users by delivering well-structured, accessible, and informative content.
