• Post author:
  • Post category:Pandas
  • Post last modified:December 9, 2024
  • Reading time:19 mins read
You are currently viewing Pandas – Convert Index to Column in DataFrame

You can use reset_index() to create/convert the index/multi-index to a column of pandas DataFrame. Besides this, there are other ways as well. If you are not aware by default, pandas add an index to each row of the pandas DataFrame. This index value starts with zero for the first row and increments by 1 for each row (sequence index value for each row). You can also assign a custom index to DataFrame according to your need.

Advertisements

In this article, I will explain different ways to convert the index to the DataFrame column with examples like adding default index and custom index as a column to DataFrame.

Key Points –

  • The primary method to convert the index into a column is using reset_index(), which moves the index into a regular column.
  • By default, reset_index() moves the index to a new column named index.
  • The reset_index() method can be used with the inplace=True parameter to modify the DataFrame without creating a new one.
  • If the DataFrame has a MultiIndex, all levels of the index will be converted to separate columns when calling reset_index().
  • You can specify specific index levels to reset by passing the level(s) argument to reset_index() when dealing with MultiIndex DataFrames.
  • After resetting the index, you may need to rename the new index column for clarity or consistency with other columns.

1. Quick Examples of Convert Index to Column in Pandas DataFrame

Below are some useful examples of how to convert an index to a column in pandas DataFrame.


# Quick examples of convert index to column

# Create a new column with index values
df['index_column'] = df.index

# Using reset_index() 
# To convert index into column
df2=df.reset_index()

# Using reset_index, inplace=True
df.reset_index(inplace=True)

# When you have multiple index 
# Specify which index you wanted to convert to column
df.reset_index(inplace=True, level=['Fee'])

# Preserve Index as-is and add one index as column
df2 = df.assign(Fee=df.index.get_level_values('Fee'))

# Preserve Index as-is and add one index as column
# Using assign operator
df['Fee'] = df.index.get_level_values('Fee') 

Let’s create a DataFrame with a few rows and columns, execute these examples and validate results. Our DataFrame contains column names Courses, Fee, Duration, and Discount.


import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
    'Fee' :[20000,25000,26000,22000,24000,21000,22000],
    'Duration':['30day','40days','35days','40days','60days','50days','55days'],
    'Discount':[1000,2300,1500,1200,2500,2100,2000]
              }
df = pd.DataFrame(technologies)
print(df)

Yields below output.


# Output:
   Courses    Fee Duration  Discount
0    Spark  20000    30day      1000
1  PySpark  25000   40days      2300
2   Hadoop  26000   35days      1500
3   Python  22000   40days      1200
4   pandas  24000   60days      2500
5   Oracle  21000   50days      2100
6     Java  22000   55days      2000

2. Create a New Index Column in DataFrame

One simple way to convert an index to a column is by assigning an index as a new column to DataFrame. DataFrame.index property returns a Series object of an index, assign this to DataFrame as shown below which creates a new column with index value on DataFrame.


# Create a new column with index values
df['index'] = df.index
print(df)

Yields below output. It adds a new column index_column with index values to DataFrame.


# Output:
    Courses    Fee Duration  Discount index_column
0     Spark  20000    30day      1000            0
1   PySpark  25000   40days      2300            1
2    Hadoop  26000   35days      1500            2
3    Python  22000   40days      1200            3
4    pandas  24000   60days      2500            4
5    Oracle  21000   50days      2100            5
6      Java  22000   55days      2000            6

You can also add a new custom column as an index and convert it to a column


# Create DataFrame with custom index
index=['r0','r1','r2','r3','r4','r5','r6']
df = pd.DataFrame(technologies,index=index)
df['index_column'] = df.index
print(df)

Results


# Output:
    Courses    Fee Duration  Discount index_column
r0    Spark  20000    30day      1000           r0
r1  PySpark  25000   40days      2300           r1
r2   Hadoop  26000   35days      1500           r2
r3   Python  22000   40days      1200           r3
r4   pandas  24000   60days      2500           r4
r5   Oracle  21000   50days      2100           r5
r6     Java  22000   55days      2000           r6

3. Convert the Index to Column Using DataFrame.reset_index()

Another way is by using the Dataframe.reset_index() function to convert the index as a column in Pandas DataFrame.


# Using reset_index to convert index to column
df = pd.DataFrame(technologies,index=index)
df2=df.reset_index()
print(df2)

Yields below output. This adds a new column index to DataFrame and returns a copy of the DataFrame instead of updating the existing DataFrame.


# Output:
  index  Courses    Fee Duration  Discount
0    r0    Spark  20000    30day      1000
1    r1  PySpark  25000   40days      2300
2    r2   Hadoop  26000   35days      1500
3    r3   Python  22000   40days      1200
4    r4   pandas  24000   60days      2500
5    r5   Oracle  21000   50days      2100
6    r6     Java  22000   55days      2000

Use inplace=True parameter to reflects the change in the DataFrame to stay permanent. If you have multiple indexes, this converts all index levels to columns.


# Convert the index to column 
# Using reset_index
df.reset_index(inplace=True)
print(df)

Yields same output as above.

4. Convert the Index to Column Using Rename_Axis and Reset_Index

You can also convert the index to the column using rename_axis() and reset_index() function. You should first rename your index to the custom label and then apply reset_index().


# Using Rename_Axis and Reset_Index
df = pd.DataFrame(technologies,index=index)
df2 = df.rename_axis('index_column').reset_index()
print(df2)

Yields below output.


# Output:
  index_column  Courses    Fee Duration  Discount
0           r0    Spark  20000    30day      1000
1           r1  PySpark  25000   40days      2300
2           r2   Hadoop  26000   35days      1500
3           r3   Python  22000   40days      1200
4           r4   pandas  24000   60days      2500
5           r5   Oracle  21000   50days      2100
6           r6     Java  22000   55days      2000

5. Convert Multi_Index to Multiple Columns

Let’s look at another DataFrame with two levels in its index or multi-index.


# Create new DataFrame with multi index
import pandas as pd
new_index = pd.MultiIndex.from_tuples([("Spark", 20000),
                                       ("PySpark", 25000),
                                       ("Hadoop", 26000),
                                       ("Python", 22000),
                                       ("pandas", 24000),
                                       ("Oracle",21000),
                                       ("Java", 23000)],
                                       names=['Courses','Fee'])
Data = {
    'Duration':['30day','40days','35days','40days','60days','50days','55days'],
    'Discount':[1000,2300,1500,1200,2500,2100,2000],
    'Address':["Mumbai","Delhi","Hyderabad","Bangalore","Chennai","Kolkata","Lucknow"]
       }
df = pd.DataFrame(Data, columns = ['Duration','Discount','Address'],
                  index=new_index)
print(df)

Yields below results with two Indexes Courses and Fee


# Output:
              Duration  Discount    Address
Courses Fee                                
Spark   20000    30day      1000     Mumbai
PySpark 25000   40days      2300      Delhi
Hadoop  26000   35days      1500  Hyderabad
Python  22000   40days      1200  Bangalore
pandas  24000   60days      2500    Chennai
Oracle  21000   50days      2100    Kolkata
Java    23000   55days      2000    Lucknow

Let’s convert these multi-indexes as columns to DataFrame. The reset_index() method with the default parameters converts all index levels to columns and uses a simple RangeIndex as a new index.


# Convert all multi indexes to columns
df.reset_index(inplace=True)

Yields below output. As you see it converted indexes Courses and Fee as DataFrame columns.


# Output:
   Courses    Fee Duration  Discount    Address
0    Spark  20000    30day      1000     Mumbai
1  PySpark  25000   40days      2300      Delhi
2   Hadoop  26000   35days      1500  Hyderabad
3   Python  22000   40days      1200  Bangalore
4   pandas  24000   60days      2500    Chennai
5   Oracle  21000   50days      2100    Kolkata
6     Java  23000   55days      2000    Lucknow

6. Using Multiple Columns to Convert Multi_Index and Level

Use the level parameter to control which index to convert when you have multiple indexes on DataFrame. If possible, use the level name, which is more explicit.


# Using multiple columns 
# To convert multi_index and level
df = pd.DataFrame(data,index=new_index)
df.reset_index(inplace=True, level=['Fee'])
print(df)

Yields below output


# Output:
           Fee Duration  Discount    Address
Courses                                     
Spark    20000    30day      1000     Mumbai
PySpark  25000   40days      2300      Delhi
Hadoop   26000   35days      1500  Hyderabad
Python   22000   40days      1200  Bangalore
pandas   24000   60days      2500    Chennai
Oracle   21000   50days      2100    Kolkata
Java     23000   55days      2000    Lucknow

In case if you wanted to persist the index as-is and add one of the indexes as a column


df = pd.DataFrame(data,index=new_index)
df2=df.assign(Fee=df.index.get_level_values('Fee'))
print(df2)

# Also use
df['Fee'] = df.index.get_level_values('Fee')
print(df) 

Yields below output.


# Output:
              Duration  Discount    Address    Fee
Courses Fee                                       
Spark   20000    30day      1000     Mumbai  20000
PySpark 25000   40days      2300      Delhi  25000
Hadoop  26000   35days      1500  Hyderabad  26000
Python  22000   40days      1200  Bangalore  22000
pandas  24000   60days      2500    Chennai  24000
Oracle  21000   50days      2100    Kolkata  21000
Java    23000   55days      2000    Lucknow  23000

Compelte Example


import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
    'Fee' :[20000,25000,26000,22000,24000,21000,22000],
    'Duration':['30day','40days','35days','40days','60days','50days','55days'],
    'Discount':[1000,2300,1500,1200,2500,2100,2000]
              }

df = pd.DataFrame(technologies)
print(df)

# Create a new index
df['index_column'] = df.index
print(df)

index=['r0','r1','r2','r3','r4','r5','r6']
df = pd.DataFrame(technologies,index=index)
df['index_column'] = df.index
print(df)

# Using reset_index
df = pd.DataFrame(technologies,index=index)
df2=df.reset_index()
print(df2)

# Using reset_index, inplace=True
df = pd.DataFrame(technologies,index=index)
df.reset_index(inplace=True)
print(df)

# Using rename_axis() and reset_index()
df = pd.DataFrame(technologies,index=index)
df = df.rename_axis('index_column').reset_index()
print(df)

# Create new DataFrame with multi index
import pandas as pd
new_index = pd.MultiIndex.from_tuples([("Spark", 20000),
                                       ("PySpark", 25000),
                                       ("Hadoop", 26000),
                                       ("Python", 22000),
                                       ("pandas", 24000),
                                       ("Oracle",21000),
                                       ("Java", 23000)],
                                       names=['Courses','Fee'])
data = {
    'Duration':['30day','40days','35days','40days','60days','50days','55days'],
    'Discount':[1000,2300,1500,1200,2500,2100,2000],
    'Address':["Mumbai","Delhi","Hyderabad","Bangalore","Chennai","Kolkata","Lucknow"]
       }
df = pd.DataFrame(data, columns = ['Duration','Discount','Address'],
                  index=new_index)
print(df)

# convert multi_index
df.reset_index(inplace=True)
print(df)

# Using multiple columns 
# To convert multi_index and level
df = pd.DataFrame(data,index=new_index)
df.reset_index(inplace=True, level=['Fee'])
print(df)

FAQ on Convert Index to Column in Pandas DataFrame

How do I convert the index to a column in Pandas?

To convert the index to a column in a Pandas DataFrame, you can use the reset_index() function. This function moves the current index to a column and replaces it with a default integer index.

What does reset_index() do?

reset_index() moves the index of a DataFrame into a column and creates a new default integer-based index. If inplace=True, it modifies the DataFrame directly. Otherwise, it returns a new DataFrame.

Can I convert only the index and leave the rest of the DataFrame intact?

When using reset_index(), it will convert the index to a column without affecting other columns. You can also specify which level of a MultiIndex to reset by passing the level parameter.

Can I rename the index column?

You can rename the index column after using reset_index() in Pandas. By default, when you reset the index, the column that was once the index is named “index”. You can rename it using the rename() method.

What if I have a MultiIndex?

If your DataFrame has a MultiIndex (multiple index levels), you can use reset_index() to convert one or more index levels into columns. Use the level argument to specify which level(s) to reset

Conclusion

In this article, you have learned how to convert an index to a column in pandas DataFrame using index(), rename_axis() and reset_index() function. Also, you have learned how to convert multiple indexes into columns use these methods.

Happy Learning !!

References