• Post author:
  • Post category:Pandas
  • Post last modified:March 27, 2024
  • Reading time:14 mins read
You are currently viewing Pandas Set Index to Column in DataFrame

In order to set the index to a column in pandas DataFrame use reset_index() method. By using this you can also set single, multiple indexes to a column. If you are not aware by default, pandas add an index to each row of the pandas DataFrame. This index value starts with zero for the first row and increments by 1 for each row (sequence index value for each row).

In this article, I will explain different ways to set the index as a DataFrame column with examples like adding a default index and custom index as a column to DataFrame etc.

1. Quick Examples of Setting Index to Column in DataFrame

Below are some useful examples of setting an index to a column in Pandas DataFrame.


# Below are the quick examples.

# Example 1: Create a new column with index values
df['index_column'] = df.index


# Example 2: Using reset_index() to set index into column
df2 = df.reset_index()


# Example 3: Using reset_index, inplace=True
df.reset_index(inplace=True)


# Example 4: When you have multiple index 
# Specify which index you wanted to set to column
df.reset_index(inplace=True, level = ['Fee'])


# Example 5: Preserve Index as-is and add one index as column
df2 = df.assign(Fee = df.index.get_level_values('Fee'))


# Example 6: Preserve Index as-is and add one index as column
# Using assign operator.
df['Fee'] = df.index.get_level_values('Fee') 

Let’s create a DataFrame with a few rows and columns, execute these examples and validate the results. Our DataFrame contains column names Courses, Fee, Duration, and Discount.


# Create DataFrame
import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
    'Fee' :[20000,25000,26000,22000,24000,21000,22000],
    'Duration':['30day','40days','35days','40days','60days','50days','55days'],
    'Discount':[1000,2300,1500,1200,2500,2100,2000]
              }
df = pd.DataFrame(technologies)
print("Create DataFrame:\n", df)

Yields below output.

Pandas set Index to Column

2. Set Index to Column in DataFrame

One simple way to set an index to a column is by assigning an index as a new column to Pandas DataFrame. DataFrame.index property returns a Series object of an index, assign this to DataFrame as shown below which creates a new column with index value on DataFrame.


# Create a new column with index values
df['index'] = df.index
print("Set indexes to a new column:\n", df)

Yields below output. It adds a new column index_column with index values to DataFrame.

Pandas set Index to Column

You can also add a new custom column as an index and set it to a column


# Create DataFrame with custom index
index=['r0','r1','r2','r3','r4','r5','r6']
df = pd.DataFrame(technologies, index = index)
df['index_column'] = df.index
print("Set indexes to a new column:\n", df)

Results


# Output:
# Set indexes to a new column:
    Courses    Fee Duration  Discount index_column
r0    Spark  20000    30day      1000           r0
r1  PySpark  25000   40days      2300           r1
r2   Hadoop  26000   35days      1500           r2
r3   Python  22000   40days      1200           r3
r4   pandas  24000   60days      2500           r4
r5   Oracle  21000   50days      2100           r5
r6     Java  22000   55days      2000           r6

3. Convert the Index to Column using reset_index()

Alternatively, you can use the Pandas.Dataframe.reset_index() function to convert the index as a column. This function returns a copy of the DataFrame with a newly added index column instead of updating the existing DataFrame.


# Using reset_index to convert index to column
df = pd.DataFrame(technologies, index = index)
df2=df.reset_index()
print("Set indexes to a new column:\n", df)

Yields below output.


# Output:
# Set indexes to a new column:
  index  Courses    Fee Duration  Discount
0    r0    Spark  20000    30day      1000
1    r1  PySpark  25000   40days      2300
2    r2   Hadoop  26000   35days      1500
3    r3   Python  22000   40days      1200
4    r4   pandas  24000   60days      2500
5    r5   Oracle  21000   50days      2100
6    r6     Java  22000   55days      2000

Use inplace=True parameter to reflect the change in the DataFrame to stay permanent. If you have multiple indexes, this converts all index levels to columns.


# Convert the index to column using reset_index
df.reset_index(inplace = True)
print("Set indexes to a new column:\n", df)

Yields the same output as above.

4. Convert the Index to Column Using rename_axis and reset_index

To convert the index to the column use rename_axis() and reset_index() function. You should first rename your index to the custom label and then apply reset_index().


# Using rename_axis and reset_index
df = pd.DataFrame(technologies, index = index)
df2 = df.rename_axis('index_column').reset_index()
print("Set indexes to a new column:\n", df)

Yields below output.


# Output:
# Set indexes to a new column:
  index_column  Courses    Fee Duration  Discount
0           r0    Spark  20000    30day      1000
1           r1  PySpark  25000   40days      2300
2           r2   Hadoop  26000   35days      1500
3           r3   Python  22000   40days      1200
4           r4   pandas  24000   60days      2500
5           r5   Oracle  21000   50days      2100
6           r6     Java  22000   55days      2000

5. Convert Multi-Index to Multiple Columns

Let’s look at another DataFrame with two levels in its index or multi-index.


# Create new DataFrame with multi index
import pandas as pd
new_index = pd.MultiIndex.from_tuples([("Spark", 20000),
                                       ("PySpark", 25000),
                                       ("Hadoop", 26000),
                                       ("Python", 22000),
                                       ("pandas", 24000),
                                       ("Oracle",21000),
                                       ("Java", 23000)],
                                       names=['Courses','Fee'])
Data = {
    'Duration':['30day','40days','35days','40days','60days','50days','55days'],
    'Discount':[1000,2300,1500,1200,2500,2100,2000],
    'Address':["Mumbai","Delhi","Hyderabad","Bangalore","Chennai","Kolkata","Lucknow"]
       }
df = pd.DataFrame(Data, columns = ['Duration','Discount','Address'],
                  index=new_index)
print("Create DataFrame:\n", df)

Yields below results with two Indexes Courses and Fee.


# Output:
# Create DataFrame:
              Duration  Discount    Address
Courses Fee                                
Spark   20000    30day      1000     Mumbai
PySpark 25000   40days      2300      Delhi
Hadoop  26000   35days      1500  Hyderabad
Python  22000   40days      1200  Bangalore
pandas  24000   60days      2500    Chennai
Oracle  21000   50days      2100    Kolkata
Java    23000   55days      2000    Lucknow

Let’s convert these multi-indexes as columns to DataFrame. The reset_index() method with the default parameters converts all index levels to columns and uses a simple RangeIndex as a new index.


# Convert all multi indexes to columns
df.reset_index(inplace=True)
print("Convert multi indexes to columns:\n", df)

Yields below output. As you see converted indexes Courses and Fee as DataFrame columns.


# Output:
# Convert multi indexes to columns:
   Courses    Fee Duration  Discount    Address
0    Spark  20000    30day      1000     Mumbai
1  PySpark  25000   40days      2300      Delhi
2   Hadoop  26000   35days      1500  Hyderabad
3   Python  22000   40days      1200  Bangalore
4   pandas  24000   60days      2500    Chennai
5   Oracle  21000   50days      2100    Kolkata
6     Java  23000   55days      2000    Lucknow

6. Using Multiple Columns to Convert Multi-Index and Level

Use the level parameter to control which index to convert when you have multiple indexes on DataFrame. If possible, use the level name, which is more explicit.


# Using multiple columns to convert multi_index and level
df = pd.DataFrame(data,index=new_index)
df.reset_index(inplace=True, level=['Fee'])
print("Convert specified index to column:\n", df)

Yields below output


# Output:
# Convert specified index to column:
           Fee Duration  Discount    Address
Courses                                     
Spark    20000    30day      1000     Mumbai
PySpark  25000   40days      2300      Delhi
Hadoop   26000   35days      1500  Hyderabad
Python   22000   40days      1200  Bangalore
pandas   24000   60days      2500    Chennai
Oracle   21000   50days      2100    Kolkata
Java     23000   55days      2000    Lucknow

In case, if you want to persist the index as it is and convert one of the indexes as a column. For example,


# Convert specified index as a column
df = pd.DataFrame(data,index=new_index)
df2=df.assign(Fee=df.index.get_level_values('Fee'))
print("Convert specified index to column:\n", df2)

# Also use
df['Fee'] = df.index.get_level_values('Fee')
print("Convert specified index to column:\n", df)

Yields below output.


# Output:
# Convert specified index to column
              Duration  Discount    Address    Fee
Courses Fee                                       
Spark   20000    30day      1000     Mumbai  20000
PySpark 25000   40days      2300      Delhi  25000
Hadoop  26000   35days      1500  Hyderabad  26000
Python  22000   40days      1200  Bangalore  22000
pandas  24000   60days      2500    Chennai  24000
Oracle  21000   50days      2100    Kolkata  21000
Java    23000   55days      2000    Lucknow  23000

7. Complete Examples


import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
    'Fee' :[20000,25000,26000,22000,24000,21000,22000],
    'Duration':['30day','40days','35days','40days','60days','50days','55days'],
    'Discount':[1000,2300,1500,1200,2500,2100,2000]
              }

df = pd.DataFrame(technologies)
print(df)

# Create a new index
df['index_column'] = df.index
print(df)

index=['r0','r1','r2','r3','r4','r5','r6']
df = pd.DataFrame(technologies,index=index)
df['index_column'] = df.index
print(df)

# Using reset_index
df = pd.DataFrame(technologies,index=index)
df2=df.reset_index()
print(df2)

# Using reset_index, inplace=True
df = pd.DataFrame(technologies,index=index)
df.reset_index(inplace=True)
print(df)


# Using rename_axis() and reset_index()
df = pd.DataFrame(technologies,index=index)
df = df.rename_axis('index_column').reset_index()
print(df)

# Create new DataFrame with multi index
import pandas as pd
new_index = pd.MultiIndex.from_tuples([("Spark", 20000),
                                       ("PySpark", 25000),
                                       ("Hadoop", 26000),
                                       ("Python", 22000),
                                       ("pandas", 24000),
                                       ("Oracle",21000),
                                       ("Java", 23000)],
                                       names=['Courses','Fee'])
data = {
    'Duration':['30day','40days','35days','40days','60days','50days','55days'],
    'Discount':[1000,2300,1500,1200,2500,2100,2000],
    'Address':["Mumbai","Delhi","Hyderabad","Bangalore","Chennai","Kolkata","Lucknow"]
       }
df = pd.DataFrame(data, columns = ['Duration','Discount','Address'],
                  index=new_index)
print(df)

# convert multi_index
df.reset_index(inplace=True)
print(df)

# Using multiple columns to convert multi_index and level
df = pd.DataFrame(data,index=new_index)
df.reset_index(inplace=True, level=['Fee'])
print(df)

8. Conclusion

In this article, you have learned how to set or convert an index to a column in Pandas DataFrame using index(), rename_axis(), and reset_index() function. Also, you have learned how to convert multiple indexes into columns by using these methods.

Happy Learning !!

References

Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, He has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen journey in the field of data engineering has been a continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with the data as he come across. Follow Naveen @ LinkedIn and Medium