You can use reset_index()
to create/convert the index/multi-index to a column of pandas DataFrame. Besides this, there are other ways as well. If you are not aware by default, pandas add an index to each row of the pandas DataFrame. This index value starts with zero for the first row and increments by 1 for each row (sequence index value for each row). You can also assign a custom index to DataFrame according to your need.
In this article, I will explain different ways to convert the index to the DataFrame column with examples like adding default index and custom index as a column to DataFrame.
1. Quick Examples of Convert Index to Column in pandas DataFrame
Below are some useful examples of how to convert an index to a column in pandas DataFrame.
# Create a new column with index values
df['index_column'] = df.index
# Using reset_index() to convert index into column
df2=df.reset_index()
# Using reset_index, inplace=True
df.reset_index(inplace=True)
# When you have multiple index
# Specify which index you wanted to convert to column
df.reset_index(inplace=True, level=['Fee'])
# Preserve Index as-is and add one index as column
df2=df.assign(Fee=df.index.get_level_values('Fee'))
# Preserve Index as-is and add one index as column
# Using assign operator.
df['Fee'] = df.index.get_level_values('Fee')
Let’s create a DataFrame with a few rows and columns, execute these examples and validate results. Our DataFrame contains column names Courses
, Fee
, Duration
, and Discount
.
import pandas as pd
technologies = {
'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
'Fee' :[20000,25000,26000,22000,24000,21000,22000],
'Duration':['30day','40days','35days','40days','60days','50days','55days'],
'Discount':[1000,2300,1500,1200,2500,2100,2000]
}
df = pd.DataFrame(technologies)
print(df)
Yields below output.
# Output:
Courses Fee Duration Discount
0 Spark 20000 30day 1000
1 PySpark 25000 40days 2300
2 Hadoop 26000 35days 1500
3 Python 22000 40days 1200
4 pandas 24000 60days 2500
5 Oracle 21000 50days 2100
6 Java 22000 55days 2000
2. Create a New Index Column in DataFrame
One simple way to convert an index to a column is by assigning an index as a new column to DataFrame. DataFrame.index
property returns a Series
object of an index, assign this to DataFrame as shown below which creates a new column with index value on DataFrame.
# Create a new column with index values
df['index'] = df.index
print(df)
Yields below output. It adds a new column index_column
with index values to DataFrame.
# Output:
Courses Fee Duration Discount index_column
0 Spark 20000 30day 1000 0
1 PySpark 25000 40days 2300 1
2 Hadoop 26000 35days 1500 2
3 Python 22000 40days 1200 3
4 pandas 24000 60days 2500 4
5 Oracle 21000 50days 2100 5
6 Java 22000 55days 2000 6
You can also add a new custom column as an index and convert it to a column
# Create DataFrame with custom index
index=['r0','r1','r2','r3','r4','r5','r6']
df = pd.DataFrame(technologies,index=index)
df['index_column'] = df.index
print(df)
Results
# Output:
Courses Fee Duration Discount index_column
r0 Spark 20000 30day 1000 r0
r1 PySpark 25000 40days 2300 r1
r2 Hadoop 26000 35days 1500 r2
r3 Python 22000 40days 1200 r3
r4 pandas 24000 60days 2500 r4
r5 Oracle 21000 50days 2100 r5
r6 Java 22000 55days 2000 r6
3. Convert the Index to Column Using DataFrame.reset_index()
Another way is by using the Dataframe.reset_index()
function to convert the index as a column in Pandas DataFrame.
# Using reset_index to convert index to column
df = pd.DataFrame(technologies,index=index)
df2=df.reset_index()
print(df2)
Yields below output. This adds a new column index
to DataFrame and returns a copy of the DataFrame instead of updating the existing DataFrame.
# Output:
index Courses Fee Duration Discount
0 r0 Spark 20000 30day 1000
1 r1 PySpark 25000 40days 2300
2 r2 Hadoop 26000 35days 1500
3 r3 Python 22000 40days 1200
4 r4 pandas 24000 60days 2500
5 r5 Oracle 21000 50days 2100
6 r6 Java 22000 55days 2000
Use inplace=True
parameter to reflects the change in the DataFrame to stay permanent. If you have multiple indexes, this converts all index levels to columns.
# Convert the index to column using reset_index
df.reset_index(inplace=True)
print(df)
Yields same output as above.
4. Convert the Index to Column Using Rename_Axis and Reset_Index
You can also convert the index to the column using rename_axis()
and reset_index()
function. You should first rename your index to the custom label and then apply reset_index()
.
# Using Rename_Axis and Reset_Index
df = pd.DataFrame(technologies,index=index)
df2 = df.rename_axis('index_column').reset_index()
print(df2)
Yields below output.
# Output:
index_column Courses Fee Duration Discount
0 r0 Spark 20000 30day 1000
1 r1 PySpark 25000 40days 2300
2 r2 Hadoop 26000 35days 1500
3 r3 Python 22000 40days 1200
4 r4 pandas 24000 60days 2500
5 r5 Oracle 21000 50days 2100
6 r6 Java 22000 55days 2000
5. Convert Multi_Index to Multiple Columns
Let’s look at another DataFrame with two levels in its index or multi-index.
# Create new DataFrame with multi index
import pandas as pd
new_index = pd.MultiIndex.from_tuples([("Spark", 20000),
("PySpark", 25000),
("Hadoop", 26000),
("Python", 22000),
("pandas", 24000),
("Oracle",21000),
("Java", 23000)],
names=['Courses','Fee'])
Data = {
'Duration':['30day','40days','35days','40days','60days','50days','55days'],
'Discount':[1000,2300,1500,1200,2500,2100,2000],
'Address':["Mumbai","Delhi","Hyderabad","Bangalore","Chennai","Kolkata","Lucknow"]
}
df = pd.DataFrame(Data, columns = ['Duration','Discount','Address'],
index=new_index)
print(df)
Yields below results with two Indexes Courses
and Fee
# Output:
Duration Discount Address
Courses Fee
Spark 20000 30day 1000 Mumbai
PySpark 25000 40days 2300 Delhi
Hadoop 26000 35days 1500 Hyderabad
Python 22000 40days 1200 Bangalore
pandas 24000 60days 2500 Chennai
Oracle 21000 50days 2100 Kolkata
Java 23000 55days 2000 Lucknow
Let’s convert these multi-indexes as columns to DataFrame. The reset_index()
method with the default parameters converts all index levels to columns and uses a simple RangeIndex as a new index.
# Convert all multi indexes to columns
df.reset_index(inplace=True)
Yields below output. As you see it converted indexes Courses
and Fee
as DataFrame columns.
# Output:
Courses Fee Duration Discount Address
0 Spark 20000 30day 1000 Mumbai
1 PySpark 25000 40days 2300 Delhi
2 Hadoop 26000 35days 1500 Hyderabad
3 Python 22000 40days 1200 Bangalore
4 pandas 24000 60days 2500 Chennai
5 Oracle 21000 50days 2100 Kolkata
6 Java 23000 55days 2000 Lucknow
6. Using Multiple Columns to Convert Multi_Index and Level
Use the level
parameter to control which index to convert when you have multiple indexes on DataFrame. If possible, use the level name, which is more explicit.
# Using multiple columns to convert multi_index and level
df = pd.DataFrame(data,index=new_index)
df.reset_index(inplace=True, level=['Fee'])
print(df)
Yields below output
# Output:
Fee Duration Discount Address
Courses
Spark 20000 30day 1000 Mumbai
PySpark 25000 40days 2300 Delhi
Hadoop 26000 35days 1500 Hyderabad
Python 22000 40days 1200 Bangalore
pandas 24000 60days 2500 Chennai
Oracle 21000 50days 2100 Kolkata
Java 23000 55days 2000 Lucknow
In case if you wanted to persist the index as-is and add one of the indexes as a column
df = pd.DataFrame(data,index=new_index)
df2=df.assign(Fee=df.index.get_level_values('Fee'))
print(df2)
# Also use
df['Fee'] = df.index.get_level_values('Fee')
print(df)
Yields below output.
# Output:
Duration Discount Address Fee
Courses Fee
Spark 20000 30day 1000 Mumbai 20000
PySpark 25000 40days 2300 Delhi 25000
Hadoop 26000 35days 1500 Hyderabad 26000
Python 22000 40days 1200 Bangalore 22000
pandas 24000 60days 2500 Chennai 24000
Oracle 21000 50days 2100 Kolkata 21000
Java 23000 55days 2000 Lucknow 23000
7. Compelte Example
import pandas as pd
technologies = {
'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
'Fee' :[20000,25000,26000,22000,24000,21000,22000],
'Duration':['30day','40days','35days','40days','60days','50days','55days'],
'Discount':[1000,2300,1500,1200,2500,2100,2000]
}
df = pd.DataFrame(technologies)
print(df)
# Create a new index
df['index_column'] = df.index
print(df)
index=['r0','r1','r2','r3','r4','r5','r6']
df = pd.DataFrame(technologies,index=index)
df['index_column'] = df.index
print(df)
# Using reset_index
df = pd.DataFrame(technologies,index=index)
df2=df.reset_index()
print(df2)
# Using reset_index, inplace=True
df = pd.DataFrame(technologies,index=index)
df.reset_index(inplace=True)
print(df)
# Using rename_axis() and reset_index()
df = pd.DataFrame(technologies,index=index)
df = df.rename_axis('index_column').reset_index()
print(df)
# Create new DataFrame with multi index
import pandas as pd
new_index = pd.MultiIndex.from_tuples([("Spark", 20000),
("PySpark", 25000),
("Hadoop", 26000),
("Python", 22000),
("pandas", 24000),
("Oracle",21000),
("Java", 23000)],
names=['Courses','Fee'])
data = {
'Duration':['30day','40days','35days','40days','60days','50days','55days'],
'Discount':[1000,2300,1500,1200,2500,2100,2000],
'Address':["Mumbai","Delhi","Hyderabad","Bangalore","Chennai","Kolkata","Lucknow"]
}
df = pd.DataFrame(data, columns = ['Duration','Discount','Address'],
index=new_index)
print(df)
# convert multi_index
df.reset_index(inplace=True)
print(df)
# Using multiple columns to convert multi_index and level
df = pd.DataFrame(data,index=new_index)
df.reset_index(inplace=True, level=['Fee'])
print(df)
Conclusion
In this article, you have learned how to convert an index to a column in pandas DataFrame using index()
, rename_axis()
and reset_index()
function. Also, you have learned how to convert multiple indexes into columns use these methods.
Happy Learning !!
Related Articles
- Pandas – Convert Single or All Columns To String Type
- Easy Steps to Install Pandas on Windows
- Get a Single Cell Value From Pandas DataFrame
- How to Filter Rows using pandas.DataFrame.query() Method
- How to Add New Column to Existing DataFrame in pandas
- How to Create Pandas Pivot Multiple Columns
- Pandas Read Multiple CSV Files into DataFrame
- Pandas GroupBy Multiple Columns Explained
- Pandas Set Index to Column in DataFrame