In order to set the index to a column in pandas DataFrame use reset_index()
method. By using this you can also set single, multiple indexes to a column. If you are not aware by default, pandas add an index to each row of the pandas DataFrame. This index value starts with zero for the first row and increments by 1 for each row (sequence index value for each row).
In this article, I will explain different ways to set the index as a DataFrame column with examples like adding a default index and custom index as a column to DataFrame etc.
1. Quick Examples of Setting Index to Column in DataFrame
Below are some useful examples of setting an index to a column in Pandas DataFrame.
# Below are the quick examples.
# Example 1: Create a new column with index values
df['index_column'] = df.index
# Example 2: Using reset_index() to set index into column
df2 = df.reset_index()
# Example 3: Using reset_index, inplace=True
df.reset_index(inplace=True)
# Example 4: When you have multiple index
# Specify which index you wanted to set to column
df.reset_index(inplace=True, level = ['Fee'])
# Example 5: Preserve Index as-is and add one index as column
df2 = df.assign(Fee = df.index.get_level_values('Fee'))
# Example 6: Preserve Index as-is and add one index as column
# Using assign operator.
df['Fee'] = df.index.get_level_values('Fee')
Let’s create a DataFrame with a few rows and columns, execute these examples and validate the results. Our DataFrame contains column names Courses
, Fee
, Duration
, and Discount
.
# Create DataFrame
import pandas as pd
technologies = {
'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
'Fee' :[20000,25000,26000,22000,24000,21000,22000],
'Duration':['30day','40days','35days','40days','60days','50days','55days'],
'Discount':[1000,2300,1500,1200,2500,2100,2000]
}
df = pd.DataFrame(technologies)
print("Create DataFrame:\n", df)
Yields below output.
2. Set Index to Column in DataFrame
One simple way to set an index to a column is by assigning an index as a new column to Pandas DataFrame. DataFrame.index
property returns a Series
object of an index, assign this to DataFrame as shown below which creates a new column with index value on DataFrame.
# Create a new column with index values
df['index'] = df.index
print("Set indexes to a new column:\n", df)
Yields below output. It adds a new column index_column
with index values to DataFrame.
You can also add a new custom column as an index and set it to a column
# Create DataFrame with custom index
index=['r0','r1','r2','r3','r4','r5','r6']
df = pd.DataFrame(technologies, index = index)
df['index_column'] = df.index
print("Set indexes to a new column:\n", df)
Results
# Output:
# Set indexes to a new column:
Courses Fee Duration Discount index_column
r0 Spark 20000 30day 1000 r0
r1 PySpark 25000 40days 2300 r1
r2 Hadoop 26000 35days 1500 r2
r3 Python 22000 40days 1200 r3
r4 pandas 24000 60days 2500 r4
r5 Oracle 21000 50days 2100 r5
r6 Java 22000 55days 2000 r6
3. Convert the Index to Column using reset_index()
Alternatively, you can use the Pandas.Dataframe.reset_index() function to convert the index as a column. This function returns a copy of the DataFrame with a newly added index column instead of updating the existing DataFrame.
# Using reset_index to convert index to column
df = pd.DataFrame(technologies, index = index)
df2=df.reset_index()
print("Set indexes to a new column:\n", df)
Yields below output.
# Output:
# Set indexes to a new column:
index Courses Fee Duration Discount
0 r0 Spark 20000 30day 1000
1 r1 PySpark 25000 40days 2300
2 r2 Hadoop 26000 35days 1500
3 r3 Python 22000 40days 1200
4 r4 pandas 24000 60days 2500
5 r5 Oracle 21000 50days 2100
6 r6 Java 22000 55days 2000
Use inplace=True
parameter to reflect the change in the DataFrame to stay permanent. If you have multiple indexes, this converts all index levels to columns.
# Convert the index to column using reset_index
df.reset_index(inplace = True)
print("Set indexes to a new column:\n", df)
Yields the same output as above.
4. Convert the Index to Column Using rename_axis and reset_index
To convert the index to the column use rename_axis()
and reset_index()
function. You should first rename your index to the custom label and then apply reset_index()
.
# Using rename_axis and reset_index
df = pd.DataFrame(technologies, index = index)
df2 = df.rename_axis('index_column').reset_index()
print("Set indexes to a new column:\n", df)
Yields below output.
# Output:
# Set indexes to a new column:
index_column Courses Fee Duration Discount
0 r0 Spark 20000 30day 1000
1 r1 PySpark 25000 40days 2300
2 r2 Hadoop 26000 35days 1500
3 r3 Python 22000 40days 1200
4 r4 pandas 24000 60days 2500
5 r5 Oracle 21000 50days 2100
6 r6 Java 22000 55days 2000
5. Convert Multi-Index to Multiple Columns
Let’s look at another DataFrame with two levels in its index or multi-index.
# Create new DataFrame with multi index
import pandas as pd
new_index = pd.MultiIndex.from_tuples([("Spark", 20000),
("PySpark", 25000),
("Hadoop", 26000),
("Python", 22000),
("pandas", 24000),
("Oracle",21000),
("Java", 23000)],
names=['Courses','Fee'])
Data = {
'Duration':['30day','40days','35days','40days','60days','50days','55days'],
'Discount':[1000,2300,1500,1200,2500,2100,2000],
'Address':["Mumbai","Delhi","Hyderabad","Bangalore","Chennai","Kolkata","Lucknow"]
}
df = pd.DataFrame(Data, columns = ['Duration','Discount','Address'],
index=new_index)
print("Create DataFrame:\n", df)
Yields below results with two Indexes Courses
and Fee
.
# Output:
# Create DataFrame:
Duration Discount Address
Courses Fee
Spark 20000 30day 1000 Mumbai
PySpark 25000 40days 2300 Delhi
Hadoop 26000 35days 1500 Hyderabad
Python 22000 40days 1200 Bangalore
pandas 24000 60days 2500 Chennai
Oracle 21000 50days 2100 Kolkata
Java 23000 55days 2000 Lucknow
Let’s convert these multi-indexes as columns to DataFrame. The reset_index()
method with the default parameters converts all index levels to columns and uses a simple RangeIndex as a new index.
# Convert all multi indexes to columns
df.reset_index(inplace=True)
print("Convert multi indexes to columns:\n", df)
Yields below output. As you see converted indexes Courses
and Fee
as DataFrame columns.
# Output:
# Convert multi indexes to columns:
Courses Fee Duration Discount Address
0 Spark 20000 30day 1000 Mumbai
1 PySpark 25000 40days 2300 Delhi
2 Hadoop 26000 35days 1500 Hyderabad
3 Python 22000 40days 1200 Bangalore
4 pandas 24000 60days 2500 Chennai
5 Oracle 21000 50days 2100 Kolkata
6 Java 23000 55days 2000 Lucknow
6. Using Multiple Columns to Convert Multi-Index and Level
Use the level
parameter to control which index to convert when you have multiple indexes on DataFrame. If possible, use the level name, which is more explicit.
# Using multiple columns to convert multi_index and level
df = pd.DataFrame(data,index=new_index)
df.reset_index(inplace=True, level=['Fee'])
print("Convert specified index to column:\n", df)
Yields below output
# Output:
# Convert specified index to column:
Fee Duration Discount Address
Courses
Spark 20000 30day 1000 Mumbai
PySpark 25000 40days 2300 Delhi
Hadoop 26000 35days 1500 Hyderabad
Python 22000 40days 1200 Bangalore
pandas 24000 60days 2500 Chennai
Oracle 21000 50days 2100 Kolkata
Java 23000 55days 2000 Lucknow
In case, if you want to persist the index as it is and convert one of the indexes as a column. For example,
# Convert specified index as a column
df = pd.DataFrame(data,index=new_index)
df2=df.assign(Fee=df.index.get_level_values('Fee'))
print("Convert specified index to column:\n", df2)
# Also use
df['Fee'] = df.index.get_level_values('Fee')
print("Convert specified index to column:\n", df)
Yields below output.
# Output:
# Convert specified index to column
Duration Discount Address Fee
Courses Fee
Spark 20000 30day 1000 Mumbai 20000
PySpark 25000 40days 2300 Delhi 25000
Hadoop 26000 35days 1500 Hyderabad 26000
Python 22000 40days 1200 Bangalore 22000
pandas 24000 60days 2500 Chennai 24000
Oracle 21000 50days 2100 Kolkata 21000
Java 23000 55days 2000 Lucknow 23000
7. Complete Examples
import pandas as pd
technologies = {
'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
'Fee' :[20000,25000,26000,22000,24000,21000,22000],
'Duration':['30day','40days','35days','40days','60days','50days','55days'],
'Discount':[1000,2300,1500,1200,2500,2100,2000]
}
df = pd.DataFrame(technologies)
print(df)
# Create a new index
df['index_column'] = df.index
print(df)
index=['r0','r1','r2','r3','r4','r5','r6']
df = pd.DataFrame(technologies,index=index)
df['index_column'] = df.index
print(df)
# Using reset_index
df = pd.DataFrame(technologies,index=index)
df2=df.reset_index()
print(df2)
# Using reset_index, inplace=True
df = pd.DataFrame(technologies,index=index)
df.reset_index(inplace=True)
print(df)
# Using rename_axis() and reset_index()
df = pd.DataFrame(technologies,index=index)
df = df.rename_axis('index_column').reset_index()
print(df)
# Create new DataFrame with multi index
import pandas as pd
new_index = pd.MultiIndex.from_tuples([("Spark", 20000),
("PySpark", 25000),
("Hadoop", 26000),
("Python", 22000),
("pandas", 24000),
("Oracle",21000),
("Java", 23000)],
names=['Courses','Fee'])
data = {
'Duration':['30day','40days','35days','40days','60days','50days','55days'],
'Discount':[1000,2300,1500,1200,2500,2100,2000],
'Address':["Mumbai","Delhi","Hyderabad","Bangalore","Chennai","Kolkata","Lucknow"]
}
df = pd.DataFrame(data, columns = ['Duration','Discount','Address'],
index=new_index)
print(df)
# convert multi_index
df.reset_index(inplace=True)
print(df)
# Using multiple columns to convert multi_index and level
df = pd.DataFrame(data,index=new_index)
df.reset_index(inplace=True, level=['Fee'])
print(df)
8. Conclusion
In this article, you have learned how to set or convert an index to a column in Pandas DataFrame using index()
, rename_axis()
, and reset_index()
function. Also, you have learned how to convert multiple indexes into columns by using these methods.
Happy Learning !!
Related Articles
- Pandas set_index() – Set Index to DataFrame
- Pandas Set Column as Index in DataFrame
- Pandas Set Value to Particular Cell in DataFrame Using Index
- Convert Pandas Index to List
- Pandas Index Explained with Examples
- Pandas Filter by Index
- Pandas DataFrame reindex() Function
- Pandas Drop Rows by Index
- Pandas Drop Index Column Explained
- Pandas DatetimeIndex Usage Explained
- Pandas Get Index from DataFrame?
- How to Get Index of Series in Pandas
- Convert Pandas DatetimeIndex to String
- How to GroupBy Index in Pandas?
- Pandas Get Column Name by Index or Position
- Working with MultiIndex in pandas DataFrame
- How to Rename Column by Index in pandas