You can set a pandas column as the index by using the DataFrame.set_index() method or the DataFrame.index property. What is an Index in pandas? The Index holds the row labels of a DataFrame. In this article, I will explain how to use a column as an index with some simple examples. By default, pandas creates an integer index (0, 1, 2, ...) for a DataFrame. However, you can specify index values while creating a DataFrame, or set an existing column of the DataFrame as the index.
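To illustrate the difference between the default index and an index supplied at creation time, here is a minimal sketch. The row labels r1, r2, r3 are hypothetical values chosen only for this illustration; the index parameter of pd.DataFrame() is part of pandas.

```python
import pandas as pd

data = {"Courses": ["Spark", "PySpark", "Python"],
        "Fee": [20000, 22000, 25000]}

# Hypothetical row labels, used only for illustration.
labels = ["r1", "r2", "r3"]

# No index given: pandas creates the default integer index 0, 1, 2.
df_default = pd.DataFrame(data)

# Index given at creation time: rows are labeled r1, r2, r3.
df_labeled = pd.DataFrame(data, index=labels)

print(df_default.index)
print(df_labeled.index)
```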
1. Quick Examples to Set DataFrame Column as Row Index
If you are in a hurry, below are some quick examples.
# Below are some quick examples.
# Using set_index() method.
df.set_index('Fee', inplace=True)
# Using set_index() with transpose (.T).
df2=df.set_index('Courses').T
# Set Column as index.
df.index = df['Courses']
# Drop column after setting it as Index.
df2=df.drop('Courses', axis=1)
Now, let's create a pandas DataFrame, run these examples, and validate the results. Our DataFrame contains the columns Courses, Fee, and Duration.
# Create a Pandas DataFrame.
import pandas as pd
df = pd.DataFrame([["Spark",20000,'40days'],
                   ["PySpark",22000,'30days'],
                   ["Python",25000,'35days']],
                  columns=['Courses','Fee','Duration'])
print(df)
Yields below output. This DataFrame was created with a default index.
# Output:
   Courses    Fee Duration
0    Spark  20000   40days
1  PySpark  22000   30days
2   Python  25000   35days
2. Using set_index() Method
Use the DataFrame.set_index() method to set an existing column of a DataFrame as the index. The index holds the row labels of the DataFrame. If your DataFrame already has an index, set_index() replaces it by default, or adds to it (producing a MultiIndex) when append=True.
You can set the DataFrame index (row labels) using one or more existing columns or arrays (of the correct length).
Syntax:
# Syntax for set_index() method.
DataFrame.set_index(keys, drop=True, append=False, inplace=False, verify_integrity=False)
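To make the parameters in the syntax above more concrete, here is a minimal sketch that tries keys as a list, drop=False, append=True, and verify_integrity=True. The small dup DataFrame with a repeated course name is a hypothetical example added only to trigger the duplicate check.

```python
import pandas as pd

df = pd.DataFrame([["Spark", 20000, "40days"],
                   ["PySpark", 22000, "30days"],
                   ["Python", 25000, "35days"]],
                  columns=["Courses", "Fee", "Duration"])

# keys as a list of columns: the result has a two-level MultiIndex.
multi = df.set_index(["Courses", "Fee"])

# drop=False: Courses becomes the index and also stays as a column.
kept = df.set_index("Courses", drop=False)

# append=True: Courses is added to the existing index instead of replacing it.
appended = df.set_index("Courses", append=True)

# verify_integrity=True: duplicate index values raise a ValueError.
# 'dup' is a hypothetical DataFrame used only to show the check.
dup = pd.DataFrame({"Courses": ["Spark", "Spark"], "Fee": [20000, 21000]})
duplicate_error = None
try:
    dup.set_index("Courses", verify_integrity=True)
except ValueError as exc:
    duplicate_error = exc

print(multi)
print(kept)
print(appended)
print(duplicate_error)
```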
Usage with Example:
In the example below, I set the Fee column as the index.
# Using set_index() method.
df.set_index('Fee', inplace=True)
print(df)
Yields below output. If inplace=True is not provided, set_index() returns a new DataFrame with the index set and leaves the original DataFrame unchanged. By default (drop=True), the column is removed from the DataFrame's columns once it becomes the index.
# Output:
       Courses Duration
Fee
20000    Spark   40days
22000  PySpark   30days
25000   Python   35days
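The example above modifies df in place. As a small sketch of the alternative, the code below calls set_index() without inplace=True and stores the returned DataFrame in a new variable (the name indexed is just an illustrative choice), so the original df keeps its Fee column.

```python
import pandas as pd

df = pd.DataFrame([["Spark", 20000, "40days"],
                   ["PySpark", 22000, "30days"],
                   ["Python", 25000, "35days"]],
                  columns=["Courses", "Fee", "Duration"])

# Without inplace=True, set_index() returns a new DataFrame.
indexed = df.set_index("Fee")

# The original DataFrame still has Fee as a regular column.
print(df.columns.tolist())

# The returned DataFrame uses Fee as its index.
print(indexed.index.name)
print(indexed.columns.tolist())
```

Assigning the result to a variable like this keeps the original data available for the later steps of a script.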
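3. Using set_index() with Transpose (.T)
You can combine set_index() with the pandas transpose property (data.T) to turn the values of a column into column labels. Note that set_index() does not transpose anything by itself; it moves the column into the index, and .T then swaps the rows and columns.
# Using set_index() with transpose (.T).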
df2=df.set_index('Courses').T
print(df2)
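Yields below output. Only Duration appears as a row because Fee was already the index from the previous step, and set_index('Courses') replaced it.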
# Output:
Courses    Spark PySpark  Python
Duration  40days  30days  35days
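To show what each part of the expression above contributes, here is a minimal sketch that performs the two steps separately. It starts from a freshly created DataFrame (an assumption made so the block runs on its own), so the Fee column is still present and shows up as an extra row after transposing.

```python
import pandas as pd

df = pd.DataFrame([["Spark", 20000, "40days"],
                   ["PySpark", 22000, "30days"],
                   ["Python", 25000, "35days"]],
                  columns=["Courses", "Fee", "Duration"])

# Step 1: move Courses into the index (row labels).
indexed = df.set_index("Courses")

# Step 2: transpose, so the row labels become column labels
# and the remaining columns (Fee, Duration) become rows.
transposed = indexed.T

print(indexed)
print(transposed)
```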
4. Set Column as Index by DataFrame.index Property
You can also set a pandas column as the index by using the DataFrame.index property. To use a column as the index, select the column from the DataFrame and assign it to the DataFrame.index property.
# Set Column as index.
df.index = df['Courses']
print(df)
Yields below output.
# Output:
         Courses Duration
Courses
Spark      Spark   40days
PySpark  PySpark   30days
Python    Python   35days
Note that in the above example, I set Courses as the index, but the column is still present in the DataFrame. Use the DataFrame.drop() method to drop the column if you don't want it.
DataFrame.drop() is used to drop specified labels from rows or columns. You can remove rows or columns by specifying label names and the corresponding axis, or by specifying index or column names directly.
# Using DataFrame.drop() method.
df2=df.drop('Courses', axis=1)
print(df2)
Yields below output.
# Output:
        Duration
Courses
Spark     40days
PySpark   30days
Python    35days
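As a small comparison sketch, the code below checks that assigning the column to DataFrame.index and then dropping it gives the same result as a single set_index() call. The variable names two_step and one_step are illustrative, and the block starts from a freshly created DataFrame so that it runs on its own.

```python
import pandas as pd

df = pd.DataFrame([["Spark", 20000, "40days"],
                   ["PySpark", 22000, "30days"],
                   ["Python", 25000, "35days"]],
                  columns=["Courses", "Fee", "Duration"])

# Two steps: assign the column to the index, then drop the column.
two_step = df.copy()
two_step.index = two_step["Courses"]
two_step = two_step.drop("Courses", axis=1)

# One step: set_index() moves the column into the index (drop=True by default).
one_step = df.set_index("Courses")

# Compare the two results.
print(one_step.equals(two_step))
```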
5. Complete Example of Setting a Pandas Column as Index
# Create a Pandas DataFrame.
import pandas as pd
df = pd.DataFrame([["Spark",20000,'40days'],
                   ["PySpark",22000,'30days'],
                   ["Python",25000,'35days']],
                  columns=['Courses','Fee','Duration'])
print(df)
# Using set_index() method.
df.set_index('Fee', inplace=True)
print(df)
# Using set_index() with transpose (.T).
df2=df.set_index('Courses').T
print(df2)
# Set Column as index.
df.index = df['Courses']
print(df)
# Using DataFrame.drop() method.
df2=df.drop('Courses', axis=1)
print(df2)
Conclusion
In this article, you have learned how to set a pandas column as the row index using the DataFrame.set_index() method and the DataFrame.index property, with simple examples.