How do you get the index from a pandas DataFrame? The DataFrame.index property returns it. A pandas Index is an immutable sequence used to label the rows of a DataFrame or Series. The DataFrame index is also called the row index; by default, pandas creates it as a sequence of integers that starts at 0 and increments by 1. You can also assign custom values to the index.
Using the index, you can select rows from a DataFrame or add a row at a specified label. You can also get the index itself through the .index property. In this article, I will explain the index property, how to use it to get the index of a DataFrame, and how to get the index as a list using index.values.
Key Points –
- The .index attribute provides direct access to the DataFrame index.
- Use .index.dtype to find out the data type of the index.
- Use tolist() on .index to convert the index to a Python list.
- Use .index[position] to get a specific index value by position.
- Use .get_loc(value) on .index to find the position of a specific index value.
- Use the in keyword (e.g., value in df.index) to verify if a value exists in the index.
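The key points above can be tried together in a minimal sketch. The DataFrame here is a small stand-in that mirrors the one built later in this article, and the variable names are chosen only for illustration.

```python
import pandas as pd

# Small illustrative DataFrame with the default integer index
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Python", "pandas"],
    "Fee": [20000, 25000, 22000, 30000],
})

index_dtype = df.index.dtype          # data type of the index
index_list = df.index.tolist()        # index converted to a Python list
third_label = df.index[2]             # index value stored at position 2
position_of_3 = df.index.get_loc(3)   # position of the index value 3
has_label_2 = 2 in df.index           # membership test against index values
has_label_10 = 10 in df.index

print(index_dtype)
print(index_list)
print(third_label, position_of_3, has_label_2, has_label_10)
```

With the default index, labels and positions happen to coincide, so index[2] and get_loc(2) both involve the number 2. They differ as soon as the index holds custom labels.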
Quick Examples of Getting Index from Pandas DataFrame
If you are in a hurry, below are some quick examples of how to get an index from DataFrame.
# Quick examples of getting index from pandas DataFrame
# (these assume pandas is imported as pd, NumPy as np, and df is the DataFrame created below)
# Example 1: Get the index
# Use df.index property
print(df.index)
# Example 2: Get the index as a list
# Use index.values
print(list(df.index.values))
# Example 3: Get the index as a list
# Use tolist()
print(df.index.values.tolist())
# Example 4: Get the column index
# Using get_loc()
print(df.columns.get_loc('Fee'))
# Example 5: Get the positions of rows that match a condition
# Using np.where()
print(list(np.where(df["Discount"] > 1200)))
Get Index from Pandas DataFrame
Let’s create a pandas DataFrame from a dictionary of lists, with the column names Courses, Fee, Duration, and Discount.
# Create DataFrame
import pandas as pd
technologies = {
'Courses':["Spark","PySpark","Python","pandas"],
'Fee' :[20000,25000,22000,30000],
'Duration':['30days','40days','35days','50days'],
'Discount':[1000,2300,1200,2000]
}
df = pd.DataFrame(technologies)
print("Create DataFrame:\n", df)
Yields below output.
You can get the index from a pandas DataFrame by using the .index property. This property returns an Index object, not a Series. Call it on the DataFrame created above; because no custom index was specified, it returns the complete default index.
# Get the index of DataFrame
print("Get the index of DataFrame:\n", df.index)
This prints RangeIndex(start=0, stop=4, step=1). The property returns whichever Index type the DataFrame holds; since this DataFrame uses the default index, the result is a RangeIndex. You can access the individual index values with any looping technique, such as a Python for loop.
# Get the index of the DataFrame using a for loop
for i in df.index:
    print(i)
# Output:
# 0
# 1
# 2
# 3
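The introduction mentions that custom values can be assigned to the index. As a rough sketch, the labels r1, r2, and r3 below are hypothetical and chosen only for illustration; with them, .index returns a general Index instead of a RangeIndex, and the same loop yields the labels.

```python
import pandas as pd

# Hypothetical custom row labels passed through the index argument
df_custom = pd.DataFrame(
    {"Courses": ["Spark", "PySpark", "Python"], "Fee": [20000, 25000, 22000]},
    index=["r1", "r2", "r3"],
)

# The property now returns a label-based Index rather than a RangeIndex
print(df_custom.index)

# Looping over the index yields each label in order
labels = [label for label in df_custom.index]
print(labels)
```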
Get Pandas Index as a List
Sometimes you need the pandas DataFrame index as a list. One way is to take the array from df.index.values and pass it to list(), which returns the index values as a list.
# Get the index using index.values
print(list(df.index.values))
# Output (NumPy 2.x may print each element as np.int64(...) instead):
# [0, 1, 2, 3]
Get Pandas Index Using tolist()
Alternatively, the tolist() method returns the index of the DataFrame as a list of plain Python values. For example,
# Get the index as List using tolist()
print(df.index.values.tolist())
# Output:
# [0, 1, 2, 3]
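The two approaches give equal lists, but the element types differ. The sketch below assumes a default integer index and calls tolist() directly on df.index, which also works without going through .values.

```python
import pandas as pd

df = pd.DataFrame({"Fee": [20000, 25000, 22000, 30000]})

# list() over the NumPy array keeps NumPy scalar elements
from_values = list(df.index.values)

# tolist() converts the elements to built-in Python ints
from_tolist = df.index.tolist()

print(type(from_values[0]).__name__)
print(type(from_tolist[0]).__name__)
```

Plain Python ints are usually the safer choice when the list is later serialized, for example to JSON.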
Get Column Index Using the get_loc()
The examples above retrieve the row index of a DataFrame. You can also get the position of a DataFrame column by using the get_loc() method of df.columns. Pass the label of the column whose position you want, and it returns the integer location.
# Get the column index using get_loc()
print(df.columns.get_loc('Fee'))
# Output:
# 1
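To look up several columns at once, one option is the get_indexer() method of the column Index. This is a sketch rather than part of the original walkthrough, and the column name Price is a deliberately missing, hypothetical label used to show the -1 result.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
    "Discount": [1000, 2300],
})

# Positions of several column labels in one call
positions = df.columns.get_indexer(["Fee", "Discount"])
print(positions.tolist())

# A label that does not exist is reported as -1 instead of raising KeyError
missing = df.columns.get_indexer(["Price"])
print(missing.tolist())
```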
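Get Row Index Using the NumPy where() Function
You can also find rows by passing a condition to the numpy.where() function, which returns the integer positions of the rows where the condition is true. With the default index used here, those positions equal the index labels. Import NumPy first to use its functions.
# Get the index values using np.where()
import numpy as np
print(list(np.where(df["Discount"] > 1200)))
# Output (the dtype suffix appears only on some platforms):
# [array([1, 3], dtype=int64)]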
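Because np.where() reports positions rather than labels, the two differ once the index is not the default one. The following sketch uses hypothetical labels a through d to show how the positions can be mapped back to index labels.

```python
import numpy as np
import pandas as pd

# Hypothetical custom labels so that positions and labels differ
df = pd.DataFrame(
    {"Discount": [1000, 2300, 1200, 2000]},
    index=["a", "b", "c", "d"],
)

mask = df["Discount"] > 1200

# np.where returns a tuple; its first element holds the matching positions
positions = np.where(mask)[0]

# Map the positions back to index labels, or filter first and read the index
labels_from_positions = df.index[positions].tolist()
labels_from_filter = df[mask].index.tolist()

print(positions.tolist())
print(labels_from_positions)
```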
Frequently Asked Questions on Get the Index Of DataFrame
You can retrieve the index of a DataFrame using the index attribute. For example, df_index = df.index
You can convert the index to a list using the tolist() method. For example, list_index = df.index.tolist()
You can access a specific index value by its position using square brackets on the index. For example, first_index_value = df.index[0]
If you want to reset the index and create a new default integer index, you can use the reset_index() method. For example, df_reset = df.reset_index()
You can set a specific column as the index using the set_index() method. For example, df_set_index = df.set_index('column_name')
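The last two answers can be seen working together in a short sketch. It assumes a small stand-in DataFrame and uses the Courses column as the index only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Python"],
    "Fee": [20000, 25000, 22000],
})

# Move the Courses column into the index; a new DataFrame is returned
df_by_course = df.set_index("Courses")
print(df_by_course.index.tolist())

# Restore a default integer index; Courses becomes a regular column again
df_reset = df_by_course.reset_index()
print(df_reset.index.tolist())
```

Both methods return a new DataFrame by default, so the original df is left unchanged.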
Conclusion
In this article, I have explained how to get the index of a pandas DataFrame by using the .index property, index.values, the tolist() method, and the NumPy where() function, with examples.
Happy Learning !!