How do you get the index from a pandas DataFrame? The DataFrame.index property returns the index of the DataFrame. A pandas Index is an immutable sequence used for indexing DataFrame and Series objects. The DataFrame index is also referred to as the row index. By default, pandas creates it as a sequence of integers that starts at 0 and increments by 1, but you can also assign custom values to the index.
Using the index, we can select rows from a DataFrame or add a row at a specified index. We can also get the index itself by using the .index property. In this article, I will explain the index property, how to use it to get the index of a DataFrame, and how to get the index as a list using index.values.
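To make the idea of a custom index concrete, here is a minimal sketch. The row labels r1, r2, r3 and the small DataFrame are hypothetical values chosen only for illustration. It shows a custom index, selecting a row by its label, and what happens when we try to change a single element of the immutable Index.

```python
import pandas as pd

# Hypothetical data and row labels, used only for illustration
df = pd.DataFrame(
    {"Courses": ["Spark", "PySpark", "Python"], "Fee": [20000, 25000, 22000]},
    index=["r1", "r2", "r3"],  # custom row labels instead of the default 0, 1, 2
)

# Get the custom index
print(df.index)

# Select a row by its index label
fee_r2 = df.loc["r2", "Fee"]
print(fee_r2)

# An Index is immutable: assigning to one element raises TypeError
try:
    df.index[0] = "x"
    mutated = True
except TypeError as err:
    mutated = False
    print("Index is immutable:", err)
```

To change the labels, assign a whole new index (for example with df.index = [...]) rather than editing one element in place.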
1. Quick Examples of getting Index from Pandas DataFrame
If you are in a hurry, below are some quick examples of how to get an index from DataFrame.
# Below are quick examples (df is the DataFrame created in section 2)
import numpy as np
# Example 1: Get the index using the df.index property
print(df.index)
# Example 2: Get the index as a list using index.values
print(list(df.index.values))
# Example 3: Get the index as a list using tolist()
print(df.index.values.tolist())
# Example 4: Get the column index using get_loc()
print(df.columns.get_loc('Fee'))
# Example 5: Get the row positions that match a condition using np.where()
print(list(np.where(df["Discount"] > 1200)))
2. Get Index from Pandas DataFrame
You can get the index from a pandas DataFrame by using the .index property, which returns an Index object (not a Series). Let’s create a DataFrame from a Python dictionary and then call the index property on it. Because we did not specify an index when creating the DataFrame, pandas creates the default one, and the index property returns it in full.
# Create DataFrame
import pandas as pd
technologies = {
'Courses':["Spark","PySpark","Python","pandas"],
'Fee' :[20000,25000,22000,30000],
'Duration':['30days','40days','35days','50days'],
'Discount':[1000,2300,1200,2000]
}
df = pd.DataFrame(technologies)
print(df)
# Get the index of DataFrame
print(df.index)
Yields below output.
# Output:
Courses Fee Duration Discount
0 Spark 20000 30days 1000
1 PySpark 25000 40days 2300
2 Python 22000 35days 1200
3 pandas 30000 50days 2000
RangeIndex(start=0, stop=4, step=1)
The type of the returned object depends on the index. Since our DataFrame has the default range index, the property returned a RangeIndex. Using a Python for loop, we can access the individual index values of the DataFrame.
# Get the index of DataFrame using a for loop
for i in df.index:
    print(i)
# Output:
# 0
# 1
# 2
# 3
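Besides looping, an Index supports positional access, slicing, and len(), much like a Python sequence. The snippet below is a small sketch that assumes a DataFrame with the default range index. The variable names are my own.

```python
import pandas as pd

# A small DataFrame with the default RangeIndex
df = pd.DataFrame({"Courses": ["Spark", "PySpark", "Python", "pandas"]})

first = df.index[0]     # first index value
last = df.index[-1]     # last index value
count = len(df.index)   # number of rows
middle = df.index[1:3]  # slicing returns another index object

print(first, last, count)
print(list(middle))
```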
3. Get Pandas Index as a List
Sometimes you may need the pandas DataFrame index as a list. We can get the index values as a NumPy array by using df.index.values. Let’s pass this to list(), which returns the index as a list.
# Get the index as a list using index.values
print(list(df.index.values))
# Output:
# [0, 1, 2, 3]
4. Get Pandas Index using tolist()
Alternatively, we can call the tolist() method on the index values to return the index of the DataFrame as a list. For example:
# Get the index as List using tolist()
print(df.index.values.tolist())
# Output:
# [0, 1, 2, 3]
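The two approaches differ in one small way. list(df.index.values) keeps the elements as NumPy integers, while tolist() converts them to plain Python integers. Newer NumPy versions also print the NumPy integers differently, so the displayed list may not look identical on your machine. The sketch below checks the element types, and it also calls tolist() directly on the index, which avoids going through .values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Fee": [20000, 25000, 22000, 30000]})

# list() over the NumPy array keeps NumPy integer elements
as_numpy_items = list(df.index.values)

# tolist() converts the elements to plain Python ints
as_python_items = df.index.values.tolist()

# tolist() can also be called on the Index itself
direct = df.index.tolist()

print(type(as_numpy_items[0]))
print(type(as_python_items[0]))
print(direct)
```

If the list is going to be serialized (for example to JSON), plain Python integers are usually the safer choice.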
5. Get Column Index Using get_loc()
So far we have seen how to retrieve the row index of a DataFrame. We can also get the position of a DataFrame column by using the get_loc() method of df.columns. Pass the label of the column to get_loc(), and it returns the integer position of that column.
# Get the column index using get_loc()
print(df.columns.get_loc('Fee'))
# Output:
# 1
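get_loc() handles one label at a time and raises a KeyError when the label does not exist. As a hedged sketch, the snippet below shows that behavior and also uses get_indexer(), which looks up several labels at once and returns -1 for labels it cannot find. The label "Missing" is a hypothetical name used to trigger the not-found case.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
    "Discount": [1000, 2300],
})

# A label that does not exist raises KeyError
try:
    df.columns.get_loc("Missing")
    found = True
except KeyError:
    found = False
    print("Column 'Missing' not found")

# Look up several column positions at once; -1 marks a missing label
positions = df.columns.get_indexer(["Fee", "Discount", "Missing"])
print(positions.tolist())
```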
6. Get Row Index Using the NumPy where() Function
We can also get the index of the rows that satisfy a condition by passing the condition to the numpy.where() function. It returns a tuple of arrays that hold the integer positions of the matching rows. Let’s import the NumPy library to use it.
# Get the index values using np.where()
import numpy as np
print(list(np.where(df["Discount"] > 1200)))
# Output:
# [array([1, 3], dtype=int64)]
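Keep in mind that np.where() reports integer positions, not index labels. With the default range index the two are the same, but with a custom index they differ. The sketch below assumes hypothetical labels r1 to r4 and shows one way to get the matching labels by filtering the DataFrame with the same condition.

```python
import numpy as np
import pandas as pd

# Hypothetical custom row labels, used only for illustration
df = pd.DataFrame(
    {"Discount": [1000, 2300, 1200, 2000]},
    index=["r1", "r2", "r3", "r4"],
)

mask = df["Discount"] > 1200

# Integer positions of the matching rows
positions = np.where(mask)[0].tolist()

# Index labels of the matching rows
labels = df[mask].index.tolist()

print(positions)
print(labels)
```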
7. Conclusion
In this article, I have explained how to get the index of a pandas DataFrame by using the .index property, index.values, the tolist() method, the get_loc() method, and the NumPy where() function, with examples.
Happy Learning !!