  • Post category:Pandas
  • Post last modified:March 27, 2024
Pandas – Get Column Index For Column Name

In Pandas, you can get the index (zero-based position) of a column from its name by calling get_loc() on the DataFrame's columns. DataFrame.columns returns all column labels of the DataFrame as an Index, and get_loc() is a method of Index that returns the position of a given label. In this article, I will explain different ways to get column indices from column names, with examples.

1. Quick Examples of Column Index From Column Name

If you are in a hurry, below are some quick examples of how to get the column index from the column name in Pandas DataFrame.


# Quick examples of column index from column name

# Example 1: Get column index 
# From column name, i.e. the third column (index 2)
idx=df.columns.get_loc("Duration")
print("Column Index : "+ str(idx))

# Example 2: Dictionary of column name 
# With associated index
idx_dic = {}
for col in df.columns:
    idx_dic[col] = df.columns.get_loc(col)
print(idx_dic)

# Example 3: Get index for multiple column labels/names
query_cols=['Fee','Courses']
cols_index = [df.columns.get_loc(col) for col in query_cols]
print(cols_index)

# Example 4: Column index from column name 
# Using get_indexer().
cols_index = df.columns.get_indexer(query_cols)
print(cols_index)

Now, let’s create a Pandas DataFrame to work with. Our DataFrame contains the column names Courses, Fee, Duration, and Discount.


# Create DataFrame
import pandas as pd
technologies = {
        'Courses':["Spark","PySpark","Python","pandas"],
        'Fee' :[20000,25000,22000,30000],
        'Duration':['30days','40days','35days','50days'],
        'Discount':[1000,2300,1200,2000]
              }
df = pd.DataFrame(technologies)
print("Create DataFrame:\n",df)

Running this prints the DataFrame, which has four rows and the columns Courses, Fee, Duration, and Discount.

2. Get Column Index From Column Name by get_loc()

DataFrame.columns returns all column labels of the DataFrame as an Index, and Index.get_loc() returns the zero-based integer position of a given column label.

2.1 Syntax of Index.get_loc()

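Following is the syntax of Index.get_loc().


# Syntax of the Index.get_loc() method.
# The method and tolerance parameters were deprecated in pandas 1.4
# and removed in pandas 2.0, where the signature is Index.get_loc(key).
Index.get_loc(key, method=None, tolerance=None)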

In the example below, get_loc() is called on the columns attribute of the DataFrame (df). It takes the column name “Duration” as an argument and returns the zero-based position of that column, which is then assigned to the variable idx.

str(idx) converts the integer to a string so that it can be concatenated with the rest of the printed message.


# Get column index from column name, i.e. the third column (index 2).
idx=df.columns.get_loc("Duration")
print("Column Index : "+ str(idx))

Yields below output.


# Output:
# Column Index : 2
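Note that get_loc() raises a KeyError when the label is not among the columns. The sketch below shows one way to guard against that and then use the returned position with iloc. The helper name column_position and the column name "Tutor" are hypothetical, chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
})

def column_position(frame, name):
    """Hypothetical helper: return the position of `name`, or None if absent."""
    try:
        return frame.columns.get_loc(name)
    except KeyError:
        # get_loc() raises KeyError for labels that are not in the Index
        return None

duration_pos = column_position(df, "Duration")
missing_pos = column_position(df, "Tutor")  # "Tutor" is not a column here

# Use the integer position to select the column with iloc
duration_values = df.iloc[:, duration_pos].tolist()

print(duration_pos, missing_pos, duration_values)
```

Returning None is only one possible design; re-raising with a clearer message may suit stricter code better.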

3. Using Dictionary of Column Name With Associated Index

You can also build a dictionary that maps each column name (key) to its index (value) by looping over df.columns and storing the result of get_loc() in idx_dic. For example:


# Dictionary of Column name with associated index.
idx_dic = {}
for col in df.columns:
    idx_dic[col] = df.columns.get_loc(col)
print(idx_dic)

Yields below output.


# Output:
{'Courses': 0, 'Fee': 1, 'Duration': 2, 'Discount': 3}
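As a possible alternative, the same mapping can be built with enumerate() and a dictionary comprehension, which avoids calling get_loc() once per column. This is a minimal sketch that assumes the column names are unique; with duplicate names, later positions would overwrite earlier ones.

```python
import pandas as pd

# An empty DataFrame with the article's column names is enough here
df = pd.DataFrame(columns=["Courses", "Fee", "Duration", "Discount"])

# enumerate() yields (position, label) pairs in column order
idx_dic = {col: pos for pos, col in enumerate(df.columns)}
print(idx_dic)
```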

4. Get Index for Multiple Column Labels/Names

get_loc() accepts a single label, not a list, so to get the indices for multiple column labels/names in a DataFrame you call it once per label.

The example below uses a list comprehension to iterate over the specified columns (query_cols) and retrieve each index with the get_loc() method.


# Get Index for Multiple Column Labels/Names
query_cols=['Fee','Courses']
cols_index = [df.columns.get_loc(col) for col in query_cols]
print(cols_index)

# Output:
# [1, 0]
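The examples above assume that every column name is unique. When a name is duplicated, get_loc() no longer returns a single integer (it may return a slice or a boolean mask). As a sketch of one way to get plain integer positions in that case, you can compare the columns with the label and take the matching positions using NumPy. The DataFrame with a repeated "Fee" column below is made up for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative DataFrame with a duplicated column name
df = pd.DataFrame([[20000, "Spark", 21000]], columns=["Fee", "Courses", "Fee"])

# Comparing the Index with a label gives a boolean array;
# flatnonzero() returns the positions where it is True
fee_positions = np.flatnonzero(df.columns == "Fee").tolist()
print(fee_positions)
```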

5. Get Column Index From Column Name Using get_indexer()

In Pandas, you can use the get_indexer() method to get the indices for multiple column names in a single call. It returns a NumPy array of integer positions, one per requested label, with -1 for any label that is not found.

In the example below, query_cols is a list of column names for which you want the indices, and get_indexer() returns them in the same order.


# Column index from column name 
# Using get_indexer()
query_cols=['Fee','Courses']
cols_index = df.columns.get_indexer(query_cols)
print(cols_index)

# Output:
# [1 0]
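Unlike get_loc(), get_indexer() does not raise an error for an unknown label; it marks it with -1. The sketch below shows one way to separate found and missing names. The column name "Tutor" is a hypothetical missing label, and the example assumes unique column names, which get_indexer() requires.

```python
import pandas as pd

df = pd.DataFrame(columns=["Courses", "Fee", "Duration", "Discount"])

# "Tutor" is deliberately not a column of df
query_cols = ["Fee", "Tutor", "Courses"]
positions = df.columns.get_indexer(query_cols)

# -1 marks labels that were not found
missing = [col for col, pos in zip(query_cols, positions) if pos == -1]
valid_positions = positions[positions != -1].tolist()

print(positions.tolist(), missing, valid_positions)
```

Checking for -1 before using the positions matters, because passing -1 to iloc would silently select the last column.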

6. Complete Examples


# Get Column Index From Column Name in Pandas
import pandas as pd
technologies = {
        'Courses':["Spark","PySpark","Python","pandas"],
        'Fee' :[20000,25000,22000,30000],
        'Duration':['30days','40days','35days','50days'],
        'Discount':[1000,2300,1200,2000]
              }
df = pd.DataFrame(technologies)
print(df)

# Get column index from column name, i.e. the third column (index 2).
idx=df.columns.get_loc("Duration")
print("Column Index : "+ str(idx))

# Dictionary of Column name with associated index.
idx_dic = {}
for col in df.columns:
    idx_dic[col] = df.columns.get_loc(col)
print(idx_dic)

# Get Index for Multiple Column Labels/Names
query_cols=['Fee','Courses']
cols_index = [df.columns.get_loc(col) for col in query_cols]
print(cols_index)

# Column index from column name using get_indexer().
cols_index = df.columns.get_indexer(query_cols)
print(cols_index)

Frequently Asked Questions on Get Column Index For Column Name

How can I get the column index for a specific column name in a Pandas DataFrame?

You can get the column index for a specific column name in a Pandas DataFrame using the get_loc() method of the columns Index. For example, df.columns.get_loc("Duration") returns 2, the zero-based position of the Duration column in the DataFrame used in this article.

Can I get the index for multiple column names at once?

You can get the indices for multiple column names at once. One way to achieve this is by using a list of column names and a list comprehension. For example, the get_loc() method is applied for each column name in the query_cols list, and the resulting indices are stored in the cols_index list. The output will show the indices corresponding to the specified columns.

Is there an alternative method to get indices for multiple columns?

An alternative method to get indices for multiple columns is to use the get_indexer() method. This method efficiently returns an array of indices for a list of column names.

How can I create a dictionary mapping column names to their indices?

You can create a dictionary mapping column names to their indices by iterating through the columns of the DataFrame and using the get_loc() method for each column.

Conclusion

In this article, you have learned how to get the column index from a column name by using get_loc() and get_indexer(). To get the indices for multiple column names, either call get_loc() once per name (for example, in a list comprehension) or pass the list of names to get_indexer().
