How to Slice Columns in Pandas DataFrame

  • Post category: Pandas
  • Post last modified: April 2, 2024

Use DataFrame.loc[] and DataFrame.iloc[] to slice columns in a pandas DataFrame: loc[] works with column labels (names), and iloc[] works with integer column positions. The same accessors can also select rows from a pandas DataFrame.

In this article, I will explain how to slice or select a subset of a DataFrame by column labels, by column positions, and by ranges, with examples.

Key Points –

  • Use the bracket notation with the column name to slice a single column.
  • Use the loc or iloc accessor to slice rows based on index labels or integer positions respectively, and specify the desired columns by name or index.
  • Utilize boolean indexing to slice rows based on conditions and select specific columns simultaneously.
  • Employ the slice object within iloc to slice both rows and columns simultaneously.
  • Take advantage of the loc and iloc accessors to slice columns based on labels or integer positions respectively, allowing for versatile column selection.
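
The key points above can be sketched in a few lines. This is a minimal illustration that assumes the same sample data used later in this article; the variable names fee, expensive, and block are made up for the example.

```python
import pandas as pd

# Sample data matching the DataFrame used later in the article
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
    "Discount": [1000, 2300],
})

# Bracket notation: a single column comes back as a Series
fee = df["Fee"]

# Boolean indexing on rows combined with a column list in one loc[] call
expensive = df.loc[df["Fee"] > 20000, ["Courses", "Fee"]]

# Slice objects inside iloc[]: first row, columns at positions 1 and 2
block = df.iloc[slice(0, 1), slice(1, 3)]

print(fee)
print(expensive)
print(block)
```

Writing slice(0, 1) is equivalent to the 0:1 shorthand; the explicit form is mainly useful when the slice is built elsewhere and passed in as a variable.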

Quick Examples of Column-Slices of Pandas DataFrame

If you are in a hurry, below are some quick examples of how to take column slices of pandas DataFrame.


# Quick examples of column-slices 

# Example 1: Using loc[] to take column slices
# Slice selected multiple columns
df2=df.loc[:, ["Courses","Fee","Duration"]]

# Example 2: Slice selected non-adjacent columns
df2=df.loc[:, ["Courses","Fee","Discount"]]

# Example 3: Slice columns by range
df2=df.loc[:,'Fee':'Discount']
df2=df.loc[:,'Duration':]
df2=df.loc[:,:'Duration']

# Example 4: Slice every alternate column
df2 = df.loc[:,::2]

# Example 5: Using iloc[] to take column slices
# Slice by selected column position
df2 = df.iloc[:,[1,2,3]]

# Example 6: Slice between positions 1 and 4 (selects 1, 2, 3)
df2 = df.iloc[:,1:4]

# Example 7: Slice From 3rd to end
df2 = df.iloc[:,2:]

# Example 8: Slice First Two Columns
df2 = df.iloc[:,:2]

Now, let’s create a DataFrame with a few rows and columns and execute some examples of how to slice columns in pandas. Our DataFrame contains column names Courses, Fee, Duration, and Discount.


# Create DataFrame
import pandas as pd
technologies = {
    'Courses':["Spark","PySpark"],
    'Fee' :[20000,25000],
    'Duration':['30days','40days'],
    'Discount':[1000,2300]
              }
df = pd.DataFrame(technologies)
print("Create DataFrame:\n", df)

Yields below output.

# Output:
# Create DataFrame:
#     Courses    Fee Duration  Discount
# 0    Spark  20000   30days      1000
# 1  PySpark  25000   40days      2300

Using Pandas.DataFrame.loc[] – Slice Columns by Names or Labels

By using pandas.DataFrame.loc[] you can slice columns by names or labels. The syntax is df.loc[:,start:stop:step], where start is the label of the first column to include, stop is the label of the last column to include (unlike positional slicing, the stop label is part of the result), and step is the number of positions to advance after each selected column, so a step of 2 selects every alternate column. Alternatively, employ the syntax df.loc[:, [labels]], where labels is a list of column names.


# loc[] syntax to slice columns
df.loc[:,start:stop:step]
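
As a rough sketch of how the start:stop:step form behaves with labels, the snippet below assumes the four-column sample data from this article. The names by_range and stepped are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
    "Discount": [1000, 2300],
})

# Label slices include the stop label, so 'Duration' is part of the result
by_range = df.loc[:, "Fee":"Duration"]

# start, stop, and step together: every second column from 'Courses' to 'Discount'
stepped = df.loc[:, "Courses":"Discount":2]

print(list(by_range.columns))  # ['Fee', 'Duration']
print(list(stepped.columns))   # ['Courses', 'Duration']
```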

Slice DataFrame Columns by Labels

To slice DataFrame columns by labels or names, provide the labels you want as a list. Here we use a list of labels instead of the start:stop:step approach.


# Slice Columns by labels
df1 = df.loc[:, ["Courses","Fee","Duration"]]
print("Get selection of columns by labels:\n", df1)

Yields below output.

# Output:
# Get selection of columns by labels:
#     Courses    Fee Duration
# 0    Spark  20000   30days
# 1  PySpark  25000   40days

Slice Certain Selective Columns in Pandas

If you want to select an arbitrary set of columns that are not next to each other in a Pandas DataFrame, pass the column names or labels as a list to loc[].


# Slice by Certain Columns
df.loc[:, ["Courses","Fee","Discount"]]

# Output:
#   Courses    Fee  Discount
# 0    Spark  20000      1000
# 1  PySpark  25000      2300
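
Two details of list-based selection are worth seeing in code: the result follows the order of the list, and a label that does not exist raises a KeyError in recent pandas versions (1.0 and later). The sketch below assumes the article's sample data; "Tutor" is deliberately a column this DataFrame does not have.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
    "Discount": [1000, 2300],
})

# Columns come back in the order given in the list, not the original order
picked = df.loc[:, ["Discount", "Courses"]]
print(list(picked.columns))  # ['Discount', 'Courses']

# A label that is not in df.columns raises KeyError
try:
    df.loc[:, ["Courses", "Tutor"]]
    missing_raised = False
except KeyError as exc:
    missing_raised = True
    print("Missing label:", exc)
```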

Slice DataFrame Columns by Range

When slicing a DataFrame by the range of columns in Pandas, you can specify the start and stop column names.

  • If you don’t provide a start column, loc[] selects columns from the beginning.
  • If you don’t provide a stop column, loc[] selects all columns from the start label to the end.
  • By providing both start and stop column names, loc[] selects all columns in between, inclusive of both start and stop.

# Slice all columns between Fee and Discount columns
df2 = df.loc[:,'Fee':'Discount']

# Output
#     Fee Duration  Discount
# 0  20000   30days      1000
# 1  25000   40days      2300

# Slice start from 'Duration' column
df2 = df.loc[:,'Duration':]

# Output:
#   Duration  Discount
# 0   30days      1000
# 1   40days      2300

# Slice Start from beginning and end at 'Duration' column
df2 = df.loc[:,:'Duration']

# Output:
#   Courses    Fee Duration
# 0    Spark  20000   30days
# 1  PySpark  25000   40days

Select Every Alternate Column

Similarly, using loc[], to select every alternate column in a Pandas DataFrame, you can use Python’s slicing notation with a step size of 2.


# Slice every alternate column
df2 = df.loc[:,::2]

# Output:
#    Courses Duration
# 0    Spark   30days
# 1  PySpark   40days

In the above example, the loc[] accessor selects all rows (:) and every other column (::2), starting from the first column. The resulting DataFrame df2 contains every other column from the original DataFrame df.
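
The step does not have to start at the first column. As a small sketch on the same sample data, a start label shifts which columns are picked, and a negative step reverses the column order. The names from_fee and reversed_cols are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
    "Discount": [1000, 2300],
})

# Every second column, starting at 'Fee' instead of the first column
from_fee = df.loc[:, "Fee"::2]

# A step of -1 walks the columns from last to first
reversed_cols = df.loc[:, ::-1]

print(list(from_fee.columns))       # ['Fee', 'Discount']
print(list(reversed_cols.columns))  # ['Discount', 'Duration', 'Fee', 'Courses']
```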

Pandas DataFrame.iloc[] – Column Slices by Index or Position

By using pandas.DataFrame.iloc[] you can slice a DataFrame by column position/index. Remember that positions start from 0. The syntax is df.iloc[:,start:stop:step], where start is the position of the first column to include, stop is the position at which to stop (exclusive, so the column at stop is not included), and step is the number of positions to advance after each selected column, enabling selection of columns at regular intervals.

Alternatively, you can use the syntax df.iloc[:, [indices]] with indices as a list of column indices to include.
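
The following sketch puts the two iloc[] forms side by side and shows one way to translate a label range into positions with Index.get_loc. It assumes the article's four-column sample data; because the iloc[] stop is exclusive, 1 is added to reproduce the inclusive loc[] range.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
    "Discount": [1000, 2300],
})

# List of positions: first and last column
by_list = df.iloc[:, [0, 3]]

# start:stop:step with positions: columns 0 and 2
by_step = df.iloc[:, 0:4:2]

# Convert labels to positions, then slice by position
start = df.columns.get_loc("Fee")
stop = df.columns.get_loc("Discount")
same_as_loc = df.iloc[:, start:stop + 1]

print(list(by_list.columns))  # ['Courses', 'Discount']
print(list(by_step.columns))  # ['Courses', 'Duration']
print(same_as_loc.equals(df.loc[:, "Fee":"Discount"]))  # True
```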

Slice Columns by Index Position

Here we select columns by index position to retrieve a slice of the DataFrame. The example below retrieves the "Fee", "Duration", and "Discount" columns.


# Slice by selected column position
df1 = df.iloc[:,[1,2,3]]
print("Get selection of columns by indexes:\n", df1)

Yields below output.

# Output:
# Get selection of columns by indexes:
#       Fee Duration  Discount
# 0  20000   30days      1000
# 1  25000   40days      2300

Column Slices by Position Range

Like slices by column labels, you can also slice a DataFrame by a range of positions. For instance, the first example below uses iloc[] to slice the DataFrame columns from position 1 up to position 4 (exclusive). The colon : before the comma indicates that we’re selecting all rows, while 1:4 specifies the range of column positions to select. The resulting DataFrame will include columns at positions 1, 2, and 3.


# Slice between indexes 1 and 4 (1, 2, 3)
print(df.iloc[:,1:4])

# Output:
#     Fee Duration  Discount
# 0  20000   30days      1000
# 1  25000   40days      2300

# Slice From 3rd to end
print(df.iloc[:,2:])

# Output:
#   Duration  Discount
# 0   30days      1000
# 1   40days      2300

# Slice First Two Columns
print(df.iloc[:,:2])

# Output:
#   Courses    Fee
# 0    Spark  20000
# 1  PySpark  25000

To retrieve the last column, use df.iloc[:,-1:], and to retrieve just the first column, use df.iloc[:,:1].
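
Note that the slice form and the scalar form return different types. As a brief sketch on the article's sample data, df.iloc[:,-1:] keeps a one-column DataFrame, while df.iloc[:,-1] returns a Series.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
    "Discount": [1000, 2300],
})

# Slice form: still a DataFrame, with a single column
last_df = df.iloc[:, -1:]

# Scalar form: a Series named after the column
last_series = df.iloc[:, -1]

# Negative start also works for the last two columns
last_two = df.iloc[:, -2:]

print(type(last_df).__name__, list(last_df.columns))  # DataFrame ['Discount']
print(type(last_series).__name__, last_series.name)   # Series Discount
print(list(last_two.columns))                         # ['Duration', 'Discount']
```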

Complete Example


import pandas as pd
technologies = {
    'Courses':["Spark","PySpark"],
    'Fee' :[20000,25000],
    'Duration':['30days','40days'],
    'Discount':[1000,2300],
    'Tutor':['Michel','Sam']
              }
df = pd.DataFrame(technologies)
print(df)

# Slice selected multiple columns
print(df.loc[:, ["Courses","Fee","Duration"]])

# Slice selected non-adjacent columns
print(df.loc[:, ["Courses","Fee","Discount"]])

# Slice columns by range
print(df.loc[:,'Fee':'Discount']) 
print(df.loc[:,'Duration':])
print(df.loc[:,:'Duration'])

# Slice every alternate column
print(df.loc[:,::2])

# Slice by selected column position
print(df.iloc[:,[1,3,4]])

# Slice between positions 1 and 4 (selects 1, 2, 3)
print(df.iloc[:,1:4])

# Slice From 3rd to end
print(df.iloc[:,2:])

# Slice First Two Columns
print(df.iloc[:,:2])

FAQ on Slice Columns in Pandas DataFrame

How do I slice specific columns by name in a Pandas DataFrame?

You can slice specific columns by name using the loc[] accessor. Example: df.loc[:, ["column1", "column2", "column3"]]

How can I select columns based on their position or index in a DataFrame?

To select columns based on their position or index in a DataFrame, you can use the iloc[] accessor in pandas. For instance, df.iloc[:, 0:3] selects the first three columns.

How do I slice every other column in a DataFrame?

To slice every other column in a DataFrame, you can use slicing notation with a step size of 2. For instance, df.iloc[:, ::2] selects every alternate column.

What’s the syntax to select the last column in a DataFrame?

To select the last column in a DataFrame, you can use negative indexing with iloc[]. For example, df.iloc[:,-1:] selects the last column as a one-column DataFrame. The -1 index refers to the last column.

How do I select just the first column in a DataFrame?

You can use df.iloc[:,:1] to select just the first column. The slice :1 stops before position 1, so only the column at position 0 is returned.

How do I slice columns based on a specific range of index positions?

To slice columns based on a specific range of index positions in a DataFrame, you can use slicing notation within the iloc[] accessor. For instance, df.iloc[:, 1:4] selects columns at index positions 1, 2, and 3.
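
One practical difference between a position range and a position list is how out-of-range values are handled. The sketch below assumes the four-column sample DataFrame from earlier in this article: a slice that runs past the last column is clipped, while a list containing a position that does not exist raises an IndexError.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
    "Discount": [1000, 2300],
})

# A slice past the last column is clipped to the columns that exist
clipped = df.iloc[:, 1:10]
print(list(clipped.columns))  # ['Fee', 'Duration', 'Discount']

# A list with a non-existent position (4) raises IndexError on four columns
try:
    df.iloc[:, [1, 3, 4]]
    out_of_range = False
except IndexError:
    out_of_range = True
print(out_of_range)  # True
```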

Conclusion

In this article, you have learned how to take column slices of a pandas DataFrame using the DataFrame.loc[] and DataFrame.iloc[] accessors with multiple approaches.

Happy Learning !!

Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them.