Pandas Create New DataFrame By Selecting Specific Columns

Post category: Pandas
Post last modified: May 28, 2024

To create a new DataFrame by selecting specific columns from an existing pandas DataFrame, you can combine column selection with methods such as DataFrame.copy(), DataFrame.filter(), DataFrame.transpose(), and DataFrame.assign(). The DataFrame.iloc[] and DataFrame.loc[] indexers can also select columns, by position and by label respectively. In this article, I will explain how to select a single column or multiple columns to create a new pandas DataFrame, with detailed examples.

Quick Examples to Create New DataFrame by Selecting Specific Columns

Following are quick examples of creating a new DataFrame by selecting specific columns.


# Quick examples to create a new DataFrame
# (df is the sample DataFrame created in the next section)

# Using DataFrame.copy() to create a new DataFrame
df2 = df[['Courses', 'Fee']].copy()

# Using DataFrame.filter() method
df2 = df.filter(['Courses', 'Fee'], axis=1)

# Using DataFrame.transpose() method
df2 = pd.DataFrame([df.Courses, df.Fee]).transpose()

# Using DataFrame.iloc[] to select columns by position,
# then df.copy() to create a new DataFrame
df2 = df.iloc[:, [1, 2]].copy()

# Using DataFrame.loc[] to create a new DataFrame without specific columns
df2 = df.loc[:, df.columns.drop(['Courses', 'Discount'])]

# Create a new DataFrame of specific columns by DataFrame.assign() method
df2 = pd.DataFrame().assign(Courses=df['Courses'], Duration=df['Duration'])

# Create a new pandas DataFrame with a list of column names
df2 = df[['Courses', 'Fee']]

To run some examples of creating a new Pandas DataFrame by selecting specific columns, let’s create a Pandas DataFrame using data from a dictionary.


# Create a Pandas DataFrame.
import pandas as pd
technologies = {
    'Courses': ["Spark", "PySpark", "Python", "pandas"],
    'Fee': [20000, 25000, 22000, 30000],
    'Duration': ['30days', '40days', '35days', '50days'],
    'Discount': [1000, 2300, 1200, 2000]
}
df = pd.DataFrame(technologies)
print(df)

This yields the output below.


# Output:
   Courses    Fee Duration  Discount
0    Spark  20000   30days      1000
1  PySpark  25000   40days      2300
2   Python  22000   35days      1200
3   pandas  30000   50days      2000

Using DataFrame.copy() to Create New DataFrame

The pandas DataFrame.copy() method returns a copy of the DataFrame (a deep copy by default, so the data is not shared with the original). Select the columns you need from the original DataFrame with df[[...]], then call copy() on the result to create an independent new DataFrame.


# Using DataFrame.copy() to create a new DataFrame.
df2 = df[['Courses', 'Fee']].copy()
print(df2)

This yields the output below.


# Output:
   Courses    Fee
0    Spark  20000
1  PySpark  25000
2   Python  22000
3   pandas  30000
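
To illustrate why copy() matters, the minimal sketch below changes one value in the new DataFrame and then prints the same cell from the original. It rebuilds the sample DataFrame so it can run on its own; the replacement value 99999 is arbitrary and chosen only for illustration.

```python
import pandas as pd

# Rebuild the sample DataFrame used throughout this article.
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Python", "pandas"],
    'Fee': [20000, 25000, 22000, 30000],
    'Duration': ['30days', '40days', '35days', '50days'],
    'Discount': [1000, 2300, 1200, 2000],
})

# Select two columns and take an explicit, independent copy.
df2 = df[['Courses', 'Fee']].copy()

# Modify the copy only (99999 is an arbitrary illustrative value).
df2.loc[0, 'Fee'] = 99999

print(df2.loc[0, 'Fee'])  # value in the copy
print(df.loc[0, 'Fee'])   # value in the original, which is unchanged
```

Because df2 is a separate object, later edits to it do not write back into df.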

Alternatively, you can use the DataFrame.filter() method to create a new DataFrame by selecting specific columns. Pass the column labels as a list and set axis=1 to filter on columns.


# Using DataFrame.filter() method.
df2 = df.filter(['Courses', 'Fee'], axis=1)
print(df2)

This yields the same output as above.
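
Besides a list of labels, filter() also accepts the like and regex keyword arguments for matching column names by pattern. The sketch below shows all three on the sample DataFrame; the patterns and the label 'NoSuchColumn' are made up for illustration. One behavior worth knowing is that items silently skips labels that do not exist instead of raising an error.

```python
import pandas as pd

# Rebuild the sample DataFrame used throughout this article.
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Python", "pandas"],
    'Fee': [20000, 25000, 22000, 30000],
    'Duration': ['30days', '40days', '35days', '50days'],
    'Discount': [1000, 2300, 1200, 2000],
})

# Exact labels: same as df.filter(['Courses', 'Fee'], axis=1).
by_items = df.filter(items=['Courses', 'Fee'])

# Substring match: keeps every column whose name contains 'D'.
by_like = df.filter(like='D')

# Regular expression match on the column names.
by_regex = df.filter(regex=r'^(Fee|Discount)$')

# Labels that do not exist are ignored rather than raising a KeyError.
# 'NoSuchColumn' is a hypothetical label used only for this demonstration.
lenient = df.filter(items=['Courses', 'NoSuchColumn'])

print(list(by_like.columns))
print(list(by_regex.columns))
print(list(lenient.columns))
```

If a missing column should be treated as an error, selecting with df[[...]] is the stricter choice, since it raises a KeyError for unknown labels.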

Using DataFrame.transpose() Method

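The DataFrame.transpose() method swaps the index and the columns, so rows become columns and vice versa. Use df.columnname to select each column as a Series, pass the list of Series to the DataFrame constructor (each Series becomes a row), and then transpose the result so that each Series becomes a column again. Note that when the selected columns have different types, as Courses and Fee do here, the resulting columns are stored with the object dtype.


# Using DataFrame.transpose() Method.
df2 = pd.DataFrame([df.Courses, df.Fee]).transpose()
print(df2)

This yields the output below.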


# Output:
   Courses    Fee
0    Spark  20000
1  PySpark  25000
2   Python  22000
3   pandas  30000
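
The printed output looks the same as before, but the column types can differ. The sketch below compares the transpose approach with pd.concat() along axis=1, which is an alternative not used elsewhere in this article, and shows infer_objects() as one possible way to recover a numeric dtype after transposing.

```python
import pandas as pd

# Rebuild the sample DataFrame used throughout this article.
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Python", "pandas"],
    'Fee': [20000, 25000, 22000, 30000],
    'Duration': ['30days', '40days', '35days', '50days'],
    'Discount': [1000, 2300, 1200, 2000],
})

# Rows built from Series of different types are stored as generic objects,
# and transposing keeps that object dtype for every column.
via_transpose = pd.DataFrame([df.Courses, df.Fee]).transpose()

# Alternative: place the Series side by side, keeping each column's dtype.
via_concat = pd.concat([df['Courses'], df['Fee']], axis=1)

# Ask pandas to infer better dtypes for the object columns.
repaired = via_transpose.infer_objects()

print(via_transpose.dtypes)
print(via_concat.dtypes)
print(repaired.dtypes)
```

The dtype matters once you start computing on the column, because arithmetic and aggregation on object columns are usually slower than on numeric columns.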

Using DataFrame.iloc[] with DataFrame.copy() to Create New DataFrame

The DataFrame.iloc[] indexer selects rows and columns by integer position, starting at 0. In the example below, : selects all rows and [1, 2] selects the second and third columns (Fee and Duration); copy() then makes the result independent of the original DataFrame.


# Using DataFrame.iloc[] and df.copy() to create a new DataFrame.
df2 = df.iloc[:, [1, 2]].copy()
print(df2)

This yields the output below.


# Output:
     Fee Duration
0  20000   30days
1  25000   40days
2  22000   35days
3  30000   50days
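
Positions can also be given as a slice, or looked up from column names when you would rather not hard-code them. The sketch below shows both variants and checks that they match the list-based selection; the variable names are illustrative only.

```python
import pandas as pd

# Rebuild the sample DataFrame used throughout this article.
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Python", "pandas"],
    'Fee': [20000, 25000, 22000, 30000],
    'Duration': ['30days', '40days', '35days', '50days'],
    'Discount': [1000, 2300, 1200, 2000],
})

# A list of positions, as in the example above.
by_list = df.iloc[:, [1, 2]].copy()

# A positional slice; the end position (3) is excluded.
by_slice = df.iloc[:, 1:3].copy()

# Look up the positions from the column names with Index.get_loc().
positions = [df.columns.get_loc(name) for name in ['Fee', 'Duration']]
by_lookup = df.iloc[:, positions].copy()

print(positions)
print(by_list.equals(by_slice))
print(by_list.equals(by_lookup))
```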

Using DataFrame.loc[] to Create New DataFrame by Specific Column

The DataFrame.loc[] indexer accesses a group of rows and columns by label(s) or by a boolean array. In the example below, df.columns.drop() returns the column labels without the unwanted ones (Courses and Discount), and loc[] selects the remaining columns.


# Using DataFrame.loc[] to create a new DataFrame by specific column.
df2 = df.loc[:, df.columns.drop(['Courses', 'Discount'])]
print(df2)

This yields the output below.


# Output:
     Fee Duration
0  20000   30days
1  25000   40days
2  22000   35days
3  30000   50days
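
The same two columns can be reached through loc[] in several other ways. The sketch below, offered as a comparison rather than a recommendation, uses a list of labels, a label slice, and a boolean mask, and also shows DataFrame.drop(columns=...) as a shorter equivalent of the example above.

```python
import pandas as pd

# Rebuild the sample DataFrame used throughout this article.
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Python", "pandas"],
    'Fee': [20000, 25000, 22000, 30000],
    'Duration': ['30days', '40days', '35days', '50days'],
    'Discount': [1000, 2300, 1200, 2000],
})

# The approach from the example above, for reference.
reference = df.loc[:, df.columns.drop(['Courses', 'Discount'])]

# An explicit list of labels to keep.
by_labels = df.loc[:, ['Fee', 'Duration']]

# A label slice; unlike positional slices, both ends are included.
by_slice = df.loc[:, 'Fee':'Duration']

# A boolean mask: True for every column that is not in the unwanted list.
mask = ~df.columns.isin(['Courses', 'Discount'])
by_mask = df.loc[:, mask]

# Dropping the unwanted columns directly gives the same result.
by_drop = df.drop(columns=['Courses', 'Discount'])

print(by_labels.equals(reference))
print(by_slice.equals(reference))
print(by_mask.equals(reference))
print(by_drop.equals(reference))
```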

Create New DataFrame of Specific Column by DataFrame.assign()

You can also create a new DataFrame from specific columns by using the DataFrame.assign() method. The assign() method assigns new columns to a DataFrame and returns a new object (a copy) containing the original columns plus the new ones. Calling it on an empty DataFrame, as below, returns a DataFrame that contains only the assigned columns.


# Create a new DataFrame of specific columns by DataFrame.assign() method.
df2 = pd.DataFrame().assign(Courses=df['Courses'], Duration=df['Duration'])
print(df2)

This yields the output below.


# Output:
   Courses Duration
0    Spark   30days
1  PySpark   40days
2   Python   35days
3   pandas   50days
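
Because each keyword passed to assign() becomes a column name, the same call can rename columns or add a computed one while building the new DataFrame. The sketch below assumes the illustrative names Course and NetFee, which do not appear in the original data.

```python
import pandas as pd

# Rebuild the sample DataFrame used throughout this article.
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Python", "pandas"],
    'Fee': [20000, 25000, 22000, 30000],
    'Duration': ['30days', '40days', '35days', '50days'],
    'Discount': [1000, 2300, 1200, 2000],
})

# 'Course' and 'NetFee' are hypothetical column names chosen for this sketch.
df2 = pd.DataFrame().assign(
    Course=df['Courses'],               # same data under a new name
    NetFee=df['Fee'] - df['Discount'],  # computed from two source columns
)

print(df2)
print(list(df.columns))  # the original DataFrame keeps its own columns
```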

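Other Example

Another simple way to create a new pandas DataFrame from selected columns is to pass a list of column names inside the [] operator. If you plan to modify the result, append .copy() as shown earlier.


# Create a new pandas DataFrame with a list of column names.
df2 = df[['Courses', 'Fee']]
print(df2)

This yields the output below.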


# Output:
   Courses    Fee
0    Spark  20000
1  PySpark  25000
2   Python  22000
3   pandas  30000
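
When only one column is needed, the number of brackets decides the type of the result. The short sketch below shows that single brackets return a Series, while double brackets or Series.to_frame() return a one-column DataFrame.

```python
import pandas as pd

# Rebuild the sample DataFrame used throughout this article.
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Python", "pandas"],
    'Fee': [20000, 25000, 22000, 30000],
    'Duration': ['30days', '40days', '35days', '50days'],
    'Discount': [1000, 2300, 1200, 2000],
})

# Single brackets with one label return a Series.
as_series = df['Fee']

# Double brackets (a list with one label) return a one-column DataFrame.
as_frame = df[['Fee']]

# A Series can also be converted to a DataFrame explicitly.
also_frame = df['Fee'].to_frame()

print(type(as_series).__name__)
print(type(as_frame).__name__)
print(as_frame.shape)
```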

Complete Example to Create New Pandas DataFrame of Specified Columns

Below is the complete example of creating a new pandas DataFrame by selecting specific columns.


# Create a Pandas DataFrame.
import pandas as pd
technologies = {
    'Courses': ["Spark", "PySpark", "Python", "pandas"],
    'Fee': [20000, 25000, 22000, 30000],
    'Duration': ['30days', '40days', '35days', '50days'],
    'Discount': [1000, 2300, 1200, 2000]
}
df = pd.DataFrame(technologies)
print(df)

# Using DataFrame.copy() to create a new DataFrame.
df2 = df[['Courses', 'Fee']].copy()
print(df2)

# Using DataFrame.filter() method.
df2 = df.filter(['Courses', 'Fee'], axis=1)
print(df2)

# Using DataFrame.transpose() Method.
df2 = pd.DataFrame([df.Courses, df.Fee]).transpose()
print(df2)

# Using DataFrame.iloc[] and df.copy() to create a new DataFrame.
df2 = df.iloc[:, [1, 2]].copy()
print(df2)

# Using DataFrame.loc[] to create a new DataFrame by specific column.
df2 = df.loc[:, df.columns.drop(['Courses', 'Discount'])]
print(df2)

# Create a new DataFrame of specific columns by DataFrame.assign() method.
df2 = pd.DataFrame().assign(Courses=df['Courses'], Duration=df['Duration'])
print(df2)

# Create a new pandas DataFrame with a list of column names.
df2 = df[['Courses', 'Fee']]
print(df2)

Conclusion

In this article, I have explained how to create a new pandas DataFrame by selecting specific columns using methods such as DataFrame.copy(), DataFrame.filter(), DataFrame.transpose(), and DataFrame.assign(). Additionally, we explored using the DataFrame.iloc[] and DataFrame.loc[] indexers to select single or multiple columns from a pandas DataFrame.
