To create a new DataFrame by selecting specific columns from an existing pandas DataFrame, you can use DataFrame.copy(), DataFrame.filter(), DataFrame.transpose(), or DataFrame.assign(). The DataFrame.iloc[] and DataFrame.loc[] indexers can also select columns. In this article, I will explain how to select a single column or multiple columns to create a new pandas DataFrame, with detailed examples.
Quick Examples to Create New DataFrame by Selecting Specific Columns
Following are quick examples of creating a new DataFrame by selecting specific columns.
# Quick examples to create new dataframe
# Using DataFrame.copy() to create a new DataFrame
df2 = df[['Courses', 'Fee']].copy()
# Using DataFrame.filter() method
df2 = df.filter(['Courses','Fee'], axis=1)
# Using DataFrame.transpose() method
df2 = pd.DataFrame([df.Courses, df.Fee]).transpose()
# Using DataFrame.iloc[] with copy() to create a new DataFrame
df2 = df.iloc[: , [1, 2]].copy()
# Using DataFrame.loc[] create new DataFrame by specific column
df2=df.loc[:, df.columns.drop(['Courses', 'Discount'])]
# Create new dataframe of Specific column by DataFrame.assign() method
df2 = pd.DataFrame().assign(Courses=df['Courses'], Duration=df['Duration'])
# Create new pandas DataFrame
df2 = df[['Courses','Fee']]
To run some examples of creating a new Pandas DataFrame by selecting specific columns, let’s create a Pandas DataFrame using data from a dictionary.
# Create a Pandas DataFrame.
import pandas as pd
technologies = {
'Courses':["Spark","PySpark","Python","pandas"],
'Fee' :[20000,25000,22000,30000],
'Duration':['30days','40days','35days','50days'],
'Discount':[1000,2300,1200,2000]
}
df = pd.DataFrame(technologies)
print(df)
Yields below output.
# Output:
   Courses    Fee Duration  Discount
0    Spark  20000   30days      1000
1  PySpark  25000   40days      2300
2   Python  22000   35days      1200
3   pandas  30000   50days      2000
Using DataFrame.copy() Create New DataFrame
The pandas.DataFrame.copy() function returns a copy of the DataFrame. Select the columns you want from the original DataFrame, then call copy() on the result to create a new, independent DataFrame.
# Using DataFrame.copy() to create a new DataFrame.
df2 = df[['Courses', 'Fee']].copy()
print(df2)
Yields below output.
# Output:
   Courses    Fee
0    Spark  20000
1  PySpark  25000
2   Python  22000
3   pandas  30000
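To see what the explicit copy() gives you, the short sketch below changes a value in the new DataFrame and checks that the original is left untouched. It rebuilds a smaller version of the example DataFrame so that it runs on its own; the replacement value 99999 is arbitrary and used only for illustration.

```python
import pandas as pd

# Smaller version of the article's example data, rebuilt so the snippet is standalone.
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Python", "pandas"],
    'Fee': [20000, 25000, 22000, 30000],
    'Discount': [1000, 2300, 1200, 2000],
})

# Select two columns and take an explicit, independent copy.
df2 = df[['Courses', 'Fee']].copy()

# Change a value in the copy only (99999 is an arbitrary illustrative value).
df2.loc[0, 'Fee'] = 99999

print(df2.loc[0, 'Fee'])  # value stored in the copy
print(df.loc[0, 'Fee'])   # value in the original DataFrame
```

Calling copy() makes the intent explicit: later edits to df2 are meant to stay in df2.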
Alternatively, you can use the DataFrame.filter() method, which returns a new DataFrame containing only the specified columns.
# Using DataFrame.filter() method.
df2 = df.filter(['Courses','Fee'], axis=1)
print(df2)
This yields the same output as the copy() example above.
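Besides a list of exact names, filter() also accepts the like and regex parameters for matching column labels. The sketch below is a minimal illustration on the same example data; the patterns 'Dis' and '^(Fee|Duration)$' and the name 'NoSuchColumn' are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Python", "pandas"],
    'Fee': [20000, 25000, 22000, 30000],
    'Duration': ['30days', '40days', '35days', '50days'],
    'Discount': [1000, 2300, 1200, 2000],
})

# items: keep columns whose names appear in the list.
by_items = df.filter(items=['Courses', 'Fee'], axis=1)

# like: keep columns whose name contains the given substring.
by_like = df.filter(like='Dis', axis=1)

# regex: keep columns whose name matches the regular expression.
by_regex = df.filter(regex='^(Fee|Duration)$', axis=1)

# Names that do not exist are skipped rather than raising an error.
by_missing = df.filter(items=['Courses', 'NoSuchColumn'], axis=1)

print(list(by_items.columns))
print(list(by_like.columns))
print(list(by_regex.columns))
print(list(by_missing.columns))
```

Because missing names are skipped silently, df[['Courses', 'Fee']] may be the safer choice when a misspelled column name should fail loudly.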
Using DataFrame.transpose() Method
The DataFrame.transpose() method swaps the index and the columns, so rows become columns and vice versa. Use df.columnname to select each column as a Series, pass the list of Series to the DataFrame constructor (where each Series becomes a row), and then transpose the result so that they become columns again.
# Using DataFrame.transpose() Method.
df2 = pd.DataFrame([df.Courses, df.Fee]).transpose()
print(df2)
Yields below output.
# Output:
   Courses    Fee
0    Spark  20000
1  PySpark  25000
2   Python  22000
3   pandas  30000
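One thing to check with the transpose approach is the column dtypes: because the intermediate DataFrame holds text and numbers in the same columns, the numeric Fee column comes back with object dtype. The sketch below compares it with pd.concat(), which is not used elsewhere in this article and is shown here only as one possible alternative that keeps the original dtypes.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Python", "pandas"],
    'Fee': [20000, 25000, 22000, 30000],
})

# Approach from the article: build rows from Series, then transpose.
via_transpose = pd.DataFrame([df.Courses, df.Fee]).transpose()

# Alternative (an assumption of this sketch, not the article's method):
# place the Series side by side as columns.
via_concat = pd.concat([df['Courses'], df['Fee']], axis=1)

print(via_transpose.dtypes)  # Fee is stored with object dtype here
print(via_concat.dtypes)     # Fee keeps its integer dtype here
```

If the numeric dtype matters for later calculations, it can be restored on the transposed result with astype(), for example via_transpose['Fee'].astype('int64').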
Using DataFrame.iloc[] Create New DataFrame by DataFrame.copy()
The DataFrame.iloc[] indexer gets or sets values by integer position. It accepts a row position and a column position, so df.iloc[:, [1, 2]] selects all rows and the columns at positions 1 and 2 (positions start at 0).
# Using DataFrame.iloc[] create new DataFrame by df.copy().
df2 = df.iloc[: , [1, 2]].copy()
print(df2)
Yields below output.
# Output:
     Fee Duration
0  20000   30days
1  25000   40days
2  22000   35days
3  30000   50days
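Positions can also be given as a slice, or looked up from column names when you would rather not hard-code numbers. The following is a minimal sketch on the same example data; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Python", "pandas"],
    'Fee': [20000, 25000, 22000, 30000],
    'Duration': ['30days', '40days', '35days', '50days'],
    'Discount': [1000, 2300, 1200, 2000],
})

# Slice of positions: columns 1 and 2 (the end position 3 is excluded).
by_slice = df.iloc[:, 1:3].copy()

# Look up the integer positions from the column names, then select by position.
positions = [df.columns.get_loc(name) for name in ['Fee', 'Duration']]
by_lookup = df.iloc[:, positions].copy()

# Negative positions count from the end: -1 is the last column.
last_col = df.iloc[:, [-1]].copy()

print(list(by_slice.columns))
print(positions)
print(list(last_col.columns))
```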
Using DataFrame.loc[] Create New DataFrame by Specific Column
The DataFrame.loc[] indexer accesses a group of rows and columns by label(s) or with a boolean array. In the example below, df.columns.drop() removes the unwanted labels from the column index, and .loc[] selects the columns that remain.
# Using DataFrame.loc[] create new DataFrame by specific column.
df2=df.loc[:, df.columns.drop(['Courses', 'Discount'])]
print(df2)
Yields below output.
# Output:
     Fee Duration
0  20000   30days
1  25000   40days
2  22000   35days
3  30000   50days
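The same two columns can be reached with .loc[] in several other ways: a list of labels, a label slice, or a boolean array. The sketch below shows the three forms side by side, assuming the example DataFrame from this article.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Python", "pandas"],
    'Fee': [20000, 25000, 22000, 30000],
    'Duration': ['30days', '40days', '35days', '50days'],
    'Discount': [1000, 2300, 1200, 2000],
})

# A list of labels selects exactly those columns.
by_list = df.loc[:, ['Fee', 'Duration']]

# A label slice includes both end points, unlike a positional slice.
by_slice = df.loc[:, 'Fee':'Duration']

# A boolean array with one entry per column keeps the columns marked True.
mask = ~df.columns.isin(['Courses', 'Discount'])
by_mask = df.loc[:, mask]

print(list(by_list.columns))
print(list(by_slice.columns))
print(list(by_mask.columns))
```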
Create New DataFrame of Specific Column by DataFrame.assign()
You can also build a new DataFrame from specific columns with the DataFrame.assign() method. assign() adds new columns to a DataFrame and returns a new object (a copy) that contains the original columns plus the new ones. Calling it on an empty DataFrame, as below, produces a DataFrame that holds only the assigned columns.
# Create New DataFrame of Specific column by DataFrame.assign() method.
df2 = pd.DataFrame().assign(Courses=df['Courses'], Duration=df['Duration'])
print(df2)
Yields below output.
# Output:
   Courses Duration
0    Spark   30days
1  PySpark   40days
2   Python   35days
3   pandas   50days
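Because assign() returns a new object, it can also be called on a column selection without changing the DataFrame it was called on. The sketch below is a small illustration of that behavior; the variable name base is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Python", "pandas"],
    'Fee': [20000, 25000, 22000, 30000],
    'Duration': ['30days', '40days', '35days', '50days'],
})

# Start from a one-column selection and add a second column with assign().
base = df[['Courses']]
df2 = base.assign(Duration=df['Duration'])

print(list(df2.columns))   # columns of the new DataFrame
print(list(base.columns))  # the selection that assign() was called on
print(list(df.columns))    # the original DataFrame
```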
Other Example
Another simple way to create a new pandas DataFrame is to select the columns with a list of column names.
# Create new pandas DataFrame.
df2 = df[['Courses','Fee']]
print(df2)
Yields below output.
# Output:
   Courses    Fee
0    Spark  20000
1  PySpark  25000
2   Python  22000
3   pandas  30000
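When selecting a single column, the number of brackets decides what you get back: single brackets return a Series, while a one-element list returns a DataFrame. A minimal sketch of the difference, using the Courses column from the example data:

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Python", "pandas"],
    'Fee': [20000, 25000, 22000, 30000],
})

# Single brackets: the result is a Series.
as_series = df['Courses']

# Double brackets (a list with one name): the result is a one-column DataFrame.
as_frame = df[['Courses']]

# A Series can also be converted to a DataFrame explicitly.
converted = df['Courses'].to_frame()

print(type(as_series).__name__)
print(type(as_frame).__name__)
print(type(converted).__name__)
```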
Complete Examples To Create New Pandas DataFrame of Specified Column
Below is the complete example of creating a new pandas DataFrame by selecting specific columns.
# Create a Pandas DataFrame.
import pandas as pd
technologies = {
'Courses':["Spark","PySpark","Python","pandas"],
'Fee' :[20000,25000,22000,30000],
'Duration':['30days','40days','35days','50days'],
'Discount':[1000,2300,1200,2000]
}
df = pd.DataFrame(technologies)
print(df)
# Using DataFrame.copy() to create a new DataFrame.
df2 = df[['Courses', 'Fee']].copy()
print(df2)
# Using DataFrame.filter() method.
df2 = df.filter(['Courses','Fee'], axis=1)
print(df2)
# Using DataFrame.transpose() Method.
df2 = pd.DataFrame([df.Courses, df.Fee]).transpose()
print(df2)
# Using DataFrame.iloc[] create new DataFrame by df.copy().
df2 = df.iloc[: , [1, 2]].copy()
print(df2)
# Using DataFrame.loc[] create new DataFrame by specific column.
df2=df.loc[:, df.columns.drop(['Courses', 'Discount'])]
print(df2)
# Create New DataFrame of Specific column by DataFrame.assign() method.
df2 = pd.DataFrame().assign(Courses=df['Courses'], Duration=df['Duration'])
print(df2)
# Create new pandas DataFrame.
df2 = df[['Courses','Fee']]
print(df2)
Conclusion
In this article, I have explained how to create a new pandas DataFrame by selecting specific columns using DataFrame.copy(), DataFrame.filter(), DataFrame.transpose(), and DataFrame.assign(). Additionally, we explored using the DataFrame.iloc[] and DataFrame.loc[] indexers to select single or multiple columns from a pandas DataFrame.