If you have multiple Series and want to create a pandas DataFrame by adding each Series as a column, you can use the concat() method.
In pandas, a Series is a one-dimensional labeled array capable of holding any data type (integers, strings, floating-point numbers, Python objects, etc.). A Series stores data in sequential order. It holds one column of information, similar to a column in an Excel sheet or SQL table.
When you combine multiple pandas Series into a DataFrame as columns, the resulting DataFrame has as many columns as the number of Series you combine.
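As a quick illustration of the two points above, the following minimal sketch builds two small Series and checks the shape of the combined result. The variable names and values are chosen only for this example.

```python
import pandas as pd

# Two illustrative Series (names and values are assumptions for this sketch)
names = pd.Series(["Spark", "PySpark", "Hadoop"])
prices = pd.Series([22000, 25000, 23000])

# Each Series is one-dimensional and carries its own index
print(names.ndim)          # 1
print(list(names.index))   # [0, 1, 2]

# Combining two Series column-wise gives a two-column DataFrame
combined = pd.concat([names, prices], axis=1)
print(combined.shape)      # (3, 2)
```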
1. Create pandas DataFrame From Multiple Series
You can create a DataFrame from multiple Series objects by adding each Series as a column.
By using the concat() method you can merge multiple Series into a DataFrame. It takes several parameters; for our scenario we pass a list of the Series to combine and axis=1 to add the Series as columns instead of rows. Note that axis=0 appends the Series as rows instead of columns.
import pandas as pd
# Create pandas Series
courses = pd.Series(["Spark","PySpark","Hadoop"])
fees = pd.Series([22000,25000,23000])
discount = pd.Series([1000,2300,1000])
# Combine two Series as columns
df = pd.concat([courses, fees], axis=1)
# concat() also accepts more than two Series
df = pd.concat([courses, fees, discount], axis=1)
print("Create pandas DataFrame:\n", df)
Because these Series have no names, concat() labels the columns with integers (0, 1, 2). To get meaningful column labels, assign a name to each Series when you create it.
# Create Series by assigning names
courses = pd.Series(["Spark","PySpark","Hadoop"], name='courses')
fees = pd.Series([22000,25000,23000], name='fees')
discount = pd.Series([1000,2300,1000],name='discount')
df=pd.concat([courses,fees,discount],axis=1)
print(df)
Yields below output.
# Output:
   courses   fees  discount
0    Spark  22000      1000
1  PySpark  25000      2300
2   Hadoop  23000      1000
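To make the difference between axis=1 and axis=0 concrete, here is a small sketch that reuses the named Series from the example above and compares the type and shape of each result.

```python
import pandas as pd

courses = pd.Series(["Spark", "PySpark", "Hadoop"], name="courses")
fees = pd.Series([22000, 25000, 23000], name="fees")

# axis=1: the Series are placed side by side as columns
wide = pd.concat([courses, fees], axis=1)
print(type(wide).__name__, wide.shape)   # DataFrame (3, 2)

# axis=0: the Series are stacked end to end, so the result is still a Series
tall = pd.concat([courses, fees], axis=0)
print(type(tall).__name__, tall.shape)   # Series (6,)
```

With axis=0 the original index labels are repeated (0, 1, 2, 0, 1, 2) unless you also pass ignore_index=True.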
Let’s see how to assign an index to Series and provide custom column names to the DataFrame.
# Assign Index to Series
index_labels=['r1','r2','r3']
courses.index = index_labels
fees.index = index_labels
discount.index = index_labels
# Concat Series by Changing Names
df=pd.concat({'Courses': courses,
'Course_Fee': fees,
'Course_Discount': discount},axis=1)
print(df)
Yields below output.
# Output:
    Courses  Course_Fee  Course_Discount
r1    Spark       22000             1000
r2  PySpark       25000             2300
r3   Hadoop       23000             1000
Finally, let’s see how to reset the index using the reset_index() method. This moves the current index into a column named index and adds a new default integer index to the combined DataFrame.
# Change the index to a column & create new index
df = df.reset_index()
print(df)
Yields below output.
# Output:
  index  Courses  Course_Fee  Course_Discount
0    r1    Spark       22000             1000
1    r2  PySpark       25000             2300
2    r3   Hadoop       23000             1000
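If you do not want to keep the old labels as a column, reset_index() also accepts drop=True. The sketch below compares both variants on a small DataFrame built only for this illustration.

```python
import pandas as pd

# Small DataFrame with custom row labels (data is illustrative)
df = pd.DataFrame(
    {"Courses": ["Spark", "PySpark", "Hadoop"],
     "Course_Fee": [22000, 25000, 23000]},
    index=["r1", "r2", "r3"],
)

kept = df.reset_index()               # old labels move into an 'index' column
dropped = df.reset_index(drop=True)   # old labels are discarded

print(list(kept.columns))      # ['index', 'Courses', 'Course_Fee']
print(list(dropped.columns))   # ['Courses', 'Course_Fee']
print(list(dropped.index))     # [0, 1, 2]
```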
Frequently Asked Questions on Creating a DataFrame From Multiple Series
In Pandas, a DataFrame is a two-dimensional, tabular data structure with labeled axes (rows and columns). It is similar to a spreadsheet or SQL table and is a powerful tool for data manipulation and analysis.
To create a DataFrame from multiple Series in pandas, you can also use the pd.DataFrame constructor, passing a dictionary that maps column names to Series.
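A minimal sketch of the constructor approach, assuming a dictionary that maps column names to Series. The names Column1 and Column2 are placeholders.

```python
import pandas as pd

courses = pd.Series(["Spark", "PySpark", "Hadoop"])
fees = pd.Series([22000, 25000, 23000])

# Dictionary keys become the column names; each Series becomes one column
df = pd.DataFrame({"Column1": courses, "Column2": fees})
print(df)
```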
The Series used to create a DataFrame do not need to have the same length. Both pd.concat() with axis=1 and the pd.DataFrame constructor align Series on their index labels, so labels missing from a shorter Series are filled with NaN. A ValueError is raised when you pass plain lists or arrays of different lengths instead of Series. Each Series is treated as a column in the DataFrame.
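The following sketch checks how pandas handles inputs of different lengths. In this example the shorter Series is padded with NaN after index alignment, while plain Python lists of unequal length raise a ValueError. The variable names are illustrative.

```python
import pandas as pd

short_s = pd.Series([1, 2])
long_s = pd.Series([10, 20, 30])

# Series are aligned on index labels; the missing label 2 becomes NaN
df = pd.DataFrame({"short": short_s, "long": long_s})
print(df.shape)                        # (3, 2)
print(int(df["short"].isna().sum()))   # 1

# Plain lists carry no index to align on, so unequal lengths fail
try:
    pd.DataFrame({"short": [1, 2], "long": [10, 20, 30]})
    raised = False
except ValueError:
    raised = True
print(raised)                          # True
```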
You can add more Series to an existing DataFrame by assigning a Series to a new column name using square bracket notation, for example df['Column3'] = new_series. If df already has the columns 'Column1' and 'Column2', the resulting DataFrame will have three columns: 'Column1', 'Column2', and 'Column3'. The values of new_series are aligned with the DataFrame on index labels.
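A short sketch of adding a Series as a new column with square bracket notation. The DataFrame contents and the values in new_series are made up for the example.

```python
import pandas as pd

# Existing DataFrame with two columns (illustrative data)
df = pd.DataFrame({"Column1": [1, 2, 3], "Column2": ["a", "b", "c"]})

# New Series with the same default index 0..2
new_series = pd.Series([10.5, 20.5, 30.5])

# Assigning to a new column name adds the Series as a column
df["Column3"] = new_series
print(list(df.columns))   # ['Column1', 'Column2', 'Column3']
```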
You can specify custom column names when creating a DataFrame from multiple Series. Instead of relying on the default labels or the Series names, use your own column names as the keys of the dictionary passed to the pd.DataFrame constructor.
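As a small check of this behavior, the sketch below passes named Series under different dictionary keys. The column labels come from the keys, not from the Series names.

```python
import pandas as pd

courses = pd.Series(["Spark", "PySpark", "Hadoop"], name="courses")
fees = pd.Series([22000, 25000, 23000], name="fees")

# The dictionary keys are used as column labels; the Series names are ignored
df = pd.DataFrame({"Courses": courses, "Course_Fee": fees})
print(list(df.columns))   # ['Courses', 'Course_Fee']
```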
You can set the index for the DataFrame with the index parameter of the pd.DataFrame constructor, for example a list of custom labels such as ['row1', 'row2', 'row3']. Be aware that when the inputs are Series, pandas aligns them to the requested labels rather than relabeling the rows, so labels that do not already exist in the Series produce NaN values. To relabel the rows, assign the labels to each Series (or to df.index) instead, as done earlier with index_labels.
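The sketch below contrasts three ways of supplying custom row labels. The labels and column names are illustrative. Passing index= together with Series that carry the default 0..2 index reindexes them, which in this example leaves every value as NaN.

```python
import pandas as pd

labels = ["row1", "row2", "row3"]
courses = pd.Series(["Spark", "PySpark", "Hadoop"])
fees = pd.Series([22000, 25000, 23000])

# Option 1: pass the raw values, so index= simply labels the rows
df1 = pd.DataFrame(
    {"Courses": courses.to_numpy(), "Fee": fees.to_numpy()},
    index=labels,
)

# Option 2: build the DataFrame first, then assign the new labels
df2 = pd.DataFrame({"Courses": courses, "Fee": fees})
df2.index = labels

# Pitfall: the Series have labels 0..2, so selecting 'row1'..'row3' finds nothing
df3 = pd.DataFrame({"Courses": courses, "Fee": fees}, index=labels)

print(df1["Fee"].tolist())            # [22000, 25000, 23000]
print(df2["Fee"].tolist())            # [22000, 25000, 23000]
print(bool(df3.isna().all().all()))   # True
```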
Conclusion
In this article, you have learned how to create a DataFrame from multiple pandas Series objects, where each Series becomes a column of the DataFrame. You have also learned how to change the column names while creating the DataFrame and how to reset the index.
Happy Learning !!