Pandas Empty DataFrame with Column Names & Types

  • Post category:Pandas
  • Post last modified:August 10, 2022

Sometimes you need to create an empty DataFrame with column names and specific types in pandas. In this article, I will explain how to do this with several examples. In my last article, I explained different ways to create a pandas DataFrame.

1. Quick Examples

If you are in a hurry, below are quick examples.


import pandas as pd
import numpy as np

# Create empty DataFrame with specific column names & types
df = pd.DataFrame({'Courses': pd.Series(dtype='str'),
                   'Fee': pd.Series(dtype='int'),
                   'Duration': pd.Series(dtype='str'),
                   'Discount': pd.Series(dtype='float')})

# Using NumPy: an empty structured array carries the names and types
dtypes = np.dtype(
    [
        ("Courses", str),
        ("Fee", int),
        ("Duration", str),
        ("Discount", float),
        ("date", "datetime64[ns]"),  # explicit unit instead of bare np.datetime64
    ]
)
df = pd.DataFrame(np.empty(0, dtype=dtypes))

2. Pandas Empty DataFrame with Column Names & Types

You can assign column names and data types to an empty DataFrame in pandas either at creation time or by updating an existing DataFrame.

Note that when you create an empty pandas DataFrame with only column names, every column gets the object dtype by default.


import pandas as pd

# Create empty DataFrame
df = pd.DataFrame(columns = ["Courses", "Fee", "Duration","Discount"])
print(df)

# Outputs
#Empty DataFrame
#Columns: [Courses, Fee, Duration, Discount]
#Index: []

print(df.dtypes)

# Outputs
#Courses     object
#Fee         object
#Duration    object
#Discount    object
#dtype: object
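
To update an existing empty DataFrame instead of setting types at creation, one option is DataFrame.astype with a dict that maps column names to dtypes. The following is a minimal sketch under that assumption; the fixed-width names int64 and float64 are my choice, picked so the result does not depend on the platform.

```python
import pandas as pd

# Start from an empty DataFrame whose columns all default to object
df = pd.DataFrame(columns=["Courses", "Fee", "Duration", "Discount"])

# Convert only the numeric columns; the others keep their default dtype
df = df.astype({"Fee": "int64", "Discount": "float64"})
print(df.dtypes)
```

Columns that are not listed in the dict are left unchanged, so the string columns need no entry.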

To assign column types, pass a dict whose keys are the column names and whose values are empty Series of the desired dtype. In the example below, Fee is an int, Discount is a float, and the rest are strings. Note that pandas has traditionally stored strings using the object dtype, which is why the string columns are reported as object.


#Create empty DataFrame with specific column types
df = pd.DataFrame({'Courses': pd.Series(dtype='str'),
                   'Fee': pd.Series(dtype='int'),
                   'Duration': pd.Series(dtype='str'),
                   'Discount': pd.Series(dtype='float')})
print(df.dtypes)
# Outputs
# (Fee shows int64 on most platforms; int32 appears on Windows with NumPy older than 2.0)
#Courses      object
#Fee           int32
#Duration     object
#Discount    float64
#dtype: object
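
When the list of columns grows, the same idea can be driven from a plain dict of column name to dtype. This is only a sketch: the helper name empty_frame and the schema dict are hypothetical names I introduce for illustration, not part of pandas.

```python
import pandas as pd

def empty_frame(schema):
    """Return an empty DataFrame with one typed column per schema entry."""
    # Build one empty Series per column, each with the requested dtype
    return pd.DataFrame(
        {name: pd.Series(dtype=dtype) for name, dtype in schema.items()}
    )

# Hypothetical schema; fixed-width dtypes avoid platform-dependent int sizes
schema = {
    "Courses": "object",
    "Fee": "int64",
    "Duration": "object",
    "Discount": "float64",
}
df = empty_frame(schema)
print(df.dtypes)
```

Keeping the schema in one dict makes it easy to reuse the same column definitions wherever an empty DataFrame is needed.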

3. Using NumPy

If you already work with NumPy, you can build the empty DataFrame from an empty structured array whose dtype lists the column names and types. NumPy is a Python library for scientific computing, and its core is the ndarray object: an n-dimensional array of homogeneous data, with many operations performed in compiled code for performance.


import pandas as pd
import numpy as np
dtypes = np.dtype(
    [
        ("Courses", str),
        ("Fee", int),
        ("Duration", str),
        ("Discount", float),
        ("date", "datetime64[ns]"),  # explicit unit instead of bare np.datetime64
    ]
)
df = pd.DataFrame(np.empty(0, dtype=dtypes))
print(df.dtypes)

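This yields the same dtypes as above for the first four columns (the string columns are again reported as object), plus an additional datetime column named date.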
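
To check that the two approaches agree, you can compare the resulting dtypes directly. The sketch below is my own illustration and is restricted to the two numeric columns, with fixed-width dtypes assumed so the comparison does not depend on the platform.

```python
import numpy as np
import pandas as pd

# Approach from section 2: one empty Series per column
df_series = pd.DataFrame({
    "Fee": pd.Series(dtype="int64"),
    "Discount": pd.Series(dtype="float64"),
})

# Approach from section 3: an empty structured NumPy array
dtypes = np.dtype([("Fee", "int64"), ("Discount", "float64")])
df_numpy = pd.DataFrame(np.empty(0, dtype=dtypes))

# Compare the column-to-dtype mappings of both DataFrames
same = df_series.dtypes.to_dict() == df_numpy.dtypes.to_dict()
print(same)
```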

Conclusion

In summary, you have learned how to create an empty DataFrame with column names and specific data types. If you do not assign types, pandas gives every column the object dtype by default.

Happy Learning !!

NNK