To convert a NumPy array to a Pandas DataFrame, pass the array to the pd.DataFrame constructor provided by the Pandas library. The same conversion can also be written in several other ways. In this article, I will explain how to convert a NumPy array to a Pandas DataFrame with examples.

## 1. Quick Examples to Convert NumPy Array to DataFrame

If you are in a hurry, below are some quick examples of how to convert the NumPy array to DataFrame.

# Quick examples to convert numpy array to dataframe
import numpy as np
import pandas as pd

# Example 1: Convert 2-dimensional NumPy array
array = np.array([['Spark', 20000, 1000], ['PySpark', 25000, 2300], ['Python', 22000, 1200]])
df = pd.DataFrame({'Course': array[:, 0], 'Fee': array[:, 1], 'Discount': array[:, 2]})

# Example 2: Convert array to DataFrame
# Using from_records()
array = np.arange(6).reshape(2, 3)
df = pd.DataFrame.from_records(array)

# Example 3: Convert multiple arrays to DataFrame columns
arr1 = np.arange(start=1, stop=10, step=1)
arr2 = np.random.rand(9)
df = pd.DataFrame()
df['col1'] = arr1
df['col2'] = arr2

# Example 4: Convert using the transpose() method
array = np.array([['Courses', 'Fee'], ['Spark', 'PySpark'], [20000, 25000]])
df = pd.DataFrame(i for i in array).transpose()
df.drop(0, axis=1, inplace=True)
df.columns = array[0]

## 2. What is a NumPy Array?

A NumPy array is a data structure that holds values of the same type (usually numbers), similar to a list. Arrays are more efficient than Python lists and also much more compact. The reverse conversion is also possible: you can convert a Series or a Pandas DataFrame to a NumPy array when you need the data as an array.

Since this article is about converting a NumPy array to a DataFrame, let's create a NumPy array using the np.array() function and then convert it to a DataFrame.

import numpy as np

# Create a 2 dimensional numpy array
array = np.array([['Spark', 20000, 1000], ['PySpark', 25000, 2300], ['Python', 22000, 1200]])
print("Create numpy array:\n",array)
print(type(array))

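Yields below output.

# Output:
# Create numpy array:
#  [['Spark' '20000' '1000']
#  ['PySpark' '25000' '2300']
#  ['Python' '22000' '1200']]
# <class 'numpy.ndarray'>

Because the array mixes strings and integers, NumPy stores every element as a string.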
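To illustrate the "same type" rule, here is a small sketch that checks the dtype NumPy chooses. The variable names mixed and numbers are only for illustration.

```python
import numpy as np

# Mixing strings and integers forces NumPy to pick one common dtype,
# so the integers are stored as strings.
mixed = np.array([['Spark', 20000, 1000], ['PySpark', 25000, 2300]])
print(mixed.dtype.kind)    # 'U' means a Unicode string dtype
print(mixed[0, 1])         # the stored value is the string '20000'

# A purely numeric array keeps a numeric dtype.
numbers = np.array([[20000, 1000], [25000, 2300]])
print(numbers.dtype.kind)  # 'i' means a signed integer dtype
```

This matters later in the conversion: columns built from the mixed array hold strings until you cast them.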

## 3. What is a DataFrame?

A pandas DataFrame is a two-dimensional, size-mutable, heterogeneous tabular data structure with labeled axes (rows and columns). A DataFrame consists of three principal components: data, rows, and columns. Pandas is built on the NumPy library and written in Python, Cython, and C.
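As a rough illustration of these components, the sketch below builds a small DataFrame and inspects its row labels, column labels, and shape. The sample values are made up for the example.

```python
import pandas as pd

# A small DataFrame with one string column and one integer column.
df = pd.DataFrame({'Course': ['Spark', 'PySpark'], 'Fee': [20000, 25000]})

print(df.index.tolist())    # row labels: [0, 1]
print(df.columns.tolist())  # column labels: ['Course', 'Fee']
print(df.shape)             # (rows, columns): (2, 2)

# Assigning a new column modifies the existing DataFrame.
df['Discount'] = [1000, 2300]
print(df.columns.tolist())  # ['Course', 'Fee', 'Discount']
```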

## 4. Convert NumPy Array to Pandas DataFrame

Let's convert the above NumPy array to a DataFrame using the syntax below. Here, I use NumPy slicing to supply each column of the array to the DataFrame.

import pandas as pd

# Convert array to DataFrame
df = pd.DataFrame({'Course': array[:, 0], 'Fee': array[:, 1], 'Discount': array[:, 2]})
print(df)

Yields below output.

# Output:
#     Course    Fee  Discount
# 0    Spark  20000      1000
# 1  PySpark  25000      2300
# 2   Python  22000      1200
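The same conversion can also be written by passing the whole array to the constructor and naming the columns with the columns parameter. The sketch below additionally casts the numeric columns, on the assumption that you want Fee and Discount as integers rather than the strings NumPy stored.

```python
import numpy as np
import pandas as pd

array = np.array([['Spark', 20000, 1000], ['PySpark', 25000, 2300], ['Python', 22000, 1200]])

# Pass the whole 2-D array and name the columns in one call.
df = pd.DataFrame(array, columns=['Course', 'Fee', 'Discount'])

# The array holds strings, so cast the numeric columns explicitly.
df = df.astype({'Fee': 'int64', 'Discount': 'int64'})
print(df.dtypes)
```

Without the astype() step, arithmetic such as df['Fee'].sum() would concatenate strings instead of adding numbers.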

## 5. Convert Array to DataFrame using from_records()

In this example, I will create a 2-D NumPy array using the arange() and reshape() functions, and then convert it to a DataFrame using the from_records() function.

# Create a numpy array
array = np.arange(6).reshape(2, 3)
print(array)
# Output:
# [[0 1 2]
#  [3 4 5]]

Let's create a DataFrame from the array by passing it to the from_records() function. For example,

# Convert array to DataFrame
df = pd.DataFrame.from_records(array)
print(df)

# Output:
#    0  1  2
# 0  0  1  2
# 1  3  4  5
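If the default column labels 0, 1, 2 are not what you want, from_records() also accepts a columns parameter. A minimal sketch, with the labels A, B, and C chosen only for illustration:

```python
import numpy as np
import pandas as pd

array = np.arange(6).reshape(2, 3)

# columns= supplies labels in place of the default 0, 1, 2.
df = pd.DataFrame.from_records(array, columns=['A', 'B', 'C'])
print(df)

# Output:
#    A  B  C
# 0  0  1  2
# 1  3  4  5
```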

## 6. Convert Multiple Arrays to DataFrame

If you have two NumPy arrays and want to convert them to a DataFrame, assign each array as a column of the DataFrame. Here, I am using the arange() and random.rand() functions to create the arrays.

# Create two 1-D arrays
arr1 = np.arange(start=1, stop=10, step=1)
arr2 = np.random.rand(9)
print(arr1)
print(arr2)
print(arr1.shape)
print(arr2.shape)

Yields below output.

# Output:
# [1 2 3 4 5 6 7 8 9]
# [0.25903374 0.98651581 0.94674926 0.16608304 0.99794979 0.38816292
#  0.50690008 0.03360956 0.37643328]
# (9,)
# (9,)

Let's convert the arrays to a DataFrame. To do so, create an empty DataFrame and assign each array as one of its columns.

# Convert arrays to DataFrame
df = pd.DataFrame()
df['col1'] = arr1
df['col2'] = arr2

# Print the first five rows
print(df.head())

Yields below output.

# Output:
#    col1      col2
# 0     1  0.259034
# 1     2  0.986516
# 2     3  0.946749
# 3     4  0.166083
# 4     5  0.997950
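As an alternative to assigning columns one at a time, the arrays can be passed to the constructor as a dictionary. This sketch uses a seeded generator so the random column is reproducible; the seed value 0 is arbitrary and not part of the original example.

```python
import numpy as np
import pandas as pd

arr1 = np.arange(1, 10)

# A seeded generator keeps the random values the same on every run.
rng = np.random.default_rng(0)
arr2 = rng.random(9)

# Each dictionary key becomes a column name; the arrays must have equal length.
df = pd.DataFrame({'col1': arr1, 'col2': arr2})
print(df.shape)   # (9, 2)
```

Each column keeps the dtype of its source array, so col1 stays an integer column and col2 a float column.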

## 7. Convert Array to DataFrame using transpose()

In this example, I will create a NumPy array using numpy.array() and then convert it to a DataFrame.

# Create an array
array = np.array([['Courses', 'Fee'], ['Spark', 'PySpark'], [20000, 25000]])
print(array)

# Output :
# [['Courses' 'Fee']
#  ['Spark' 'PySpark']
#  ['20000' '25000']]

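Let's convert the above array to a DataFrame using a generator expression and the DataFrame's transpose() method. Transposing turns each inner array into a column. The code then drops the column that holds the header names and uses the first inner array as the column labels. For example,

# Convert array to DataFrame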
df = pd.DataFrame(i for i in array).transpose()
df.drop(0, axis=1, inplace=True)
df.columns = array[0]
print(df)

# Output:
#    Courses    Fee
# 0    Spark  20000
# 1  PySpark  25000
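A shorter route to the same result, sketched here as an alternative and not as the article's original method, is to slice off the header row and transpose the remaining rows with NumPy before calling the constructor.

```python
import numpy as np
import pandas as pd

array = np.array([['Courses', 'Fee'], ['Spark', 'PySpark'], [20000, 25000]])

# Row 0 holds the headers; each remaining row holds one column of data,
# so transpose those rows to get one record per row.
df = pd.DataFrame(array[1:].T, columns=array[0])

# Values are strings after the NumPy conversion; cast Fee back to integers.
df['Fee'] = df['Fee'].astype(int)
print(df)
```

This avoids building the header column and dropping it again.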

## Frequently Asked Questions

Why would I want to convert a NumPy array to a Pandas DataFrame?

Pandas DataFrames provide a tabular data structure with labeled axes (rows and columns). Converting a NumPy array to a Pandas DataFrame allows you to leverage the additional features and functionality provided by Pandas, such as labeled indexing, convenient data manipulation methods, and seamless integration with other data analysis tools.

How can I convert a NumPy array to a Pandas DataFrame if I have custom column names and index labels?

You can convert a NumPy array to a Pandas DataFrame with custom column names and index labels using the pd.DataFrame constructor and providing the columns and index parameters.
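For instance, a minimal sketch with made-up labels (A, B, C for the columns and row1, row2 for the index) could look like this:

```python
import numpy as np
import pandas as pd

array = np.arange(6).reshape(2, 3)

# columns= names the columns, index= names the rows.
df = pd.DataFrame(array, columns=['A', 'B', 'C'], index=['row1', 'row2'])
print(df)

# Output:
#       A  B  C
# row1  0  1  2
# row2  3  4  5
```

The lengths of columns and index must match the array's shape, otherwise the constructor raises an error.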

Can I convert a NumPy array with mixed data types to a Pandas DataFrame?

Yes, a DataFrame can hold a different data type in each column. A NumPy array, however, has a single dtype, so an array that mixes strings and numbers is stored either as strings or with the object data type, and the resulting DataFrame columns inherit that type. Columns with the object data type are less efficient for numerical operations, so convert them to a numeric type where appropriate.
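One possible way to handle this, shown as a sketch, is to create the array with dtype=object so the numbers are not turned into strings, and then let infer_objects() pick a better type for each column.

```python
import numpy as np
import pandas as pd

# dtype=object keeps each element's original Python type instead of
# coercing everything to strings.
array = np.array([['Spark', 20000], ['PySpark', 25000]], dtype=object)
df = pd.DataFrame(array, columns=['Course', 'Fee'])
fee_before = df['Fee'].dtype    # object: the integers are stored as Python objects

# infer_objects() looks for a more specific dtype for object columns.
df = df.infer_objects()
fee_after = df['Fee'].dtype     # an integer dtype

print(fee_before, fee_after)
```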

Are there any performance considerations when converting large NumPy arrays to Pandas DataFrames?

Converting large arrays to DataFrames involves creating a new data structure, and depending on the size of your data, it may have performance implications. When working with large datasets, it’s advisable to consider memory usage and potential performance bottlenecks. Additionally, make sure to optimize your data types and use Pandas features efficiently to avoid unnecessary overhead.
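As a rough illustration of the data type advice, the sketch below measures the memory used by a DataFrame built from a float64 array and again after casting to float32. The array size is arbitrary, and whether the lost precision is acceptable depends on your data.

```python
import numpy as np
import pandas as pd

array = np.zeros((1000, 4))                      # float64 by default
df = pd.DataFrame(array, columns=list('abcd'))

# Bytes used by the column data (index excluded).
bytes_float64 = df.memory_usage(index=False).sum()

# Casting to float32 halves the footprint, at the cost of precision.
df32 = df.astype('float32')
bytes_float32 = df32.memory_usage(index=False).sum()

print(bytes_float64, bytes_float32)   # 32000 16000
```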

How can I access specific elements or slices in a Pandas DataFrame after converting from a NumPy array?

You can use various indexing and slicing methods provided by Pandas to access specific elements or slices. For example, you can use loc and iloc for label-based and integer-based indexing, respectively.
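A short sketch of both accessors, using a small DataFrame with illustrative labels:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(6).reshape(2, 3),
                  columns=['A', 'B', 'C'], index=['row1', 'row2'])

print(df.loc['row2', 'B'])   # label-based lookup: 4
print(df.iloc[1, 1])         # position-based lookup of the same cell: 4

# Slice all rows and the first two columns by position.
first_two = df.iloc[:, 0:2]
print(first_two)
```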

Can I convert a 1D NumPy array to a Pandas Series instead of a DataFrame?

Yes. Pass the 1D NumPy array to the pd.Series constructor, for example pd.Series(arr). The resulting Series has a default integer index. To use your own index labels, pass them to the index parameter.
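A minimal sketch of both forms, with the array values and labels chosen only for illustration:

```python
import numpy as np
import pandas as pd

arr = np.array([10, 20, 30])

series = pd.Series(arr)                           # default index 0, 1, 2
labeled = pd.Series(arr, index=['a', 'b', 'c'])   # custom index labels

print(series.index.tolist())   # [0, 1, 2]
print(labeled['b'])            # 20
```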

## Conclusion

In this article, I have explained how to convert a NumPy array to a Pandas DataFrame using various syntaxes, with examples.

Happy learning !!