Convert NumPy Array to Pandas DataFrame

A NumPy array can be converted to a Pandas DataFrame in several ways. In this article, I will explain how to convert a NumPy array to a Pandas DataFrame with examples.

1. Quick Examples to Convert NumPy Array to DataFrame

If you are in a hurry, below are some quick examples of how to convert a NumPy array to a DataFrame.


# Below are quick examples
import numpy as np
import pandas as pd

# Example 1: Convert 2-dimensional NumPy array
array = np.array([['Spark', 20000, 1000], ['PySpark', 25000, 2300], ['Python', 22000, 1200]])
df = pd.DataFrame({'Course': array[:, 0], 'Fee': array[:, 1], 'Discount': array[:, 2]})

# Example 2: Convert array to DataFrame using from_records()
array = np.arange(6).reshape(2, 3)
df = pd.DataFrame.from_records(array)

# Example 3: Convert multiple arrays to DataFrame
arr1 = np.arange(start = 1, stop = 10, step = 1).reshape(-1)
arr2 = np.random.rand(9).reshape(-1)
df = pd.DataFrame()
df['col1'] = arr1
df['col2'] = arr2

# Example 4: Convert using transpose() function
array = np.array([['Courses', 'Fee'], ['Spark', 'PySpark'], [20000, 25000]])
df = pd.DataFrame(i for i in array).transpose()
df.drop(0, axis=1, inplace=True)
df.columns = array[0]

2. What is NumPy Array?

A NumPy array is a data structure that holds values of the same type (usually numbers), similar to a list. Arrays are more efficient than Python lists and much more compact. Going the other way is also possible: you can convert a Pandas Series or a Pandas DataFrame to a NumPy array.

Since this article is about converting a NumPy array to a DataFrame, let’s create a NumPy array using the np.array() function and then convert it to a DataFrame. Because the rows below mix strings and numbers, NumPy stores every value as a string.
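As a minimal sketch of the two points above, the snippet below shows NumPy settling on a single type for mixed input, and the reverse conversion from Pandas objects back to arrays with to_numpy(). The variable names (mixed, ser, frame) are made up for illustration.

```python
import numpy as np
import pandas as pd

# A NumPy array has one dtype; mixing ints and a float upcasts to float64
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)  # float64

# Reverse direction: Pandas objects back to NumPy arrays
ser = pd.Series([10, 20, 30])
frame = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

ser_arr = ser.to_numpy()      # 1-D array with one entry per row
frame_arr = frame.to_numpy()  # 2-D array shaped (rows, columns)
print(ser_arr.shape, frame_arr.shape)
```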


# Create a 2-dimensional NumPy array
import numpy as np

array = np.array([['Spark', 20000, 1000], ['PySpark', 25000, 2300], ['Python', 22000, 1200]])
print(array)
print(type(array))

Yields below output.


# Output:
[['Spark' '20000' '1000']
 ['PySpark' '25000' '2300']
 ['Python' '22000' '1200']]
<class 'numpy.ndarray'>
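The quotes around the numbers in the output come from NumPy storing everything as strings. The sketch below checks the dtype and shows one possible alternative, dtype=object, which keeps the original Python values at the cost of NumPy's speed and compactness. The name obj_array is an assumption for illustration.

```python
import numpy as np

rows = [['Spark', 20000, 1000], ['PySpark', 25000, 2300], ['Python', 22000, 1200]]

# Default behaviour: strings and ints are coerced to a Unicode string dtype
array = np.array(rows)
print(array.dtype.kind)  # U

# Alternative: an object array keeps each element as its original Python type
obj_array = np.array(rows, dtype=object)
print(type(obj_array[0, 1]))  # <class 'int'>
```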

3. What is DataFrame?

A Pandas DataFrame is a two-dimensional, size-mutable, heterogeneous tabular data structure with labeled axes (rows and columns). A DataFrame consists of three principal components: data, rows, and columns. Pandas is built on the NumPy library and is written in Python, Cython, and C.
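To make those components concrete, here is a small sketch that builds a DataFrame and inspects its data types, row labels, and column labels. It also adds a column after creation, which works because a DataFrame can be modified in place. The name df_demo is made up for this illustration.

```python
import pandas as pd

df_demo = pd.DataFrame({'Course': ['Spark', 'PySpark'], 'Fee': [20000, 25000]})

print(df_demo.dtypes)   # each column has its own dtype (heterogeneous)
print(df_demo.index)    # row labels
print(df_demo.columns)  # column labels

# A DataFrame can grow after creation: add a new column
df_demo['Discount'] = [1000, 2300]
print(df_demo.shape)  # (2, 3)
```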

4. Convert NumPy Array to Pandas DataFrame

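Let’s convert the above NumPy array to a DataFrame using the syntax below. Here, I use NumPy slicing to pass each column of the array to the DataFrame under its own column name. Note that because the array stores strings, the Fee and Discount columns of the resulting DataFrame also hold strings rather than numbers.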


# Convert array to DataFrame
import pandas as pd

df = pd.DataFrame({'Course': array[:, 0], 'Fee': array[:, 1], 'Discount': array[:, 2]})
print(df)

Yields below output.


# Output:
    Course    Fee Discount
0    Spark  20000     1000
1  PySpark  25000     2300
2   Python  22000     1200
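A shorter route, shown here only as a sketch, is to pass the whole 2-D array to pd.DataFrame() together with a columns list. Since the values arrive as strings, the example also converts Fee and Discount to integers with astype(); whether you need that step depends on what you do with the data afterwards.

```python
import numpy as np
import pandas as pd

array = np.array([['Spark', 20000, 1000], ['PySpark', 25000, 2300], ['Python', 22000, 1200]])

# Pass the 2-D array directly and name the columns
df = pd.DataFrame(array, columns=['Course', 'Fee', 'Discount'])

# The values are strings at this point; convert the numeric columns
df = df.astype({'Fee': int, 'Discount': int})
print(df.dtypes)
print(df['Fee'].sum())  # 67000
```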

5. Convert Array to DataFrame using from_records()

In this example, I will create a 2-D NumPy array using the arange() and reshape() functions, and then convert it to a DataFrame using the from_records() function.


# Create a numpy array
array = np.arange(6).reshape(2, 3)
print(array)
# Output:
# [[0 1 2]
#  [3 4 5]]

Let’s create a DataFrame from the array by passing it to the from_records() function. Since no column names are given, the columns are labeled 0, 1, 2. For example,


# Convert array to DataFrame
df = pd.DataFrame.from_records(array)
print(df)

# Output:
#    0  1  2
# 0  0  1  2
# 1  3  4  5
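from_records() also accepts a columns argument, and it can read a NumPy structured array, taking the column names from the field names. The sketch below shows both; the column names and the records array are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Plain 2-D array: supply the column names explicitly
array = np.arange(6).reshape(2, 3)
df = pd.DataFrame.from_records(array, columns=['a', 'b', 'c'])
print(df)

# Structured array: field names become the column names
records = np.array(
    [('Spark', 20000), ('PySpark', 25000)],
    dtype=[('Course', 'U10'), ('Fee', 'i8')],
)
df_rec = pd.DataFrame.from_records(records)
print(df_rec)
```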

6. Convert Multiple Arrays to DataFrame

If you have two NumPy array objects and want to convert them to a DataFrame, assign each array as a column of the DataFrame. Here, I am using the arange() and random.rand() functions to create the arrays.


# Create two arrays
arr1  = np.arange(start = 1, stop = 10, step = 1).reshape(-1)
arr2 = np.random.rand(9).reshape(-1)
print(arr1)
print(arr2)
print(arr1.shape)
print(arr2.shape)

Yields below output.


# Output:
[1 2 3 4 5 6 7 8 9]
[0.25903374 0.98651581 0.94674926 0.16608304 0.99794979 0.38816292
 0.50690008 0.03360956 0.37643328]
(9,)
(9,)

Let’s convert the arrays to a DataFrame. To do this, create an empty DataFrame and assign each of the above arrays to it as a column. Both arrays must have the same length.


# Convert array to DataFrame
df = pd.DataFrame()
df['col1'] = arr1
df['col2'] = arr2
print(df.head())

Yields below output.


# Output:
   col1      col2
0     1  0.259034
1     2  0.986516
2     3  0.946749
3     4  0.166083
4     5  0.997950
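The same result can be built in one step by passing a dictionary of arrays to pd.DataFrame(). The sketch below does that and uses a seeded generator (np.random.default_rng, an assumption made here for reproducibility) instead of np.random.rand(). Each column keeps its own dtype this way.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)  # seeded so reruns give the same values
arr1 = np.arange(1, 10)              # integers 1 through 9
arr2 = rng.random(9)                 # nine floats in [0, 1)

# Each dictionary key becomes a column name
df = pd.DataFrame({'col1': arr1, 'col2': arr2})
print(df.dtypes)
print(df.head())
```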

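7. Convert Array to DataFrame Using transpose()

In this example, I will create a NumPy array using numpy.array(), where the first row holds the column names and each following row holds the values of one column. I will then convert this array to a DataFrame.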


# Create an array 
array = np.array([['Courses', 'Fee'], ['Spark', 'PySpark'], [20000, 25000]])
print(array)

# Output :
# [['Courses' 'Fee']
#  ['Spark' 'PySpark']
#  ['20000' '25000']]

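Let’s convert the above array to a DataFrame using a Python generator expression and the DataFrame.transpose() method. After transposing, the first column holds the header names, so it is dropped and its values are assigned as the column labels. For example,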


# Convert array to DataFrame
df = pd.DataFrame(i for i in array).transpose()
df.drop(0, axis=1, inplace=True)
df.columns = array[0]
print(df)

# Output:
#    Courses    Fee
# 0    Spark  20000
# 1  PySpark  25000
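A more direct variant, offered as a sketch and not as the only way, slices the header row off the array, transposes the remaining rows with .T, and passes the header as columns. The Fee column is then converted from strings to integers.

```python
import numpy as np
import pandas as pd

array = np.array([['Courses', 'Fee'], ['Spark', 'PySpark'], [20000, 25000]])

# array[0] holds the header; array[1:] holds one row per column of data
df = pd.DataFrame(array[1:].T, columns=array[0])

# Values were stored as strings, so convert Fee to integers
df['Fee'] = df['Fee'].astype(int)
print(df)
```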

8. Conclusion

In this article, I have explained how to convert a NumPy array to a Pandas DataFrame in several ways, with examples.

Happy learning!!

Vijetha

With 5 years of experience in technical writing, I have had the privilege to work with a diverse range of technologies like Python, Pandas, NumPy and R. During this time, I have consistently demonstrated my ability to grasp intricate technical details and transform them into comprehensible materials.
