We can convert a NumPy array to a Pandas DataFrame in several ways. In this article, I will explain how to convert a NumPy array to a Pandas DataFrame with examples.
1. Quick Examples to Convert NumPy Array to DataFrame
If you are in a hurry, below are some quick examples of how to convert a NumPy array to a DataFrame.
# Below are quick examples
import numpy as np
import pandas as pd

# Example 1: Convert 2-dimensional NumPy array
array = np.array([['Spark', 20000, 1000], ['PySpark', 25000, 2300], ['Python', 22000, 1200]])
df = pd.DataFrame({'Course': array[:, 0], 'Fee': array[:, 1], 'Discount': array[:, 2]})
# Example 2: Convert array to DataFrame using from_records()
array = np.arange(6).reshape(2, 3)
df = pd.DataFrame.from_records(array)
# Example 3: Convert multiple arrays to DataFrame
arr1 = np.arange(start=1, stop=10, step=1)
arr2 = np.random.rand(9)
df = pd.DataFrame()
df['col1'] = arr1
df['col2'] = arr2
# Example 4: Convert using transpose() function
array = np.array([['Courses', 'Fee'], ['Spark', 'PySpark'], [20000, 25000]])
df = pd.DataFrame(i for i in array).transpose()
df.drop(0, axis=1, inplace=True)
df.columns = array[0]
2. What is a NumPy Array?
A NumPy array is a data structure that holds values of the same type (usually numbers), similar to a list. Arrays are more efficient and more compact than Python lists. If you need to go the other way, you can also convert a Series or a Pandas DataFrame to a NumPy array.
Since this article is about converting a NumPy array to a DataFrame, let's create a NumPy array using the np.array() function and then convert it to a DataFrame.
import numpy as np
import pandas as pd

# Create a 2-dimensional NumPy array
array = np.array([['Spark', 20000, 1000], ['PySpark', 25000, 2300], ['Python', 22000, 1200]])
print(array)
print(type(array))
Yields below output.
# Output:
[['Spark' '20000' '1000']
 ['PySpark' '25000' '2300']
 ['Python' '22000' '1200']]
<class 'numpy.ndarray'>
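Notice that the numbers are printed in quotes. Because a NumPy array holds a single dtype, mixing strings and integers in np.array() coerces every element to a string. A minimal sketch to confirm this, using the same array as above:

```python
import numpy as np

# Same mixed-type array as above
array = np.array([['Spark', 20000, 1000], ['PySpark', 25000, 2300], ['Python', 22000, 1200]])

# The dtype kind 'U' means Unicode string, so the fees are stored as text
print(array.dtype.kind)

# A single element is a NumPy string scalar, not an integer
print(type(array[0, 1]))
```

This matters later: any DataFrame built from this array will carry the Fee and Discount values as strings until they are cast.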
3. What is a DataFrame?
A pandas DataFrame is a two-dimensional, size-mutable, heterogeneous tabular data structure with labeled axes (rows and columns). It consists of three principal components: data, rows, and columns. pandas is built on the NumPy library and is written in Python, Cython, and C.
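As a small illustration of these properties, the sketch below builds a DataFrame with one text column and one numeric column and then adds a third column. The column names are only examples.

```python
import pandas as pd

# Two columns with different types: text and integers
df = pd.DataFrame({'Course': ['Spark', 'PySpark'], 'Fee': [20000, 25000]})
print(df.dtypes)   # each column reports its own dtype

# Adding a column changes the existing DataFrame
df['Discount'] = [1000, 2300]
print(df.shape)    # (2, 3)
```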
4. Convert NumPy Array to Pandas DataFrame
Let’s convert the above NumPy array to a DataFrame. Here, I use NumPy slicing to pass each column of the array to the DataFrame constructor.
# Convert array to DataFrame
df = pd.DataFrame({'Course': array[:, 0], 'Fee': array[:, 1], 'Discount': array[:, 2]})
print(df)
Yields below output.
# Output:
    Course    Fee Discount
0    Spark  20000     1000
1  PySpark  25000     2300
2   Python  22000     1200
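Because the source array stores everything as strings, the Fee and Discount columns above hold text, not numbers. One way to handle this, sketched below, is to pass the whole array with the columns parameter and then cast with astype(). The variable name df_numeric is only illustrative.

```python
import numpy as np
import pandas as pd

array = np.array([['Spark', 20000, 1000], ['PySpark', 25000, 2300], ['Python', 22000, 1200]])

# Pass the 2-D array directly and name the columns
df = pd.DataFrame(array, columns=['Course', 'Fee', 'Discount'])

# Cast the numeric-looking string columns to integers
df_numeric = df.astype({'Fee': int, 'Discount': int})
print(df_numeric.dtypes)
print(df_numeric['Fee'].sum())   # 67000
```

Without the cast, arithmetic such as sum() would concatenate or fail rather than add the fees.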
5. Convert Array to DataFrame using from_records()
In this example, I will create a 2-D NumPy array using the arange() and reshape() functions, and then convert it to a DataFrame using the from_records() function.
# Create a numpy array
array = np.arange(6).reshape(2, 3)
print(array)
# Output:
# [[0 1 2]
#  [3 4 5]]
Let’s convert the array to a DataFrame by passing it to the from_records() function. For example,
# Convert array to DataFrame
df = pd.DataFrame.from_records(array)
print(df)
# Output:
#    0  1  2
# 0  0  1  2
# 1  3  4  5
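By default the columns are labeled 0, 1, 2. from_records() also accepts a columns parameter; the sketch below uses it, with the names 'a', 'b', and 'c' chosen only for illustration.

```python
import numpy as np
import pandas as pd

array = np.arange(6).reshape(2, 3)

# Name the columns while converting; 'a', 'b', 'c' are placeholder names
df = pd.DataFrame.from_records(array, columns=['a', 'b', 'c'])
print(df)
```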
6. Convert Multiple Arrays to DataFrame
If you have two NumPy arrays and want to convert them to a DataFrame, assign each array as a column of the DataFrame. Here, I am using the arange() and random.rand() functions to create the arrays.
# Create two 1-D arrays of the same length
arr1 = np.arange(start=1, stop=10, step=1)
arr2 = np.random.rand(9)
print(arr1)
print(arr2)
print(arr1.shape)
print(arr2.shape)
Yields below output.
# Output:
[1 2 3 4 5 6 7 8 9]
[0.25903374 0.98651581 0.94674926 0.16608304 0.99794979 0.38816292
0.50690008 0.03360956 0.37643328]
(9,)
(9,)
Let’s convert the arrays to a DataFrame. To do this, create an empty DataFrame and assign each of the above arrays to it as a column.
# Convert array to DataFrame
df = pd.DataFrame()
df['col1'] = arr1
df['col2'] = arr2
print(df.head())
Yields below output.
# Output:
   col1      col2
0     1  0.259034
1     2  0.986516
2     3  0.946749
3     4  0.166083
4     5  0.997950
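The same result can be built in one step by passing a dictionary of arrays to the DataFrame constructor, which is the pattern used in section 4. The sketch below assumes a seeded generator, np.random.default_rng(0), in place of np.random.rand() so that the values are reproducible; that substitution is an assumption for illustration, not part of the example above.

```python
import numpy as np
import pandas as pd

arr1 = np.arange(1, 10)            # integers 1..9
rng = np.random.default_rng(0)     # seeded generator (assumption, for reproducibility)
arr2 = rng.random(9)               # nine floats in [0, 1)

# Each dictionary key becomes a column; each column keeps its own dtype
df = pd.DataFrame({'col1': arr1, 'col2': arr2})
print(df.dtypes)
```

The arrays must have the same length; otherwise the constructor raises a ValueError.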
7. Convert Array to DataFrame using transpose()
In this example, I will create a NumPy array using numpy.array() and convert it to a DataFrame.
# Create an array
array = np.array([['Courses', 'Fee'], ['Spark', 'PySpark'], [20000, 25000]])
print(array)
# Output:
# [['Courses' 'Fee']
#  ['Spark' 'PySpark']
#  ['20000' '25000']]
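Let’s convert the above array to a DataFrame using a generator expression and the DataFrame transpose() method. After transposing, column 0 holds the header values, so it is dropped and the first row of the array is set as the column names. For example,
# Convert array to DataFrame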
df = pd.DataFrame(i for i in array).transpose()
df.drop(0, axis=1, inplace=True)
df.columns = array[0]
print(df)
# Output:
#    Courses    Fee
# 0    Spark  20000
# 1  PySpark  25000
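The same table can also be produced without dropping a column, by slicing off the header row and transposing the rest. This is an alternative sketch, not the method shown above.

```python
import numpy as np
import pandas as pd

array = np.array([['Courses', 'Fee'], ['Spark', 'PySpark'], [20000, 25000]])

# Row 0 holds the column names; each remaining row holds one column's values
df = pd.DataFrame(array[1:].T, columns=array[0])

# Values come out of the array as strings, so cast Fee back to integers
df['Fee'] = df['Fee'].astype(int)
print(df)
```

Slicing with array[1:] keeps the header row out of the data, so no drop() call is needed afterwards.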
8. Conclusion
In this article, I have explained how to convert a NumPy array to a Pandas DataFrame in several ways, with examples.
Happy learning !!