Pandas Convert Column to NumPy Array

We can convert a pandas DataFrame column to a NumPy array by calling the to_numpy() method or by reading the values attribute. Calling to_numpy() on the DataFrame itself converts the whole DataFrame to a NumPy array. pandas provides many functions for manipulating and analyzing data, and some of them make it easy to convert one data structure into another.

In this article, I will explain how to convert a DataFrame column to a NumPy array using these methods and attributes, with examples.

1. Quick Examples to Convert DataFrame Column to Numpy Array

If you are in a hurry, below are some quick examples of how to convert the DataFrame column to a NumPy array.


# Below are quick examples
# Example 1: Convert a specific column using to_numpy()
array = df['Courses'].to_numpy()

# Example 2: Convert all columns to a NumPy array
array = df.to_numpy()

# Example 3: Convert a column to an array using the values attribute
array = df['Fee'].values

# Example 4: Convert a column to an array using slicing
array = df[df.columns[3:]].to_numpy()

# Example 5: Convert a column to a NumPy array using iloc[]
array = df.iloc[:,-1:].values

# Example 6: Convert column names to an array
array = df.columns.to_numpy()

Now, let’s create a pandas DataFrame with a few rows and columns, execute these examples, and validate the results. Our DataFrame contains the columns Courses, Fee, Duration, and Discount.


import pandas as pd
import numpy as np
technologies = {
    'Courses':["Spark","PySpark","Python","pandas"],
    'Fee' :[20000,25000,22000,30000],
    'Duration':['30days','40days','35days','50days'],
    'Discount':[1000,2300,1200,2000]
              }
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print(df)

Yields the output below.

[Figure: Pandas DataFrame]

2. Convert Pandas DataFrame column to NumPy Array

We can convert a pandas DataFrame column to a NumPy array by using the to_numpy() method. To convert specific columns (one or several), first select them from the DataFrame with bracket notation [], then call to_numpy() on the selection. Selecting a single column with df['col'] gives a Series, which converts to a one-dimensional array; selecting a list of columns with df[['a', 'b']] gives a DataFrame, which converts to a two-dimensional array.


# Convert a specific column using to_numpy()
array = df['Courses'].to_numpy()
print(array)

# Convert specific columns 
array = df[['Courses', 'Duration']].to_numpy()
print(array)

Yields the output below.

[Figure: NumPy array]
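To make the difference between the two selections concrete, here is a minimal sketch that rebuilds the sample DataFrame from this article and compares the shapes of the resulting arrays. The variable names one_col and two_cols are only illustrative.

```python
import pandas as pd

# Same sample data as the DataFrame created earlier in this article
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Python", "pandas"],
    'Fee': [20000, 25000, 22000, 30000],
    'Duration': ['30days', '40days', '35days', '50days'],
    'Discount': [1000, 2300, 1200, 2000],
}, index=['r1', 'r2', 'r3', 'r4'])

# Single brackets select a Series, which converts to a 1-D array
one_col = df['Courses'].to_numpy()
print(one_col.shape)   # (4,)

# Double brackets select a DataFrame, which converts to a 2-D array
two_cols = df[['Courses', 'Duration']].to_numpy()
print(two_cols.shape)  # (4, 2)
```

Checking shape (or ndim) right after the conversion is a quick way to confirm whether you got a flat array or a table-like one.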

Moreover, calling to_numpy() on the DataFrame itself converts the whole pandas DataFrame to a NumPy array. It returns a two-dimensional NumPy array with one row per DataFrame row and one column per DataFrame column.


# Convert all columns to numpy array
array = df.to_numpy()
print(array)

# Output
# [['Spark' 20000 '30days' 1000]
# ['PySpark' 25000 '40days' 2300]
# ['Python' 22000 '35days' 1200]
# ['pandas' 30000 '50days' 2000]]

3. Convert Pandas Column to Array using values

In this section, we’ll convert a pandas DataFrame column into a NumPy array using df['col_name'].values. Note that values is an attribute, not a method, so it is accessed without parentheses; writing df['col_name'].values() raises a TypeError because a NumPy array is not callable. The attribute returns the NumPy representation of the underlying data, so the row and column labels are not included. For example,


# Convert a DataFrame column to an array using the values attribute
array = df['Fee'].values
print(array)

# Output:
# [20000 25000 22000 30000]
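The pandas documentation recommends to_numpy() over values in new code. The following sketch, using only the Fee column from the sample data, shows that both give the same array here and that to_numpy() additionally accepts dtype and copy arguments. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Only the Fee column from the sample data is needed here
df = pd.DataFrame({'Fee': [20000, 25000, 22000, 30000]})

# The attribute and the method return the same data for this column
from_values = df['Fee'].values
from_to_numpy = df['Fee'].to_numpy()
print(np.array_equal(from_values, from_to_numpy))  # True

# to_numpy() can also change the dtype and force an independent copy
as_float = df['Fee'].to_numpy(dtype='float64', copy=True)
print(as_float.dtype)     # float64

# Modifying the copy leaves the DataFrame untouched
as_float[0] = 0.0
print(df['Fee'].iloc[0])  # 20000
```

Passing copy=True is useful when you plan to modify the array in place and do not want to risk changing the original DataFrame.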

4. Use Pandas Slicing with to_numpy() to Convert to Array

Pandas slicing lets us select a particular portion of the rows or columns of a DataFrame. Here, df.columns[3:] selects the column labels from position 3 onward (only Discount in our DataFrame), and calling to_numpy() on that selection converts it into a NumPy array. Because the selection is still a DataFrame, the result is a two-dimensional array.


# Convert a pandas column to an array using slicing
array = df[df.columns[3:]].to_numpy()
print(array)

# Output:
# [[1000]
#  [2300]
#  [1200]
#  [2000]]
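If a flat array is more convenient than the single-column 2-D result, one option is to flatten it with NumPy's ravel(). This is a minimal sketch on the sample data; col_2d and flat are illustrative names.

```python
import pandas as pd

# Same sample data as the DataFrame created earlier in this article
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Python", "pandas"],
    'Fee': [20000, 25000, 22000, 30000],
    'Duration': ['30days', '40days', '35days', '50days'],
    'Discount': [1000, 2300, 1200, 2000],
}, index=['r1', 'r2', 'r3', 'r4'])

# Slicing the column labels keeps a DataFrame, so the array is 2-D
col_2d = df[df.columns[3:]].to_numpy()
print(col_2d.shape)  # (4, 1)

# ravel() flattens the (4, 1) array into a 1-D array of length 4
flat = col_2d.ravel()
print(flat)          # [1000 2300 1200 2000]
print(flat.shape)    # (4,)
```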

5. Use Pandas iloc[] to Convert Column to Array

Alternatively, we can select a column by position with the pandas iloc[] indexer and then read its values attribute. In the example below, the slice -1: selects the last column as a one-column DataFrame, so the result is a two-dimensional NumPy array.


# Convert a column to a NumPy array using iloc[]
array = df.iloc[:,-1:].values
print(array)
print(type(array))

# Output:
# [[1000]
#  [2300]
#  [1200]
#  [2000]]
# <class 'numpy.ndarray'>
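With iloc[], the choice between a slice and a plain integer for the column position decides the shape up front. The sketch below contrasts the two on the sample data; array_2d and array_1d are illustrative names.

```python
import pandas as pd

# Same sample data as the DataFrame created earlier in this article
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Python", "pandas"],
    'Fee': [20000, 25000, 22000, 30000],
    'Duration': ['30days', '40days', '35days', '50days'],
    'Discount': [1000, 2300, 1200, 2000],
}, index=['r1', 'r2', 'r3', 'r4'])

# A slice (-1:) keeps a one-column DataFrame, giving a 2-D array
array_2d = df.iloc[:, -1:].to_numpy()
print(array_2d.shape)  # (4, 1)

# A plain integer (-1) returns a Series, giving a 1-D array
array_1d = df.iloc[:, -1].to_numpy()
print(array_1d.shape)  # (4,)
print(array_1d)        # [1000 2300 1200 2000]
```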

6. Convert Column Names to NumPy Array

Combining the df.columns attribute with the to_numpy() method converts the column names of a pandas DataFrame into a NumPy array. Let’s apply the syntax below.


# Convert column names to an array
array = df.columns.to_numpy()
print(array)
print(type(array))

# Output:
# ['Courses' 'Fee' 'Duration' 'Discount']
# <class 'numpy.ndarray'>
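Once the column names are in an array, they can be filtered like any other NumPy array. This sketch, again on the sample data, builds a boolean mask to drop one name and also shows tolist() for cases where a plain Python list is enough. The names mask and kept are illustrative.

```python
import pandas as pd

# Same sample data as the DataFrame created earlier in this article
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Python", "pandas"],
    'Fee': [20000, 25000, 22000, 30000],
    'Duration': ['30days', '40days', '35days', '50days'],
    'Discount': [1000, 2300, 1200, 2000],
}, index=['r1', 'r2', 'r3', 'r4'])

names = df.columns.to_numpy()

# Element-wise comparison gives a boolean mask over the column names
mask = names != 'Duration'
kept = names[mask]
print(kept)                 # ['Courses' 'Fee' 'Discount']

# The filtered names can be used to select those columns again
print(df[kept].shape)       # (4, 3)

# tolist() returns a plain Python list instead of an array
print(df.columns.tolist())  # ['Courses', 'Fee', 'Duration', 'Discount']
```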

7. Conclusion

In this article, we have learned how to convert a pandas DataFrame column to a NumPy array by using the to_numpy() method and the values attribute, combined with bracket selection, slicing, and iloc[]. We have also learned how to convert column names into an array.
