You can convert a pandas DataFrame to a NumPy array by using the to_numpy() and to_records() methods or the values attribute, and you can convert its row labels with index.to_numpy(). In this article, I will explain how to convert a DataFrame (all columns or selected columns) to a NumPy array with examples.
Key Points –
- Use the .to_numpy() method for a direct and efficient conversion of a DataFrame to a NumPy array.
- You can convert specific columns of a DataFrame to a NumPy array by selecting them before applying .to_numpy().
- The .to_records() method converts a DataFrame to a NumPy record array, keeping the index and the column labels as named fields.
- The .values attribute also returns a NumPy array, but .to_numpy() is the recommended method in recent pandas versions.
- The DataFrame’s index can be separately converted to a NumPy array using .index.to_numpy().
Quick Examples of Converting DataFrame to Array
The following are quick examples of converting a pandas DataFrame to a NumPy array.
# Quick examples of converting a DataFrame to a NumPy array
# Using df.to_numpy() method
result = df.to_numpy()
# Convert specific column to numpy array
df2=df['Courses'].to_numpy()
# Convert specific columns
# Using df.to_numpy() method
df2 = df[['Courses', 'Duration']].to_numpy()
# Using DataFrame.to_records()
print(df.to_records())
# Convert pandas DataFrame to a NumPy array
# using the df.values attribute
values_array = df.values
print(values_array)
# Convert the row index to a NumPy array
df.index.to_numpy()
To run some examples of converting a pandas DataFrame to a NumPy array, let’s create a pandas DataFrame using data from a dictionary.
import pandas as pd
import numpy as np
technologies = {
'Courses':["Spark","PySpark","Python","pandas"],
'Fee' :[20000,25000,22000,30000],
'Duration':['30days','40days','35days','50days'],
'Discount':[1000,2300,1200,2000]
}
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print(df)
Yields below output.
Courses Fee Duration Discount
r1 Spark 20000 30days 1000
r2 PySpark 25000 40days 2300
r3 Python 22000 35days 1200
r4 pandas 30000 50days 2000
Convert DataFrame to Array
You can convert a pandas DataFrame to a NumPy array by using the to_numpy() method. This method is called on the DataFrame object, returns a NumPy ndarray, and accepts three optional parameters.
- dtype – Specifies the data type of the values in the array.
- copy – copy=True guarantees that the returned array is a new copy. copy=False, the default, allows pandas to return a view of the existing data when possible, but it does not guarantee that no copy is made.
- na_value – Specifies the value to use for any missing value in the array.
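To make the three parameters concrete, here is a minimal sketch. The DataFrame df_params and its missing value are hypothetical sample data introduced only for this illustration, not part of the article's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing value in 'Fee'
df_params = pd.DataFrame({
    'Fee': [20000, None, 22000],
    'Discount': [1000, 2300, 1200],
})

# dtype: request a specific element type for the resulting array
as_float32 = df_params[['Discount']].to_numpy(dtype='float32')
print(as_float32.dtype)

# na_value: replace missing entries while converting
filled = df_params.to_numpy(na_value=0)
print(filled)

# copy=True: the returned array is independent of the DataFrame,
# so modifying it leaves the DataFrame untouched
independent = df_params.to_numpy(copy=True)
independent[0, 0] = -1
print(df_params.loc[0, 'Fee'])
```

Passing copy=True is the safer choice whenever you plan to modify the returned array in place.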
For example:
# Using df.to_numpy() method
# to convert all columns to a NumPy array
result = df.to_numpy()
print(result)
# Output
#[['Spark' 20000 '30days' 1000]
# ['PySpark' 25000 '40days' 2300]
# ['Python' 22000 '35days' 1200]
# ['pandas' 30000 '50days' 2000]]
Alternatively, to convert specific columns from a pandas DataFrame to a NumPy array, select the columns using bracket notation [] and then call the to_numpy() method on the result. This lets you choose the columns you want to convert and obtain their NumPy array representation.
# Convert a specific column using to_numpy() method
df2=df['Courses'].to_numpy()
print(df2)
# Outputs:
# ['Spark' 'PySpark' 'Python' 'pandas']
# Convert specific columns using df.to_numpy() method
result = df[['Courses', 'Duration']].to_numpy()
print(result)
# Output:
#[['Spark' '30days']
# ['PySpark' '40days']
# ['Python' '35days']
# ['pandas' '50days']]
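The kind of selection affects both the shape and the dtype of the result. The sketch below, which uses a hypothetical DataFrame named df_cols with a subset of the article's columns, shows how single brackets, double brackets, and numeric-only selections differ.

```python
import pandas as pd

# Hypothetical DataFrame reusing some of the article's columns
df_cols = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Python", "pandas"],
    'Fee': [20000, 25000, 22000, 30000],
    'Discount': [1000, 2300, 1200, 2000],
})

one_dim = df_cols['Fee'].to_numpy()                 # Series -> 1-D array
two_dim = df_cols[['Fee']].to_numpy()               # one-column DataFrame -> 2-D array
numeric = df_cols[['Fee', 'Discount']].to_numpy()   # numeric columns only
mixed = df_cols.to_numpy()                          # strings and integers together

print(one_dim.shape)   # (4,)
print(two_dim.shape)   # (4, 1)
print(numeric.shape)   # (4, 2)
print(numeric.dtype)   # an integer dtype
print(mixed.dtype)     # object
```

Selecting only the numeric columns before converting keeps a numeric dtype, which is usually what you want for further NumPy computation.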
Using the DataFrame.values Attribute
In this section, you’ll convert the pandas DataFrame into a NumPy array using df.values. Note that values is an attribute, not a method, so it is accessed without parentheses. It returns the NumPy array representation of the DataFrame. As a result, the row and column axes (labels) are not present.
# Convert pandas DataFrame to a NumPy array
# using the df.values attribute
values_array = df.values
print(values_array)
Yields below output.
[['Spark' 20000 '30days' 1000]
['PySpark' 25000 '40days' 2300]
['Python' 22000 '35days' 1200]
['pandas' 30000 '50days' 2000]]
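As a quick check, the following sketch compares the values attribute with to_numpy() on a small hypothetical DataFrame (df_vals is an illustrative name) and shows what happens if values is called like a method.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row DataFrame used only for this comparison
df_vals = pd.DataFrame(
    {'Fee': [20000, 25000], 'Discount': [1000, 2300]},
    index=['r1', 'r2'],
)

from_values = df_vals.values         # attribute, no parentheses
from_to_numpy = df_vals.to_numpy()   # method call

# Both hold the same data; neither carries row or column labels
same = np.array_equal(from_values, from_to_numpy)
print(same)
print(type(from_values).__name__)

# Calling the attribute like a method fails, because an ndarray is not callable
try:
    df_vals.values()
except TypeError as err:
    print("values is not callable:", err)
```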
Convert DataFrame to NumPy Array using to_records()
To include the row labels (index) in the resulting NumPy array, use the DataFrame.to_records() method. It returns a record array in which the index is the first field.
# Using DataFrame.to_records()
print(df.to_records())
Yields below output.
[('r1', 'Spark', 20000, '30days', 1000)
('r2', 'PySpark', 25000, '40days', 2300)
('r3', 'Python', 22000, '35days', 1200)
('r4', 'pandas', 30000, '50days', 2000)]
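Because the result is a record array, each column can be read by field name. The sketch below uses a hypothetical DataFrame named df_rec and also shows the index=False option, which leaves the row labels out.

```python
import pandas as pd

# Hypothetical two-row DataFrame used only for this illustration
df_rec = pd.DataFrame(
    {'Courses': ["Spark", "PySpark"], 'Fee': [20000, 25000]},
    index=['r1', 'r2'],
)

records = df_rec.to_records()
print(records.dtype.names)    # ('index', 'Courses', 'Fee')
print(records['Fee'])         # [20000 25000]
print(records[0]['Courses'])  # Spark

# index=False drops the row labels from the record array
no_index = df_rec.to_records(index=False)
print(no_index.dtype.names)   # ('Courses', 'Fee')
```

An unnamed index appears under the field name index, as the first print shows.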
Using Index.to_numpy() to Convert Row Indices to NumPy
Use the Index.to_numpy() method to convert the DataFrame row labels to a NumPy array.
# Using DataFrame.index.to_numpy() method
df.index.to_numpy()
Yields below output.
array(['r1', 'r2', 'r3', 'r4'], dtype=object)
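Since the data array itself carries no labels, it can be useful to keep the labels in separate arrays. The sketch below is one possible approach rather than something the article prescribes: it also converts the column labels with columns.to_numpy() (columns is an Index as well) and rebuilds a DataFrame from the three arrays. The names df_idx and rebuilt are hypothetical.

```python
import pandas as pd

# Hypothetical two-row DataFrame used only for this illustration
df_idx = pd.DataFrame({'Fee': [20000, 25000]}, index=['r1', 'r2'])

row_labels = df_idx.index.to_numpy()     # row labels as a NumPy array
col_labels = df_idx.columns.to_numpy()   # column labels as a NumPy array
data = df_idx.to_numpy()                 # values without labels

# Rebuild a labeled DataFrame from the three arrays
rebuilt = pd.DataFrame(data, index=row_labels, columns=col_labels)
print(rebuilt)
```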
Complete Example
import pandas as pd
import numpy as np
technologies = {
'Courses':["Spark","PySpark","Python","pandas"],
'Fee' :[20000,25000,22000,30000],
'Duration':['30days','40days','35days','50days'],
'Discount':[1000,2300,1200,2000]
}
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print(df)
# Using df.to_numpy() method
print(df.to_numpy())
# Convert specific columns
# Using df.to_numpy() method
df[['Courses', 'Duration']].to_numpy()
# Using DataFrame.index.to_numpy() method
df.index.to_numpy()
# Convert a specific column
# Using to_numpy() method
df2=df['Courses'].to_numpy()
print(df2)
# Using DataFrame.to_records()
print(df.to_records())
# Convert pandas DataFrame to a NumPy array
# using the df.values attribute
values_array = df.values
print(values_array)
# Convert selected columns into a NumPy array
Fee_array=df[['Fee']].to_numpy()
print(Fee_array)
Conclusion
In conclusion, this article has shown how to convert a DataFrame to an array by using the to_numpy() and to_records() methods and the values attribute. To convert selected columns, first select the columns from the DataFrame by using bracket notation [] and then call the to_numpy() method on the result. You also learned how to get the row index into an array with index.to_numpy().