  • Post last modified:March 27, 2024
Convert Pandas Series to NumPy Array

The Pandas Series.to_numpy() method converts a Series to a NumPy array. It returns a NumPy ndarray holding the values of the given Series or Index. You can also convert the index of a Series or DataFrame to a NumPy array with the Index.to_numpy() method or the Index.values property; Index.array returns a pandas array that can be passed to np.array().

A NumPy array is a data structure that holds elements (usually numbers) of the same type, similar to a list. Arrays are more efficient and more compact than Python lists for numerical work, which is why you will often need to convert a Series or a Pandas DataFrame to a NumPy array. In this article, I will explain how to convert a Series to a NumPy array using multiple examples.

Key Points –

  • Use the Series.to_numpy() method (or the older .values attribute) of the Pandas Series to convert it to a NumPy array.
  • NumPy arrays are more efficient for numerical computations and offer a wider range of mathematical operations.
  • Converting a Series to a NumPy array retains the underlying data but removes the index information.
  • The resulting NumPy array is homogeneous: every element shares one dtype. A Series that holds mixed Python values is converted to an array of dtype object.
  • Converting to a NumPy array can be useful for interoperability with libraries that rely on NumPy arrays for data manipulation and analysis.

1. Quick Examples of Converting a Series to a NumPy Array

If you are in a hurry, below are some quick examples of how to convert a Series to a NumPy array.


# Below are some quick examples
import pandas as pd
import numpy as np

# Example 1: Create a Series.
fee = pd.Series([20000, 22000, 15000, 26000, 19000])

# Example 2: Convert the Series to a NumPy array.
new_array = fee.to_numpy()

# Example 3: Create a DataFrame with a custom index.
df = pd.DataFrame({'Courses': ['Java', 'Spark', 'PySpark', 'Hadoop', 'C'],
                   'Fee': [15000, 17000, 27000, 29000, 12000],
                   'Discount': [1100, 800, 1000, 1600, 600]},
                  index=['a', 'b', 'c', 'd', 'e'])

# Example 4: Convert a DataFrame column to a NumPy array.
new_array = df['Discount'].to_numpy()

# Example 5: Convert the index using the pandas.Index.values property.
new_array = np.array(df.index.values)

# Example 6: Convert the index using the pandas.Index.to_numpy() method.
new_array = df.index.to_numpy()

# Example 7: Convert the index using the pandas.Index.array property.
new_array = np.array(df.index.array)

2. Syntax of Pandas Series.to_numpy()

Following is the syntax of the Series.to_numpy().


# Syntax of series.to_numpy().
Series.to_numpy(dtype=None, copy=False, na_value=NoDefault.no_default, **kwargs)

2.1 Parameters of the Series.to_numpy()

  • dtype : [str or numpy.dtype, optional] The data type of the resulting NumPy array, for example str. If not specified, it is inferred from the Series.
  • copy : [bool, default False] Whether to ensure that the returned value is not a view on another array. With copy=False, a view of the data is returned if possible, but this does not guarantee that no copy is made. With copy=True, a copy is always made.
  • na_value : [Any, optional] The value to use for missing values. The default depends on the dtype of the Series.
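As a rough illustration of these parameters, the sketch below uses made-up data (the names `scores`, `ids_as_str`, `filled`, and `independent` are hypothetical and not part of the article's examples). It only aims to show how dtype, na_value, and copy might be passed.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a float Series with one missing value.
scores = pd.Series([1.5, None, 3.0])

# dtype: request a specific element type for the resulting array.
ids_as_str = pd.Series([1, 2, 3]).to_numpy(dtype=str)

# na_value: replace missing values during the conversion.
filled = scores.to_numpy(na_value=0.0)

# copy=True: the result does not share memory with the Series.
independent = scores.to_numpy(copy=True)
independent[0] = -1.0  # modifies only the copy, not `scores`

print(ids_as_str)
print(filled)
print(scores.iloc[0])
```

Here the Series itself keeps its original first value because the array returned with copy=True is a separate copy.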

2.2 Return Value

It returns a NumPy ndarray.

3. Convert Pandas Series to NumPy Array

A Pandas Series is a one-dimensional array capable of storing various data types (integer, string, float, Python objects, etc.). We can easily convert a Pandas Series to a list, a dictionary, or a tuple using Series.tolist(), Series.to_dict(), and the built-in tuple() function. In a Pandas Series, the row labels are called the index, and we can use them to access the elements of the Series. A Series can have only one column. A list, a NumPy array, or a dict can be turned into a Series.

A NumPy ndarray is an N-dimensional array that stores a collection of items of the same type. Items in the collection can be accessed using a zero-based index. Every item in an ndarray takes up the same-sized block of memory. The type of each element is described by a data-type object (called dtype).
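The properties described above can be inspected directly on an array. The following is a minimal sketch with an explicitly chosen int64 dtype (the variable name `arr` and the sample values are assumptions for illustration).

```python
import numpy as np

# An array with an explicit element type, so every item uses 8 bytes.
arr = np.array([20000, 22000, 15000], dtype=np.int64)

print(arr.dtype)     # the data-type object shared by all elements
print(arr.itemsize)  # bytes used by each element
print(arr.nbytes)    # total bytes used by the data
print(arr[0])        # zero-based positional access
```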

Using pandas Series.to_numpy() function we can convert Series to NumPy array. Let’s create a Pandas Series and apply this function. It will return the NumPy ndarray.


# Convert series to numpy array.
import pandas as pd
import numpy as np

fee = pd.Series([20000, 22000, 15000, 26000, 19000])
print("Create Pandas Series:\n", fee)
new_array = fee.to_numpy() 
print("After converting a Series to NumPy array:\n", new_array)
print("Type of the object:\n", type(new_array))

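# Output:
# Create Pandas Series:
#  0    20000
# 1    22000
# 2    15000
# 3    26000
# 4    19000
# dtype: int64
# After converting a Series to NumPy array:
#  [20000 22000 15000 26000 19000]
# Type of the object:
#  <class 'numpy.ndarray'>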

4. Convert DataFrame Column to NumPy Array

Every column in a DataFrame is represented as a Series, so you can convert a Pandas DataFrame column to a NumPy array by using df[column].to_numpy(). Here, df[column] returns a Series.


# Convert DataFrame column to NumPy array.
df = pd.DataFrame({'Courses': ['Java', 'Spark', 'PySpark','Hadoop','C'],
                   'Fee': [15000, 17000, 27000, 29000, 12000],
                   'Discount': [1100, 800, 1000, 1600, 600]},
                    index=['a', 'b', 'c', 'd', 'e'])

new_array = df['Discount'].to_numpy() 
print("After converting a Series to NumPy array:\n", new_array)
print("Type of the object:\n", type(new_array))

# Output:
# After converting a Series to NumPy array:
#  [1100  800 1000 1600  600]
# Type of the object:
#  <class 'numpy.ndarray'>

5. Convert Index to NumPy Array

If you want to convert a Pandas DataFrame index to a NumPy array, you can use the pandas.Index.values property. This property returns the index values in the form of a NumPy array. You can also wrap the result in np.array() to make the conversion explicit.


# Convert DataFrame index to NumPy array.
new_array = np.array(df.index.values)
print("After converting the index to a NumPy array:\n", new_array)
print("Type of the object:\n", type(new_array))

# Output:
# After converting the index to a NumPy array:
#  ['a' 'b' 'c' 'd' 'e']
# Type of the object:
#  <class 'numpy.ndarray'>

Here, we first created the Pandas DataFrame df with the pd.DataFrame() function. We then read its index values with the df.index.values property and stored them in the NumPy array new_array using the np.array() function.

6. Using pandas.Index.to_numpy() Method

The pandas.Index.to_numpy() method directly converts the index values to a NumPy array, so we do not need to call numpy.array() explicitly. It is also a good replacement for the pandas.Index.values property.


# Using pandas.Index.to_numpy() method.
new_array = df.index.to_numpy()
print("After converting the index to a NumPy array:\n", new_array)
print("Type of the object:\n", type(new_array))

# Output:
# After converting the index to a NumPy array:
#  ['a' 'b' 'c' 'd' 'e']
# Type of the object:
#  <class 'numpy.ndarray'>

7. Using pandas.Index.array Property

Another alternative to the pandas.Index.values property is the pandas.Index.array property. It returns the pandas ExtensionArray that backs the index rather than a NumPy ndarray, so pass the result to np.array() to obtain a NumPy array.


# Using pandas.Index.array property.
new_array = np.array(df.index.array)
print("After converting the index to a NumPy array:\n", new_array)
print("Type of the object:\n", type(new_array))

# Output:
# After converting the index to a NumPy array:
#  ['a' 'b' 'c' 'd' 'e']
# Type of the object:
#  <class 'numpy.ndarray'>
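Note that Index.array on its own returns a pandas array wrapper, not a numpy.ndarray, and the exact wrapper class depends on the pandas version. The sketch below (with a hypothetical index named `idx`) checks the types before and after an explicit conversion with np.asarray().

```python
import numpy as np
import pandas as pd

# Hypothetical index, mirroring the labels used in this article.
idx = pd.Index(['a', 'b', 'c', 'd', 'e'])

raw = idx.array               # pandas ExtensionArray wrapper
as_ndarray = np.asarray(raw)  # explicit conversion to a NumPy array

print(type(raw).__name__)
print(isinstance(raw, np.ndarray))
print(isinstance(as_ndarray, np.ndarray))
```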

Frequently Asked Questions on Convert Series to NumPy Array

Why should I convert a Pandas Series to a NumPy array?

Converting to a NumPy array enables efficient numerical computations and interoperability with other libraries that rely on NumPy arrays.

How do I convert a Pandas Series to a NumPy array?

You can convert a Pandas Series to a NumPy array using either the .values attribute or the to_numpy() method.

What happens to the index when I convert a Series to a NumPy array?

When you convert a Series to a NumPy array, the index information is discarded. The resulting NumPy array contains only the values from the Series, without any index labels. This is because NumPy arrays are addressed purely by integer position and have no concept of labels, whereas a Pandas Series associates an index label with each value.
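A small sketch of this difference, using a hypothetical Series with string labels: the Series supports label-based lookup, while the converted array can only be addressed by position.

```python
import pandas as pd

# Hypothetical Series with custom labels.
fee = pd.Series([20000, 22000, 15000], index=['x', 'y', 'z'])
arr = fee.to_numpy()

print(fee['y'])  # label-based lookup on the Series
print(arr[1])    # position-based lookup on the array
```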

Is the data copied during conversion?

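By default (copy=False), converting a Pandas Series to a NumPy array with the .values attribute or the to_numpy() method does not force a copy. A view of the data is returned whenever possible to improve performance and reduce memory overhead, although a copy may still be made, for example when the dtype has to change. When a writable view is returned, changes made to the NumPy array also affect the original Series, and vice versa, because they share the same underlying data. With pandas Copy-on-Write enabled, the returned view is read-only instead. Pass copy=True to to_numpy() when you need an independent array.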
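One way to check whether two arrays share data is np.shares_memory(). The sketch below assumes a plain integer Series (the names `shared` and `independent` are hypothetical); for such a Series the default conversion typically shares memory, while copy=True does not.

```python
import numpy as np
import pandas as pd

fee = pd.Series([20000, 22000, 15000])

shared = fee.to_numpy()                 # may be a view on the Series data
independent = fee.to_numpy(copy=True)   # always a separate copy

print(np.shares_memory(shared, fee.to_numpy()))
print(np.shares_memory(independent, fee.to_numpy()))

# Writing to the copy leaves the Series untouched.
independent[0] = 0
print(fee.iloc[0])
```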

Is there any difference between using .values and to_numpy() for conversion?

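For a Series backed by a regular NumPy dtype, both produce the same result. For extension types such as categorical or nullable dtypes, .values can return a pandas ExtensionArray, while to_numpy() always returns a NumPy ndarray. to_numpy() also provides more flexibility, allowing you to specify the data type, the copy behavior, and the value used for missing data.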
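The difference shows up with extension types. As a minimal sketch, assuming a categorical Series with made-up values (the name `grades` is hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical categorical Series.
grades = pd.Series(['a', 'b', 'a'], dtype='category')

print(type(grades.values).__name__)      # a pandas Categorical
print(type(grades.to_numpy()).__name__)  # a NumPy ndarray
```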

Can a Pandas Series with mixed data types be converted to a NumPy array?

A Pandas Series with mixed data types can be converted to a NumPy array. However, NumPy arrays are homogeneous data structures, meaning that all elements must have the same data type. Pandas therefore uses a common data type that can represent all the values in the Series, such as float64 for a mix of integers and floats. This process is called data type coercion or casting. If no narrower common type exists, for example with a mix of numbers and strings, the resulting array has dtype object.
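The coercion can be observed by checking the dtype of the converted array. This sketch uses two hypothetical Series, one mixing integers with floats and one mixing numbers with a string.

```python
import pandas as pd

# Integers and floats share a common numeric type.
numeric_mix = pd.Series([1, 2.5])
print(numeric_mix.to_numpy().dtype)

# Numbers and strings have no common numeric type.
general_mix = pd.Series([1, 'a', 2.5])
print(general_mix.to_numpy().dtype)
```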

Conclusion

In this article, you have learned how to convert a Pandas Series to a NumPy array by using the Series.to_numpy() method, which returns a NumPy ndarray representing the values of a given Series or Index. You can convert a DataFrame index to a NumPy array with the pandas.Index.to_numpy() method or the pandas.Index.values property, or by passing pandas.Index.array to np.array().

Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive, and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them.