
The Pandas read_sql() function is used to read data from a SQL query or a database table into a DataFrame. It allows you to execute a SQL query and load the results directly into a Pandas DataFrame. read_sql() is a wrapper around the read_sql_query() and read_sql_table() functions: based on the input, it calls one of them internally and returns the SQL result as a two-dimensional data structure with labeled axes.


I will use the following steps to explain Pandas read_sql() usage.

  1. Syntax of read_sql()
  2. Create a database and table using sqlite3
  3. Insert data into the table
  4. Execute a SQL query using pandas read_sql()
  5. Use read_sql_query()
  6. Use read_sql_table()
  7. Filter rows from a SQL table

1. Syntax of Pandas read_sql()

Following is the syntax of the read_sql(), read_sql_query(), and read_sql_table() functions. All of these return either a DataFrame or an Iterator[DataFrame].


# Syntax of read_sql()
pandas.read_sql(sql, con, index_col=None, coerce_float=True, params=None, parse_dates=None, columns=None, chunksize=None)

# Syntax of read_sql_query()
pandas.read_sql_query(sql, con, index_col=None, coerce_float=True, params=None, parse_dates=None, chunksize=None, dtype=None)

# Syntax of read_sql_table()
pandas.read_sql_table(table_name, con, schema=None, index_col=None, coerce_float=True, parse_dates=None, columns=None, chunksize=None)

Parameters:

  • sql – SQL query string or table name to read.
  • con – database connection; an SQLAlchemy connectable, or a DBAPI2 connection (such as sqlite3) for queries.
  • index_col – column(s) to set as the index of the returned DataFrame.
  • coerce_float – attempt to convert non-string values (such as decimals) to floating point; defaults to True.
  • params – parameters to bind to the query.
  • parse_dates – column(s) to parse as datetime.
  • columns – columns to select when reading a table.
  • chunksize – number of rows per chunk; when set, an iterator of DataFrames is returned.
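
For example, index_col sets one of the result columns as the DataFrame index, and chunksize turns the result into an iterator of smaller DataFrames. A minimal sketch, assuming the COURSES table created in the next section already exists:


import sqlite3
import pandas as pd

con = sqlite3.connect('courses_database')

# Read with course_id as the index, 2 rows at a time
for chunk in pd.read_sql('SELECT * FROM COURSES', con,
                         index_col='course_id', chunksize=2):
    print(chunk)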

Before we go into learning how to use read_sql() and the related functions, let's create a database and table using sqlite3.

2. Create a Database and Table

The below example creates a database and table in Python using the sqlite3 library. sqlite3 ships with the Python standard library, so there is nothing to install; you just need to import it.


import sqlite3

# Connect to Database
con = sqlite3.connect('courses_database') 
cur = con.cursor()

# Create Table
cur.execute('''CREATE TABLE IF NOT EXISTS COURSES
          ([course_id] INTEGER PRIMARY KEY, 
           [course_name] TEXT, 
           [fee] INTEGER, 
           [duration] TEXT, 
           [discount] INTEGER)''')

Running the above script creates a new database called courses_database along with a table named COURSES.
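
To verify that the table was created, you can query SQLite's built-in sqlite_master catalog:


# List the tables in the database
cur.execute("SELECT name FROM sqlite_master WHERE type='table'")
print(cur.fetchall())
# [('COURSES',)]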

3. Insert Rows into the Table

Now insert rows into the table using the execute() function of the Cursor object. Don't forget to run commit(); this permanently saves the inserted rows to the database.


# Insert Few Records
cur.execute('''INSERT INTO COURSES (course_id, course_name, fee, duration, discount)
                VALUES (1,'Spark',25000,'50 Days', 2000),
                (2,'Pandas',20000,'35 Days', 1000),
                (3,'Java',15000,'35 Days', 800),
                (4,'Python',15000,'30 Days', 500),
                (5,'PHP',28000,'30 Days', 800)''') 
                    
# Commit Inserts
con.commit()
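
Equivalently, you can insert the same rows with executemany() and ? placeholders, which is the idiomatic way to bind Python values in sqlite3:


# Equivalent insert using executemany() with placeholders
records = [
    (1, 'Spark', 25000, '50 Days', 2000),
    (2, 'Pandas', 20000, '35 Days', 1000),
    (3, 'Java', 15000, '35 Days', 800),
    (4, 'Python', 15000, '30 Days', 500),
    (5, 'PHP', 28000, '30 Days', 800),
]
cur.executemany('INSERT INTO COURSES VALUES (?, ?, ?, ?, ?)', records)
con.commit()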

4. Using Pandas read_sql()

Now load the table using the Pandas read_sql() function. As mentioned above, it can take either a SQL query or a table name as the first parameter. Since we are passing a SQL query here, it internally calls the read_sql_query() function. Note that read_sql() already returns a DataFrame; the DataFrame constructor below simply selects the listed columns.


import sqlite3
import pandas as pd

# Connect to the database
con = sqlite3.connect('courses_database') 

# Run SQL          
sql_query = pd.read_sql('SELECT * FROM COURSES', con)

# Convert SQL to DataFrame
df = pd.DataFrame(sql_query, columns = ['course_id', 'course_name', 'fee','duration','discount'])
print(df)

Yields below output.


# Output:
  course_id course_name    fee duration  discount
0         1       Spark  25000  50 Days      2000
1         2      Pandas  20000  35 Days      1000
2         3        Java  15000  35 Days       800
3         4      Python  15000  30 Days       500
4         5         PHP  28000  30 Days       800

5. Using read_sql_query()

Similarly, you can also write the above statement directly by using the read_sql_query() function.


# Run SQL          
sql_query = pd.read_sql_query('SELECT * FROM COURSES', con)

# Convert SQL to DataFrame
df = pd.DataFrame(sql_query, columns = ['course_id', 'course_name', 'fee','duration','discount'])
print(df)

Yields the same output as above.
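
read_sql_query() also takes a dtype argument (available in pandas 1.3 and later) to control the types of the returned columns. A small sketch:


# Force specific column dtypes while reading (pandas >= 1.3)
sql_query = pd.read_sql_query('SELECT * FROM COURSES', con,
                              dtype={'fee': 'int64', 'discount': 'float64'})
print(sql_query.dtypes)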

6. Using read_sql_table()

In the above examples, I have used SQL queries to read the table into a pandas DataFrame. Now let's pass just the table name to load the entire table with the read_sql_table() function. Note that read_sql_table() requires an SQLAlchemy connectable; a plain sqlite3 connection only works for queries, so here I first create an engine with SQLAlchemy.


from sqlalchemy import create_engine

# read_sql_table() needs an SQLAlchemy engine, not a raw sqlite3 connection
engine = create_engine('sqlite:///courses_database')

# Read the entire SQL table into a DataFrame
df = pd.read_sql_table('COURSES', engine)
print(df)

Yields the same output.
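
read_sql_table() also accepts a columns argument to load only a subset of columns from the table:


# Load only selected columns from the table
df = pd.read_sql_table('COURSES', engine, columns=['course_name', 'fee'])
print(df)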

7. Filter Rows from SQL

Most of the time you will not need all rows from a SQL table. To load only the rows that match a condition, use a SQL query with a WHERE clause.


# Run SQL          
sql_query = pd.read_sql_query('''SELECT * FROM COURSES where fee = 15000''', con)

# Convert SQL to DataFrame
df = pd.DataFrame(sql_query, columns = ['course_id', 'course_name', 'fee','duration','discount'])
print(df)

Yields below output.


# Output:
  course_id course_name    fee duration  discount
0         3        Java  15000  35 Days       800
1         4      Python  15000  30 Days       500
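
When the filter value comes from a Python variable, bind it with the params argument instead of formatting it into the SQL string; this avoids quoting issues and SQL injection. A minimal sketch (sqlite3 uses ? placeholders):


# Bind the filter value safely with params
max_fee = 20000
df = pd.read_sql_query('SELECT * FROM COURSES WHERE fee <= ?',
                       con, params=(max_fee,))
print(df)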

8. Using Cursor fetchall()

Alternatively, you can use the DataFrame constructor along with Cursor.fetchall() to load the SQL table into a DataFrame. The below example yields the same output as above.


# Run SQL 
cur = con.cursor()         
cur.execute('''SELECT * FROM COURSES where fee = 15000''')

# Convert SQL to DataFrame
df = pd.DataFrame(cur.fetchall(), columns = ['course_id', 'course_name', 'fee','duration','discount'])
print(df)
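
Instead of hardcoding the column names, you can derive them from cursor.description, which sqlite3 populates after every SELECT:


# Derive column names from the cursor metadata
cur.execute('SELECT * FROM COURSES')
cols = [desc[0] for desc in cur.description]
df = pd.DataFrame(cur.fetchall(), columns=cols)
print(df)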

Conclusion

In this article, you have learned how to run a SQL query with Pandas and convert the result into a DataFrame, how to read an entire database table, and how to load only selected rows.
