Pandas read_sql()
function is used to read data from SQL queries or database tables into DataFrame. This function allows you to execute SQL queries and load the results directly into a Pandas DataFrame. This function is a wrapper for the read_sql_query()
and read_sql_table()
functions, based on the input, it calls these functions internally and returns the SQL table as a two-dimensional data structure with labeled axes.
I will use the following steps to explain Pandas read_sql() usage.
- Syntax of read_sql()
- Create database table on sqlite3
- Insert data into the table
- Execute SQL query by using pands red_sql().
- Use read_sql_query()
- Use read_sql_table()
- Filter rows from SQL table
1. Syntax of Pandas read_sql()
Following are the syntax of read_sql(), read_sql_query() and read_sql_table() functions. All these functions return either DataFrame or Iterator[DataFrame].
# Syntax of read_sql()
pandas.read_sql(sql, con, index_col=None, coerce_float=True, params=None, parse_dates=None, columns=None, chunksize=None)
# Syntax of read_sql_query()
pandas.read_sql_query(sql, con, index_col=None, coerce_float=True, params=None, parse_dates=None, chunksize=None, dtype=None)
# Syntax of read_sql_table()
pandas.read_sql_table(table_name, con, schema=None, index_col=None, coerce_float=True, parse_dates=None, columns=None, chunksize=None)
Parameters:
- sql
- con
- index_col
- coerce_float
- params
- parse_dates
- columns
- chuynksize
Before we go into learning how to use pandas read_sql()
and other functions, let’s create a database and table by using sqlite3
.
2. Create a Database and Table
The below example can be used to create a database and table in Python by using the sqlite3 library. If you don’t have a sqlite3 library install it using the pip command. In order to use it first, you need to import it.
import sqlite3
# Connect to Database
con = sqlite3.connect('courses_database')
cur = con.cursor()
# Create Table
cur.execute('''CREATE TABLE IF NOT EXISTS COURSES
([course_id] INTEGER PRIMARY KEY,
[course_name] TEXT,
[fee] INTEGER,
[duration] TEXT,
[discount] INTEGER)''')
Running the above script creates a new database called courses_database along with a table named courses.
3. Insert Rows into the Table
Now insert rows into the table by using execute()
function of the Cursor
object. Don’t forget to run the commit()
, this saves the inserted rows into the database permanently.
# Insert Few Records
cur.execute('''INSERT INTO COURSES (course_id, course_name, fee, duration, discount)
VALUES (1,'Spark',25000,'50 Days', 2000),
(2,'Pandas',20000,'35 Days', 1000),
(3,'Java',15000,'35 Days', 800),
(4,'Python',15000,'30 Days', 500),
(5,'PHP',28000,'30 Days', 800)''')
# Commit Inserts
con.commit()
4. Using pandas read_sql() query
Now by using Pandas read_sql()
function load the table, as I said above, this can take either SQL query or table name as a parameter. since we are passing SQL query as the first param, it internally calls read_sql_query()
function.
import sqlite3
import pandas as pd
# Connect to the database
con = sqlite3.connect('courses_database')
# Run SQL
sql_query = pd.read_sql('SELECT * FROM COURSES', con)
# Convert SQL to DataFrame
df = pd.DataFrame(sql_query, columns = ['course_id', 'course_name', 'fee','duration','discount'])
print(df)
Yields below output.
# Output:
course_id course_name fee duration discount
0 1 Spark 25000 50 Days 2000
1 2 Pandas 20000 35 Days 1000
2 3 Java 15000 NaN 800
3 4 Python 15000 30 Days 500
4 5 PHP 18000 30 Days 800
5. Using read_sql_query()
Similarly, you can also write the above statement directly by using the read_sql_query()
function.
# Run SQL
sql_query = pd.read_sql_query('SELECT * FROM COURSES', con)
# Convert SQL to DataFrame
df = pd.DataFrame(sql_query, columns = ['course_id', 'course_name', 'fee','duration','discount'])
print(df)
Yields the same output as above.
6. Using read_sql_table()
In the above examples, I have used SQL queries to read the table into pandas DataFrame. Now let’s just use the table name to load the entire table using the read_sql_table()
function. This loads all rows from the table into DataFrame.
# Read SQL table
sql_query = pd.read_sql('COURSES', con)
# Convert SQL to DataFrame
df = pd.DataFrame(sql_query, columns = ['course_id', 'course_name', 'fee','duration','discount'])
print(df)
Yields the same output.
7. Filter Rows from SQL
Most of the time you may not be required to read all rows from the SQL table, to load only selected rows based on a condition that use SQL with Where Clause
.
# Run SQL
sql_query = pd.read_sql_query('''SELECT * FROM COURSES where fee = 1500''', con)
# Convert SQL to DataFrame
df = pd.DataFrame(sql_query, columns = ['course_id', 'course_name', 'fee','duration','discount'])
print(df)
Yields below output.
# Output:
course_id course_name fee duration discount
0 3 Java 15000 NaN 800
1 4 Python 15000 30 Days 500
8. Using Cursor fetchall()
Alternatively, you can also use the DataFrame constructor along with Cursor.fetchall()
to load the SQL table into DataFrame. The below example yields the same output as above.
# Run SQL
cur = con.cursor()
cur.execute('''SELECT * FROM COURSES where fee = 1500''')
# Convert SQL to DataFrame
df = pd.DataFrame(cur.fetchall(), columns = ['course_id', 'course_name', 'fee','duration','discount'])
print(df)
Conclusion
In this pandas read SQL into DataFrame you have learned how to run the SQL query and convert the result into DataFrame. Also learned how to read an entire database table, only selected rows etc.
Related Articles
- Pandas Read TSV with Examples
- Pandas Read Text with Examples
- Pandas Read Multiple CSV Files into DataFrame
- Pandas Read JSON File with Examples
- Pandas read_csv() with Examples
- How to Read CSV from String in Pandas
- Pandas Filter Rows Using IN Like SQL
- Pretty Print Pandas DataFrame or Series?
- Pandas Window Functions Explained
- Pandas DataFrame fillna() function