Jupyter SQL Magic Persist
In Jupyter Lab, the %sql
magic command persist
creates a new table in the database to which we are connected. The table name will be the name as the name of the Python variable.
Example:
import pandas as pd
df = pd.read_csv('/tmp/data.csv')
df
column1 | column2 | |
---|---|---|
0 | 1 | a |
1 | 2 | b |
2 | 3 | b |
# set environment variable $DATABASE_URL
%sql postgresql://datacomy:PASSWORD@localhost/datacomy
'Connected: datacomy@datacomy'
%sql persist df
* postgresql://datacomy:***@localhost/datacomy
'Persisted df'
%%sql
SELECT tablename,tableowner
FROM pg_catalog.pg_tables
WHERE tableowner='datacomy';
* postgresql://datacomy:***@localhost/datacomy
6 rows affected.
tablename | tableowner |
---|---|
my_table1 | datacomy |
my_table2 | datacomy |
df | datacomy |
As we can see, the table ‘df’ was created.
If we want to create a table using the same name, PostgreSQL with throw an error ValueError: Table 'df' already exists.
%sql persist df
* postgresql://datacomy:***@localhost/datacomy
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
[...]
ValueError: Table 'df' already exists.
After the table is created, we can execute SQL commands using Jupyter or any other RDBMS frontend:
%sql select * from df
* postgresql://datacomy:***@localhost/datacomy
3 rows affected.
index | column1 | column2 |
---|---|---|
0 | 1 | a |
1 | 2 | b |
2 | 3 | b |
Or from the PostgreSQL terminal: