Common Questions
Before we dive deeper into SQLAlchemy, let's answer some common questions about the ORM:
- Can you prevent SQLAlchemy from automatically creating a schema? Instead, can you bind SQLAlchemy models to an existing schema?
- Is there a performance overhead when using SQLAlchemy, compared to writing raw SQL? If so, how much?
- If you don't have enough permission to create tables in a database, does SQLAlchemy throw an exception?
- How is the schema modified? Is it done automatically, or do you write code to do it?
- Is there support for triggers?
In this article, we are going to answer all of these questions. Some will be covered in detail, while others will be summarized here and covered further in another article.
SQLAlchemy Schema Reflection / Introspection
Instead of having SQLAlchemy create a schema automatically, as shown in the previous articles using Base.metadata.create_all(engine), we can also instruct a Table object to load information about itself from the corresponding database schema object that already exists within the database.
Let's create an example sqlite3 database with one table, person, that stores one record:
[python]
import sqlite3

conn = sqlite3.connect("example.db")
c = conn.cursor()
c.execute('''
    CREATE TABLE person
    (name text, email text)
''')
c.execute("INSERT INTO person VALUES ('john', '[email protected]')")
conn.commit()
conn.close()
[/python]
Now we can reflect the structure of the table person using the arguments autoload and autoload_with in the Table constructor.
[python]
>>> from sqlalchemy import create_engine, MetaData, Table
>>>
>>> engine = create_engine('sqlite:///example.db')
>>> meta = MetaData(bind=engine)
>>> person = Table("person", meta, autoload=True, autoload_with=engine)
>>> person
Table('person', MetaData(bind=Engine(sqlite:///example.db)), Column(u'name', TEXT(), table=<person>), Column(u'email', TEXT(), table=<person>), schema=None)
>>> [c.name for c in person.columns]
[u'name', u'email']
[/python]
We can also reflect all tables in the database using the MetaData.reflect method.
[python]
>>> meta = MetaData()
>>> meta.reflect(bind=engine)
>>> person = meta.tables['person']
>>> person
Table(u'person', MetaData(bind=None), Column(u'name', TEXT(), table=<person>), Column(u'email', TEXT(), table=<person>), schema=None)
[/python]
Albeit very powerful, reflection does have its limitations. It's important to remember that reflection constructs Table metadata using only the information available in the relational database. Naturally, such a process cannot restore aspects of a schema that are not actually stored in the database. The aspects that are not available include, but are not limited to:
- Client side defaults, Python functions or SQL expressions, defined using the default keyword of the Column constructor.
- Column information defined in the Column.info dictionary.
- The value of the .quote setting for Column or Table.
- The association of a particular Sequence with a given Column.
Recent improvements in SQLAlchemy allow structures like views, indexes and foreign key options to be reflected. Structures like CHECK constraints, table comments and triggers are not reflected.
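If you want to check what reflection can actually see for a particular table, the Inspector interface offers a low-level view. Below is a minimal sketch against the example.db database created above; the output is abbreviated and may differ slightly depending on your SQLAlchemy version and backend.
[python]
>>> from sqlalchemy import create_engine, inspect
>>>
>>> engine = create_engine('sqlite:///example.db')
>>> inspector = inspect(engine)
>>> inspector.get_table_names()
['person']
>>> inspector.get_indexes('person')
[]
>>> inspector.get_foreign_keys('person')
[]
[/python]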
Performance Overhead of SQLAlchemy
Since SQLAlchemy uses the unit of work pattern when synchronizing changes to the database, i.e., on session.commit(), it does more than just "insert" the data as a raw SQL statement would. It tracks changes made to the session's objects and maintains an identity map for all of them. It also performs a fair amount of bookkeeping and maintains the integrity of any CRUD operations. Overall, the unit of work automates the task of persisting a complex object graph into a relational database without requiring explicit, procedural persistence code. Of course, such advanced automation comes at a price.
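To get a feel for this bookkeeping, you can peek at the session's pending and dirty sets before committing. The following is a minimal sketch that assumes a mapped User model and a configured session like the ones in the benchmark script below.
[python]
# A minimal sketch of the unit of work's change tracking; assumes a mapped
# User model and a configured session such as the ones defined below.
user = User(name='john')
session.add(user)
print(user in session.new)      # True: the pending INSERT is tracked
session.flush()                 # emits the INSERT, the object becomes persistent
user.name = 'johnny'
print(user in session.dirty)    # True: the pending UPDATE is tracked
session.commit()                # flushes remaining changes and commits
[/python]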
Since SQLAlchemy's ORM is not designed for bulk insertions, we can write an example to test its efficiency against raw SQL. Besides the ORM and raw SQL implementations of a bulk insertion test case, we also implement a version that uses SQLAlchemy's Core system. Since SQLAlchemy's Core is a thin layer of abstraction above raw SQL, we expect it to achieve a level of performance comparable to raw SQL.
[python]
import time
import sqlite3

from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

Base = declarative_base()
session = scoped_session(sessionmaker())


class User(Base):
    __tablename__ = "user"
    id = Column(Integer, primary_key=True)
    name = Column(String(255))


def init_db(dbname='sqlite:///example.db'):
    engine = create_engine(dbname, echo=False)
    session.remove()
    session.configure(bind=engine, autoflush=False, expire_on_commit=False)
    Base.metadata.drop_all(engine)
    Base.metadata.create_all(engine)
    return engine


def test_sqlalchemy_orm(number_of_records=100000):
    init_db()
    start = time.time()
    for i in range(number_of_records):
        user = User()
        user.name = 'NAME ' + str(i)
        session.add(user)
    session.commit()
    end = time.time()
    print "SQLAlchemy ORM: Insert {0} records in {1} seconds".format(
        str(number_of_records), str(end - start)
    )


def test_sqlalchemy_core(number_of_records=100000):
    engine = init_db()
    start = time.time()
    engine.execute(
        User.__table__.insert(),
        [{"name": "NAME " + str(i)} for i in range(number_of_records)]
    )
    end = time.time()
    print "SQLAlchemy Core: Insert {0} records in {1} seconds".format(
        str(number_of_records), str(end - start)
    )


def init_sqlite3(dbname="sqlite3.db"):
    conn = sqlite3.connect(dbname)
    cursor = conn.cursor()
    cursor.execute("DROP TABLE IF EXISTS user")
    cursor.execute("CREATE TABLE user (id INTEGER NOT NULL, name VARCHAR(255), PRIMARY KEY(id))")
    conn.commit()
    return conn


def test_sqlite3(number_of_records=100000):
    conn = init_sqlite3()
    cursor = conn.cursor()
    start = time.time()
    for i in range(number_of_records):
        cursor.execute("INSERT INTO user (name) VALUES (?)", ("NAME " + str(i),))
    conn.commit()
    end = time.time()
    print "sqlite3: Insert {0} records in {1} seconds".format(
        str(number_of_records), str(end - start)
    )


if __name__ == "__main__":
    test_sqlite3()
    test_sqlalchemy_core()
    test_sqlalchemy_orm()
[/python]
In the previous code, we compare the performance of bulk inserting 100000 user records into a sqlite3 database using raw SQL, SQLAlchemy's Core and SQLAlchemy's ORM. If you run the code, you will get an output similar to the following:
[shell]
$ python orm_performance_overhead.py
sqlite3: Insert 100000 records in 0.226176977158 seconds
SQLAlchemy Core: Insert 100000 records in 0.371157169342 seconds
SQLAlchemy ORM: Insert 100000 records in 10.1760079861 seconds
[/shell]
Notice that the Core and raw SQL achieve comparable insertion speeds, while the ORM is much slower than the other two. Although it looks like the ORM incurs a large performance overhead, keep in mind that the overhead becomes significant only when a large amount of data is inserted at once. Since most web applications perform small CRUD operations within a request-response cycle, it's preferable to use the ORM instead of the Core for the extra convenience and better maintainability.
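If you need bulk insertion but still want to stay at the ORM level, newer SQLAlchemy releases (1.0 and later) provide bulk helpers on the Session that skip most of the per-object bookkeeping. The following is a minimal sketch, assuming such a version and reusing init_db() and User from the benchmark above; it is not part of the measurements shown here.
[python]
# Minimal sketch; requires SQLAlchemy 1.0+ and reuses init_db() and User from above.
def test_sqlalchemy_bulk(number_of_records=100000):
    init_db()
    start = time.time()
    session.bulk_insert_mappings(
        User,
        [{"name": "NAME " + str(i)} for i in range(number_of_records)]
    )
    session.commit()
    end = time.time()
    print "SQLAlchemy bulk_insert_mappings: Insert {0} records in {1} seconds".format(
        str(number_of_records), str(end - start)
    )
[/python]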
SQLAlchemy and database permissions
So far, our examples have been working well with sqlite3 databases, which do not have fine-grained access control such as user and permission management. What if we want to use SQLAlchemy with MySQL or PostgreSQL? What happens when the user connected to the database does not have enough permission to create tables, indexes, etc.? Will SQLAlchemy throw a database access exception?
Let's use an example to test the behaviour of SQLAlchemy's ORM when a user does not have enough permissions. First, we create a testing database "test_sqlalchemy" and a testing user "sqlalchemy".
[shell]
$ psql
postgres=# create database test_sqlalchemy;
CREATE DATABASE
postgres=# create user sqlalchemy with password 'sqlalchemy';
CREATE ROLE
postgres=# grant all on database test_sqlalchemy to sqlalchemy;
GRANT
[/shell]
For now, the testing user "sqlalchemy" has all access privileges towards the testing database "test_sqlalchemy". Therefore, we expect the database initialization call to succeed and insert one record into the database "test_sqlalchemy".
[python]
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

Base = declarative_base()
session = scoped_session(sessionmaker())


class User(Base):
    __tablename__ = "user"
    id = Column(Integer, primary_key=True)
    name = Column(String(255))


def init_db(dbname):
    engine = create_engine(dbname, echo=False)
    session.configure(bind=engine, autoflush=False, expire_on_commit=False)
    Base.metadata.create_all(engine)
    return engine


if __name__ == "__main__":
    init_db("postgresql://sqlalchemy:sqlalchemy@localhost/test_sqlalchemy")
    u = User(name="other_user")
    session.add(u)
    session.commit()
    session.close()
[/python]
After executing the script, you can check that there is a new User
record in the "user" table.
[shell]
$ psql test_sqlalchemy
psql (9.3.3)
Type "help" for help.
test_sqlalchemy=# select * from "user";
id | name
----+------------
1 | other_user
[/shell]
Now suppose we take away the INSERT permission from the testing user "sqlalchemy". Then we expect that running the same code will fail with an exception.
[shell]
# inside a psql shell
test_sqlalchemy=# revoke INSERT on "user" from sqlalchemy;
REVOKE
# inside a bash shell
$ python permission_example.py
Traceback (most recent call last):
File "permission_example.py", line 32, in
session.commit()
File "/home/vagrant/pythoncentral/local/lib/python2.7/site-packages/sqlalchemy/orm/scoping.py", line 149, in do
return getattr(self.registry(), name)(*args, **kwargs)
File "/home/vagrant/pythoncentral/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 765, in commit
self.transaction.commit()
......
File "/home/vagrant/pythoncentral/local/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 425, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (ProgrammingError) permission denied for relation user
'INSERT INTO "user" (name) VALUES (%(name)s) RETURNING "user".id' {'name': 'other_user'}
[/shell]
As you can see, an exception was thrown indicating that we do not have the permission to insert records into the relation user.
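If you prefer to handle such failures gracefully rather than letting the script crash, you can catch the exception and roll the session back. A minimal sketch, reusing the User model and session from the script above:
[python]
from sqlalchemy.exc import ProgrammingError

u = User(name="other_user")
session.add(u)
try:
    session.commit()
except ProgrammingError as e:
    # e.g. "permission denied for relation user"; the session must be rolled
    # back before it can be used again.
    session.rollback()
    print "Could not insert the record: {0}".format(e)
[/python]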
SQLAlchemy's Schema Migration
There are at least two libraries available for performing SQLAlchemy migrations: migrate and alembic.
Since alembic was written by the author of SQLAlchemy and is actively developed, we recommend using it instead of migrate. Not only does alembic allow you to write migration scripts by hand, it also provides a way to auto-generate them. We will further explore how to use alembic in another article.
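To give a flavour of what an alembic migration looks like, here is a minimal, hypothetical revision script that adds an email column to the user table; the revision identifiers and the column are placeholders of the kind "alembic revision" generates for you.
[python]
"""add email column to user

Revision ID: 1a2b3c4d5e6f
Revises: None
"""

# Revision identifiers used by Alembic; these values are placeholders.
revision = '1a2b3c4d5e6f'
down_revision = None

from alembic import op
import sqlalchemy as sa


def upgrade():
    op.add_column('user', sa.Column('email', sa.String(255), nullable=True))


def downgrade():
    op.drop_column('user', 'email')
[/python]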
SQLAlchemy's Support for Triggers
SQL triggers can be created using custom DDL constructs and hooked into SQLAlchemy's events. Although this is not direct support for triggers, it is easy to implement and plug into any system. We will take a look at custom DDL and events in another article, but the sketch below shows the general idea.
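The snippet below uses SQLAlchemy's DDL construct together with the after_create event to attach a trigger to the user table from the earlier examples; the trigger body itself is purely illustrative and assumes an updated_at column that our model does not actually define.
[python]
from sqlalchemy import DDL, event

# Illustrative trigger only: it assumes an updated_at column exists on "user".
trigger = DDL("""
CREATE TRIGGER update_user_timestamp AFTER UPDATE ON user
BEGIN
    UPDATE user SET updated_at = CURRENT_TIMESTAMP WHERE id = NEW.id;
END;
""")

# Emit the trigger DDL right after the "user" table is created.
event.listen(User.__table__, 'after_create', trigger)
[/python]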
Tips and Summary
In this article, we answered a few common questions about SQLAlchemy from a SQL database admin's point of view. Although SQLAlchemy creates a database schema for you by default, it also allows you to reflect an existing schema and generate Table objects from it. There is a performance overhead when using SQLAlchemy's ORM, but it mostly shows up in bulk insertions, while most web applications perform relatively small CRUD operations. If your database user does not have enough permissions to perform certain actions on a table, SQLAlchemy will throw an exception that shows exactly why you cannot perform those actions. There are at least two migration libraries for SQLAlchemy, and alembic is highly recommended. Although triggers are not directly supported, you can easily write them in raw SQL and hook them up using custom DDL and SQLAlchemy events.