Tips

There are also specific tips in each documentation section, and for many of the classes, functions, and attributes.

SQLite is different

While SQLite provides a SQL database like many others out there, it is also unique in many ways. Read about its unique features and quirks at the SQLite website.

Tip

Using APSW best practice is recommended to get best performance and avoid common mistakes.

Types

SQLite has 5 storage types.

SQLite	Python
NULL	`None`
Text (limit 1GB when encoded as bytes)	`str`
Integer (Signed 64 bit)	`int`
Float (IEEE754 64 bit)	`float`
BLOB (binary data, limit 1GB)	`bytes` and similar such as `bytearray` and `array.array`. Anything that implements the buffer protocol is accepted.

Dates and times do not have a dedicated storage type, but do have a variety of functions for creating, manipulating, and storing them. JSON does not have a dedicated storage type, but does have a variety of functions for creating, manipulating, and storing JSON.
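As a rough illustration of the paragraph above, the following sketch calls a date function and a JSON function and checks the storage type of what comes back. It uses Python's standard library sqlite3 module rather than APSW, only so that it runs without extra packages; the SQL functions belong to SQLite itself. The helper name one is made up for this example, and the sketch assumes the SQLite library in use includes the JSON functions.

```python
import sqlite3

# Standard library sqlite3 is used only so the sketch runs without APSW;
# the SQL functions are provided by SQLite itself.
db = sqlite3.connect(":memory:")


def one(sql, bindings=()):
    "Hypothetical helper: return the first column of the first row"
    return db.execute(sql, bindings).fetchone()[0]


# Date functions accept and return ordinary text
next_day = one("SELECT date(?, '+1 day')", ("2024-02-28",))
date_type = one("SELECT typeof(date(?))", ("2024-02-28",))

# JSON functions operate on text too
document = '{"items": [10, 20, 30]}'
second_item = one("SELECT json_extract(?, '$.items[1]')", (document,))
json_type = one("SELECT typeof(json(?))", (document,))

print(next_day, date_type, second_item, json_type)
```

Both typeof results are text, which is one of the 5 storage types.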

APSW provides optional type conversion, but the underlying storage will always be one of the 5 storage types.

If a column declaration gives a type then SQLite attempts conversion.

connection.execute("""
    create table types1(a, b, c, d, e);
    create table types2(a INTEGER, b REAL, c TEXT, d, e BLOB);
    """)

data = ("12", 3, 4, 5.5, b"\x03\x72\xf4\x00\x9e")
connection.execute("insert into types1 values(?,?,?,?,?)", data)
connection.execute("insert into types2 values(?,?,?,?,?)", data)

for row in connection.execute("select * from types1"):
    print("types1", repr(row))

for row in connection.execute("select * from types2"):
    print("types2", repr(row))

types1 ('12', 3, 4, 5.5, b'\x03r\xf4\x00\x9e')
types2 (12, 3.0, '4', 5.5, b'\x03r\xf4\x00\x9e')

Runtime Python objects

While SQLite only stores 5 types, it is possible to pass Python objects into SQLite, operate on them with your own functions (including window and aggregate functions), and return them in results.

This is done by wrapping the value in apsw.pyobject() when supplying it in a binding or function result. See the example.

It saves having to convert working objects into SQLite compatible ones and back again. It is very useful if you work with numpy. Any attempt to save the objects to the database or provide them to SQLite provided functions results in them being seen as null.

Behind the scenes the pointer passing interface is used.

Transactions

A transaction is a group of changes applied to a database file as a whole. The changes either happen completely, or not at all. SQLite records all the changes made during a transaction, and makes them permanent in the database when you commit at the end. If you do not commit, or just exit, then other/new connections will not see the changes and SQLite handles tidying up the work in progress automatically.

Committing a transaction can be quite time consuming. SQLite uses a robust multi-step process that has to handle errors that can occur at any point, and asks the operating system to ensure that data is on storage and would survive a power cycle. This will limit the rate at which you can do transactions.

If you do nothing, then each statement is a single transaction:

# this will be 3 separate transactions
db.execute("INSERT ...")
db.execute("INSERT ...")
db.execute("INSERT ...")

You can use BEGIN/COMMIT to set the transaction boundary:

# this will be one transaction
db.execute("BEGIN")
db.execute("INSERT ...")
db.execute("INSERT ...")
db.execute("INSERT ...")
db.execute("COMMIT")

However that is extra effort, and also requires error handling. For example if the second INSERT failed then you likely want to ROLLBACK the incomplete transaction, so that additional work on the same connection doesn’t see the partial data.
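The error handling described above could look something like the following sketch. It uses the standard library sqlite3 module in autocommit mode so that it runs without APSW installed. With APSW the exception caught would be a subclass of apsw.Error instead. The items table and its UNIQUE constraint are invented to force the second INSERT to fail.

```python
import sqlite3

# isolation_level=None stops the sqlite3 module from opening transactions
# itself, so the BEGIN/COMMIT/ROLLBACK below are exactly what SQLite sees.
db = sqlite3.connect(":memory:", isolation_level=None)
db.execute("CREATE TABLE items(name TEXT UNIQUE)")

db.execute("BEGIN")
try:
    db.execute("INSERT INTO items VALUES('apple')")
    db.execute("INSERT INTO items VALUES('apple')")  # violates UNIQUE
    db.execute("INSERT INTO items VALUES('pear')")
    db.execute("COMMIT")
except sqlite3.IntegrityError:
    # Discard the partial work so later queries on this connection
    # do not see the first insert
    db.execute("ROLLBACK")

remaining = db.execute("SELECT count(*) FROM items").fetchone()[0]
print(remaining)
```

Without the ROLLBACK the transaction would stay open, and the first row would remain visible to further work on the same connection.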

If you use with Connection then the transaction will be automatically started, and committed on success or rolled back if exceptions occur:

# this will be one transaction with automatic commit and rollback
with db:
    db.execute("INSERT ...")
    db.execute("INSERT ...")
    db.execute("INSERT ...")

There are technical details at the SQLite site.

Queries

SQLite only calculates each result row as you request it. For example if your query returns 10 million rows, SQLite will not calculate all 10 million up front. Instead the next row will be calculated as you ask for it. You can use Cursor.fetchall() to get all the results.

Cursors on the same Connection are not isolated from each other. Anything done on one cursor is immediately visible to all other cursors on the same connection. This still applies if you start transactions. Connections are isolated from each other.

Connection.execute() and Connection.executemany() automatically obtain cursors from Connection.cursor() which are very cheap. It is best practice to not re-use them, and instead get a new one each time. If you don’t, code refactoring and nested loops can unintentionally use the same cursor object which will not crash but will cause hard-to-diagnose behaviour in your program.

Bindings

When issuing a query, always use bindings. String interpolation may seem more convenient but you will encounter difficulties. You may feel that you have complete control over all data accessed but if your code is at all useful then you will find it being used more and more widely. The computer will always be better than you at parsing SQL and the bad guys have years of experience finding and using SQL injection attacks in ways you never even thought possible.

The tour shows why you use bindings, and the different ways you can supply them.
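A small sketch of the difference, using the standard library sqlite3 module so it runs without APSW (the ? binding syntax is the same). The people table and the sample name are made up. Here the interpolated statement merely fails; a hostile value could instead change what the statement does.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE people(name)")

name = "O'Brien"  # an ordinary value that happens to contain a quote

# String interpolation: the quote ends the SQL string literal early,
# leaving SQL text that cannot be parsed
try:
    db.execute(f"INSERT INTO people VALUES('{name}')")
    interpolation_failed = False
except sqlite3.OperationalError:
    interpolation_failed = True

# Bindings: the value never becomes part of the SQL text
db.execute("INSERT INTO people VALUES(?)", (name,))
stored = db.execute("SELECT name FROM people").fetchall()

print(interpolation_failed, stored)
```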

Query Patterns

These are suggestions on how to structure your Python code for processing queries.

Zero or more rows expected

Use a for loop. Note that nothing enforces that the Python variables match the columns inside the SQL.

for name, quantity, status in db.execute("SELECT name, quantity, status FROM ..."):
  # do something with the row
  ...

You can use apsw.ext.DataClassRowFactory to get the row as dataclasses. It is strongly recommended that you provide the SQL level names using AS since there is no guarantee what the names will be otherwise.

import apsw.ext

# This affects all queries on db, but not get.  It can be set on
# a cursor to only affect that cursor.
db.row_trace = apsw.ext.DataClassRowFactory()

for row in db.execute("SELECT cat.name AS name, orders.quantity AS quantity FROM ..."):
  # You can access row names
  print(f"{row.name=} {row.quantity=}")
  # You will get an Exception with a wrong name
  print(row.status)

One value expected

Use get which will return the value or None if there was no match.

name = db.execute("SELECT name FROM ... WHERE id=?", (item_id,)).get

One row expected

get can be used. There will be an exception if no row was found because None can’t be unpacked into the variables.

name, status = db.execute("SELECT name, status FROM ... WHERE id=?", (item_id,)).get

match can handle None when the row was not found.

match db.execute("SELECT name, status FROM ... WHERE id=?", (item_id,)).get:
  case None:
    # handle missing
    raise NotFound(...)
  case name, status:
    # do something
    ...

Diagnostics

Both SQLite and APSW provide detailed diagnostic information. Errors will be signalled via an exception.

APSW ensures you have detailed information both in the stack trace as well as what data APSW/SQLite was operating on.

SQLite has a warning/error logging facility. Use best practice to forward SQLite log messages to Python’s logging.

Managing and updating your schema

If your program uses SQLite for data then you’ll need to manage and update your schema. The hard way of doing this is to test for the existence of tables and their columns, and do that maintenance programmatically. The easy way is to use pragma user_version as in this example, where each number handles the changes needed:

def ensure_schema(db):
  # a new database starts at user_version 0
  if db.pragma("user_version") == 0:
    with db:
      db.execute("""
        CREATE TABLE foo(x,y,z);
        CREATE TABLE bar(x,y,z);
        PRAGMA user_version = 1;""")

  if db.pragma("user_version") == 1:
    with db:
      db.execute("""
      CREATE TABLE baz(x,y,z);
      CREATE INDEX ....
      PRAGMA user_version = 2;""")

  if db.pragma("user_version") == 2:
    with db:
      db.execute("""
      ALTER TABLE .....
      PRAGMA user_version = 3;""")

This approach will automatically upgrade the schema as you expect. You can also use pragma application_id to mark the database as made by your application.

Parsing SQL

Sometimes you want to know what a particular SQL statement does. Use apsw.ext.query_info() which will provide as much detail as you need.

Busy handling

SQLite uses locks to coordinate access to the database by multiple connections (within the same process or in a different process). The general goal is to have the locks be as lax as possible (allowing concurrency) and when using more restrictive locks to keep them for as short a time as possible. See the SQLite documentation for more details.

By default you will get an immediate BusyError if a lock cannot be acquired. Use best practice which sets a short waiting period, as well as enabling WAL which reduces contention between readers and writers.

Database schema

When starting a new database, it can be quite difficult to decide what tables and columns to have and how to link them. The technique used to design SQL schemas is called normalization. The page also shows common pitfalls if you do not normalize your schema.

Write Ahead Logging

SQLite has write ahead logging which has several benefits, but also some drawbacks as the page documents. WAL mode is off by default. Use best practice to automatically enable it for all connections.

Note that if WAL mode can’t be set (e.g. the database is in memory or temporary) then the attempt to set WAL mode will be ignored. It is also harmless to call functions like Connection.wal_autocheckpoint() on connections that are not in WAL mode.
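The following sketch shows the attempt being ignored for a memory database, and taking effect for a file. It uses the standard library sqlite3 module so it runs without APSW; the pragma is handled by SQLite itself. The helper name request_wal and the filename are invented for the example, and it assumes the temporary folder is on a filesystem where WAL works.

```python
import os
import sqlite3
import tempfile


def request_wal(db):
    "Hypothetical helper: ask for WAL, return the journal mode actually in use"
    return db.execute("PRAGMA journal_mode=wal").fetchone()[0]


# A memory database keeps its own journal mode and no error is raised
memory_db = sqlite3.connect(":memory:")
memory_mode = request_wal(memory_db)

# A database in a regular file switches to WAL
with tempfile.TemporaryDirectory() as folder:
    file_db = sqlite3.connect(os.path.join(folder, "example.db"))
    file_mode = request_wal(file_db)
    file_db.close()  # close before the folder is removed

print(memory_mode, file_mode)
```

The pragma returns the mode in effect after the request, so comparing the result against wal is a simple way to find out whether the change happened.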

If you write your own VFS, then inheriting from an existing VFS that supports WAL will make your VFS support the extra WAL methods too.

Customizing Connections

apsw.connection_hooks is a list of callbacks for when each Connection is created. They are called in turn, with the new connection as the only parameter.

For example if you wanted to add an executescript method to Connections that is like Connection.execute() but ignores all returned rows:

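import types

import apsw

def executescript(self, sql, bindings=None):
  # run the SQL, discarding any returned rows
  for _ in self.execute(sql, bindings):
    pass

def my_hook(connection):
  # bind as a method so that self is the connection when called
  connection.executescript = types.MethodType(executescript, connection)

apsw.connection_hooks.append(my_hook)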

Customizing Cursors

You can customize the behaviour of cursors. An example would be wanting a rowcount or batching returned rows. (These don’t make any sense with SQLite but the desire may be to make the code source compatible with other database drivers).

Set Connection.cursor_factory to any callable, which will be called with the connection as the only parameter, and return the object to use as a cursor.

URI names

SQLite allows URI filenames where you can provide additional parameters at the time of open for a database. Opens can include the SQLITE_OPEN_URI flag, which will also apply to ATTACH on that connection.

You should use urllib.parse to correctly create strings handling the necessary special characters and quoting.

import urllib.parse

uri_filename = urllib.parse.quote("my db filename.sqlite3")

uri_parameters = urllib.parse.urlencode(
  {
      "vfs": "memdb",
      "go": "fast",
      "level": 42,
  }
)

uri = f"file:{uri_filename}?{uri_parameters}"
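To see why the quoting matters, here is a sketch with a deliberately awkward filename. The make_uri helper and the filename are made up for this example; mode and cache are query parameters that SQLite recognises. Splitting the result again shows that nothing was lost or misinterpreted.

```python
import urllib.parse


def make_uri(filename, **parameters):
    "Hypothetical helper: quote the filename and encode the parameters"
    uri = "file:" + urllib.parse.quote(filename)
    if parameters:
        uri += "?" + urllib.parse.urlencode(parameters)
    return uri


# Space, &, ? and # would all be misread if placed in a URI unquoted
awkward = "data & logs/a?b#c.db"
uri = make_uri(awkward, mode="ro", cache="private")
print(uri)

# Splitting the URI again recovers the original filename and parameters
parts = urllib.parse.urlsplit(uri)
recovered_name = urllib.parse.unquote(parts.path)
recovered_parameters = dict(urllib.parse.parse_qsl(parts.query))
```

Without quoting, the ? in the filename would start the parameters early and the # would begin a fragment.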

Memory databases

You can get an in-memory only database by using a filename of :memory: and a temporary disk backed database with a name of an empty string. (Note that shared cache won’t work.)

SQLite has a (currently undocumented) VFS that allows the same connection to have multiple distinct memory databases, and for separate connections to share a memory database.

Use the name memdb as the VFS. If the filename provided starts with a / then it is shared amongst connections, otherwise it is private to the connection.

# normal opens
connection = apsw.Connection("/shared", vfs="memdb")
connection = apsw.Connection("not-shared", vfs="memdb")

# using URI
connection = apsw.Connection("file:/shared?vfs=memdb",
                flags=apsw.SQLITE_OPEN_URI | apsw.SQLITE_OPEN_READWRITE)
connection = apsw.Connection("file:not-shared?vfs=memdb",
                flags=apsw.SQLITE_OPEN_URI | apsw.SQLITE_OPEN_READWRITE)