Concurrency & Async
*******************

.. currentmodule:: apsw

How SQLite concurrency works
----------------------------

Each connection has a lock (mutex) to protect the SQLite data
structure. It is acquired on a call into the connection, and released
on return from the call. The mutex can be acquired multiple times in
the same thread, allowing nested calls, but cannot be acquired in
another thread until the top level call in the original thread
completes.

This means you cannot get more concurrency per connection by using
additional threads, although SQLite can use threads internally
(``PRAGMA threads``) such as for sorting. You can get concurrency with
multiple connections. SQLite is inherently synchronous due to being
written in C and using the C stack.

How Python concurrency works
----------------------------

GIL (usual operation)
    A single lock (global interpreter lock) protects all Python data
    structures. It can only be held by one thread at a time. By
    default the active thread is switched every 5ms (200 times a
    second), with the operating system scheduler choosing which thread
    runs next.

    The GIL can be released by C code when not using Python data
    structures, allowing Python code in other threads to run. This is
    done during I/O and database operations etc.

Free threaded (Python 3.14+)
    Each Python data structure gets its own lock. C code has to
    acquire and release the locks on the individual Python objects
    being used, which allows code in other threads to run provided
    they are not using the same objects. The extra locking can result
    in around a 50% performance hit versus the single global lock in a
    single thread. The operating system scheduler can run all the
    threads at the same time.

Async
    :code:`async` is a language level concurrency mechanism,
    contrasted with the traditional library mechanism in
    :mod:`concurrent.futures`. It is done with the :code:`async` and
    :code:`await` keywords.
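    For example, a minimal coroutine (a sketch using only the standard
    library; the function and values are illustrative):

    .. code-block:: python

        import asyncio

        async def get_answer():
            # awaiting lets other tasks run, as real I/O would
            await asyncio.sleep(0.01)
            return 42

        print(asyncio.run(get_answer()))  # prints 42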
    An event loop does fine grained management of multiple tasks as
    their results become available, handling timeouts, cancellations,
    task groups etc, with the tasks cooperatively defining the points
    at which they can switch. There is typically a 50% hit to
    throughput, but latencies and time to complete are far more
    uniform. An async framework like :mod:`asyncio` or |trio| runs the
    event loop and chooses which code to run next.

How APSW works
--------------

GIL (usual operation)
    The GIL protects APSW Python objects. When making a SQLite call,
    the GIL is released and SQLite acquires its connection mutex. This
    means other Python code can run concurrently with SQLite
    connections. APSW is thread safe in that you can use any thread to
    make any call on any object, due to the GIL protection.

Free threaded
    APSW has not yet been updated (:issue:`568`). It requires
    extensive code changes to lock all APSW objects (connections,
    cursors, backups etc) as well as inputs and outputs such as lists
    for executemany.

    A free threaded build is available on PyPI, but loading the module
    will re-enable the GIL. You can use :code:`-X gil=0` when starting
    Python so that does not happen, but you can get crashes if you use
    the same objects and inputs concurrently across threads,
    especially if making concurrent modifications.

Async
    APSW fully supports async operation. This is done by running each
    connection in its own dedicated worker thread. Calls made in the
    event loop get forwarded to the worker thread, and the results can
    then be awaited.

    **All** callbacks such as user defined functions, virtual tables,
    various hooks, VFS etc can be async functions. The async function
    (coroutine) is forwarded from the worker thread back to the event
    loop, with the worker thread blocked until it gets a result back.

.. _async_usage:

Async usage
-----------

APSW async usage has been developed and tested with :mod:`asyncio`,
|trio|, and |anyio| with asyncio and trio event loops.
This includes cancellations and deadlines/timeouts. There is a
controller interface (described below) providing event loop
integration, or you can write or adapt your own if you have more
specialised needs or a different async framework.

Use :meth:`Connection.as_async` (a class method) to get an async
:class:`Connection`. Related objects like :class:`Cursor`,
:class:`Blob`, :class:`Backup`, :class:`Session` etc will also be
async.

.. code-block:: python

    db = await apsw.Connection.as_async("database.db")

    # note awaiting the db.execute call to get the cursor, and then
    # using async for to iterate
    async for name, price in await db.execute("SELECT name, price FROM ..."):
        print(f"{name=} {price=}")

There is no separate :code:`AsyncConnection` (or :code:`AsyncCursor`,
:code:`AsyncBlob` etc) class. The existing instances know if they are
in async mode or not, and behave appropriately. You can use
:attr:`Connection.is_async` to check. However, to make type checkers
and IDEs work better, the type stubs included with APSW have those
synthetic classes so it is clear when returned values are direct, or
need to be awaited.

You can use :meth:`Connection.async_run` to run your own functions in
the async Connection worker thread.

.. _anyio_note:

AnyIO note
==========

Version 4.11.0 (September 2025) or later is required for the APSW
provided controller due to the mechanisms for calling back from the
worker thread to the event loop. If you are using an earlier version
of anyio, then APSW will use the corresponding event loop controller
(:class:`~apsw.aio.AsyncIO` or :class:`~apsw.aio.Trio`). The main
effect is that you may see :class:`trio.TooSlowError` on timeouts with
a trio event loop. The APSW tests run using version 4 of the anyio
API.

.. _trio_note:

Trio note
=========

Version 0.20.0 (February 2022) or later is required for the APSW
provided controller due to the mechanisms for supporting
:mod:`contextvars`.
You will need a custom controller to work with earlier versions.

Attributes
==========

Some SQLite functions are provided in APSW as attributes such as
:attr:`Connection.authorizer`. For an async connection, you will need
to await the result.

.. code-block:: python

    auth = await connection.authorizer

To set them, you will need to use :code:`setattr` in the worker
thread.

.. code-block:: python

    await connection.async_run(setattr, connection, "authorizer", my_auth)

The type stubs will make this clear to your IDE and type checker.

API results
===========

Each API has an indicator of its behaviour in sync and async modes.
You can find out if an object is in async mode by checking its
corresponding :attr:`Connection.is_async`. The included type stubs
will show correct usage for your IDE and type checker.

.. _badge_async_sync:

|badge-async-sync| Sync only
!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Sync object
    A direct result will be provided.

Async object
    A :exc:`TypeError` will be raised. There may be an async
    equivalent such as :code:`aclose` instead of :code:`close`, or you
    may need :code:`async with` instead of :code:`with`, and
    :code:`async for` instead of :code:`for`.

.. _badge_async_async:

|badge-async-async| Async only
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Sync object
    A :exc:`TypeError` will be raised. Omit the leading :code:`async`
    and use plain :code:`with` and :code:`for`. There may be a sync
    equivalent without a leading :code:`a`.

Async object
    You will need to :code:`await` the result when ready.

.. _badge_async_dual:

|badge-async-dual| Sync / Async
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Sync object
    A direct result will be provided.

Async object
    You will need to :code:`await` the result when ready. When objects
    are returned like :class:`Cursor`, :class:`Blob`, :class:`Backup`
    etc, they will also be in async mode.

.. _badge_async_value:

|badge-async-value| Value
!!!!!!!!!!!!!!!!!!!!!!!!!

Sync object
    A direct result will be provided.

Async object
    A direct result will be provided. Do not :code:`await` it.

.. _badge_close:

|badge-close| Close
!!!!!!!!!!!!!!!!!!!
You can call close on sync **and** async objects. When you call close
on a :class:`Connection`, it will close all the corresponding objects
like :class:`Cursor`, :class:`Blob`, :class:`Session` etc. It is safe
to call :code:`close` and :code:`aclose` multiple times, even on
already closed objects.

Sync object
    :code:`close` closes this object in the foreground, releasing its
    held resources.

Async object
    :code:`aclose` closes this object in the background. You should
    :code:`await` calling :code:`aclose` to know when it has
    completed. :code:`close` will close it in the foreground, which
    could take some time.

Callbacks
=========

SQLite has numerous hooks and callbacks such as user defined
functions, hooks, virtual tables, and VFS. If you provide sync
versions they get called in the connection worker thread.

You can also provide an async callback/method. The async controller
will suspend execution in the worker thread, send the callback back to
the event loop, and resume execution on getting a result (or
exception).

.. warning:: DEADLOCK

    If your async callback makes a request back into the connection
    **and** awaits it, then you will get a deadlock. The connection
    cannot proceed until it gets a result, and the callback is waiting
    on the suspended connection.

Contextvars
===========

:mod:`contextvars` let you provide thread local and async context
specific values. It saves having to provide a parameter to every
function in a call chain, instead letting those that care reach out
and find the current value for their context.

contextvar values at the point of a query in the event loop are
propagated to their processing in the database worker thread, being
available to any callbacks, and are also propagated back to the event
loop if any callbacks are async.

.. warning:: Not a regular variable

    Each time you :meth:`~contextvars.ContextVar.set` a value, the
    previous value is saved behind the scenes so it can be restored.
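    For example (standard library only; ``request_id`` is an
    illustrative variable), keep the token returned by
    :meth:`~contextvars.ContextVar.set` and use it to restore the
    previous value:

    .. code-block:: python

        import contextvars

        request_id = contextvars.ContextVar("request_id", default=None)

        token = request_id.set("abc123")
        try:
            ...  # code that calls request_id.get() sees "abc123"
        finally:
            # restore the previous value instead of leaking it
            request_id.reset(token)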
    You will get ever increasing memory consumption if you do not
    :meth:`~contextvars.ContextVar.reset`.
    :func:`apsw.aio.contextvar_set` shows a convenient way of doing
    so.

The context is copied (a trivial internal operation) as it passes from
async to worker thread and back to async again. Setting a variable is
not visible to code earlier in the call chain. Use a dict or similar
as the value set by the initial code to provide somewhere all the code
in the chain can see and make changes.

Configuration
=============

Most configuration uses :mod:`contextvars`.

:attr:`apsw.async_cursor_prefetch`
    How many rows are fetched at once when iterating query results.

:attr:`apsw.aio.check_progress_steps`
    How frequently running SQLite queries check for cancellations and
    timeouts.

:attr:`apsw.aio.deadline`
    When SQLite queries or async callbacks should timeout. |trio| and
    |anyio| native timeouts are also supported.

:attr:`apsw.async_controller`
    Interface between the async framework and the worker thread.

:attr:`apsw.async_run_coro`
    How the worker thread runs an async callback back in the event
    loop.

Deadlines and Cancellation
==========================

The native cancellation of each framework is supported. This is often
used to cancel all tasks in a group if one fails, and to support
timeouts/deadlines.

An example usage of deadlines is if you use a function or virtual
table that makes network requests. When executing a query you can
ensure reasonable bounds for how long it takes, bounding the internal
functions and virtual tables used to answer the query.

You can set a deadline by which an API request must timeout if not
completed. This includes sync and async callbacks that are made to
satisfy the request. The deadline is captured at the point the call is
made, and subsequent changes are not observed. You can use
:attr:`apsw.aio.deadline` to set the deadline - its documentation
provides more details.

Trio and anyio have timeout managers.
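As a sketch of a framework native timeout (standard library
:mod:`asyncio` shown here; the functions are illustrative, and
:func:`trio.fail_after` and :func:`anyio.fail_after` play the
equivalent role in those frameworks):

.. code-block:: python

    import asyncio

    async def slow_query():
        # stands in for a database call that takes too long
        await asyncio.sleep(10)

    async def main():
        try:
            async with asyncio.timeout(0.05):  # Python 3.11+
                await slow_query()
        except TimeoutError:
            return "timed out"

    print(asyncio.run(main()))  # prints "timed out"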
If :attr:`apsw.aio.deadline` is not set, then their
:code:`current_effective_deadline` is used.

* :func:`trio.fail_at`, :func:`trio.fail_after`,
  :func:`trio.current_effective_deadline`
* :func:`anyio.fail_after`, :func:`anyio.current_effective_deadline`

Async controllers
=================

A controller configured via :attr:`apsw.async_controller` is used to
integrate with the async framework. :mod:`apsw.aio` contains
implementations for :mod:`asyncio`, |trio|, |anyio|, and
auto-detection (the default).

The controller is responsible for:

* Starting the worker thread
* Configuring the connection in the worker thread
* Sending calls from the event loop to the worker thread with
  awaitable results
* Checking deadlines and cancellations
* Forwarding coroutines to the event loop, and providing their results
* Stopping the worker thread when told about database close

Although it seems like a lot, a controller is around 50 lines of code,
conforming to the :class:`AsyncConnectionController` protocol.

Run in thread (alternative)
===========================

Instead of using APSW in async mode, you can request your framework
run expensive operations in a thread. For example
:func:`asyncio.to_thread`, :func:`trio.to_thread.run_sync` and
:func:`anyio.to_thread.run_sync` can do that for you.

Extensions
==========

Session
    You should use :func:`apsw.aio.make_session` to make a
    :class:`Session` in async mode.

:class:`apsw.fts5.Table`
    Virtually every method and property needs to access the database.
    Therefore you should run all of them in the database thread using
    :meth:`Connection.async_run`.

    .. code-block:: python

        db = await apsw.Connection.as_async("my.db")

        # creating a table
        table = await db.async_run(
            # table creation method
            apsw.fts5.Table.create,
            # the various parameters
            db,
            "search",
            content="recipes",
            columns=...
        )

        # loading existing table
        table = await db.async_run(
            # load method
            apsw.fts5.Table,
            # parameters
            db,
            "search",
        )

        # row count (attribute)
        row_count = await db.async_run(getattr, table, "row_count")

        # query suggestions (method)
        suggest = await db.async_run(table.query_suggest, query)

apsw.aio module
---------------

.. automodule:: apsw.aio
    :members:
    :undoc-members:
    :member-order: bysource

Async Performance
-----------------

Performance is dominated by the overhead of sending calls to the
worker thread, and getting the result. :source:`tools/aio_bench.py` is
a small benchmark that keeps reading rows from a dummy memory
database, and then appending 1,000 more rows to the end of the table,
until there are 300,000 rows in the table.

**Benchmarks aren't real - use your own scenario for real testing!**

Library
    apsw with :mod:`asyncio`, asyncio using |uvloop| as the inner
    loop, |trio|, and |anyio| with asyncio and trio event loops. The
    |aiosqlite| library (asyncio only) is included for comparison,
    which also sends calls to a worker thread. Note that it doesn't
    support cancellation, timeouts, or async callbacks.

Prefetch
    How many rows are fetched in a batch for queries, controlled by
    :attr:`apsw.async_cursor_prefetch` in APSW and
    :code:`iter_chunk_size` in |aiosqlite|. A value of 1, as shown in
    the first rows, ends up as 301 thousand messages and responses
    with the worker thread; that is halved with 2, etc. The default is
    64. The benchmark queries return a maximum of 1,000 rows.

Wall
    Wall clock time in seconds for the configuration to run.

CpuTotal / CpuEvtLoop / CpuDbWorker
    The total CPU time used in seconds, with how much of that was in
    the async event loop thread, and how much in the background
    database worker thread.

The results show that the library used only matters if you are doing
very large numbers of calls because of very small row batch sizes.

.. csv-table:: Benchmark Results
    :widths: auto
    :stub-columns: 1
    :header: "Library", "Prefetch", "Wall", "CpuTotal", "CpuEvtLoop", "CpuDbWorker"
    :class: aiobench-table

    apsw AsyncIO,1,7.721,8.010,3.168,4.841
    apsw AsyncIO uvloop,1,5.064,5.256,1.521,3.735
    apsw Trio,1,12.029,13.308,7.805,5.504
    apsw AnyIO asyncio,1,15.202,18.158,7.660,10.499
    apsw AnyIO asyncio uvloop,1,9.652,11.183,3.766,7.416
    apsw AnyIO trio,1,16.670,19.597,10.028,9.570
    aiosqlite,1,8.360,8.755,3.909,4.846
    aiosqlite uvloop,1,5.358,5.635,1.867,3.769
    apsw AsyncIO,2,4.287,4.441,1.684,2.758
    apsw AsyncIO uvloop,2,2.895,2.987,0.797,2.190
    apsw Trio,2,6.403,7.022,3.936,3.086
    apsw AnyIO asyncio,2,8.175,9.702,3.919,5.783
    apsw AnyIO asyncio uvloop,2,4.966,5.704,1.810,3.894
    apsw AnyIO trio,2,8.683,10.140,5.081,5.059
    aiosqlite,2,4.885,5.106,2.180,2.926
    aiosqlite uvloop,2,3.288,3.454,1.129,2.326
    apsw AsyncIO,16,1.231,1.271,0.324,0.947
    apsw AsyncIO uvloop,16,0.977,0.993,0.162,0.831
    apsw Trio,16,1.546,1.667,0.631,1.036
    apsw AnyIO asyncio,16,1.671,1.877,0.570,1.306
    apsw AnyIO asyncio uvloop,16,1.255,1.359,0.286,1.074
    apsw AnyIO trio,16,1.776,1.990,0.738,1.253
    aiosqlite,16,1.152,1.191,0.382,0.809
    aiosqlite uvloop,16,0.868,0.888,0.191,0.697
    apsw AsyncIO,64,0.847,0.868,0.145,0.723
    apsw AsyncIO uvloop,64,0.740,0.744,0.072,0.671
    apsw Trio,64,0.883,0.922,0.207,0.715
    apsw AnyIO asyncio,64,0.984,1.044,0.212,0.832
    apsw AnyIO asyncio uvloop,64,0.835,0.865,0.124,0.741
    apsw AnyIO trio,64,0.983,1.056,0.249,0.807
    aiosqlite,64,0.718,0.729,0.153,0.575
    aiosqlite uvloop,64,0.619,0.623,0.102,0.521
    apsw AsyncIO,512,0.702,0.715,0.071,0.644
    apsw AsyncIO uvloop,512,0.668,0.668,0.053,0.615
    apsw Trio,512,0.706,0.724,0.101,0.623
    apsw AnyIO asyncio,512,0.746,0.767,0.090,0.676
    apsw AnyIO asyncio uvloop,512,0.694,0.703,0.068,0.636
    apsw AnyIO trio,512,0.736,0.762,0.105,0.657
    aiosqlite,512,0.583,0.586,0.098,0.488
    aiosqlite uvloop,512,0.561,0.563,0.084,0.479
    apsw AsyncIO,"8,192",0.670,0.681,0.065,0.616
    apsw AsyncIO uvloop,"8,192",0.652,0.653,0.042,0.610
    apsw Trio,"8,192",0.703,0.720,0.089,0.630
    apsw AnyIO asyncio,"8,192",0.711,0.724,0.078,0.646
    apsw AnyIO asyncio uvloop,"8,192",0.697,0.707,0.064,0.643
    apsw AnyIO trio,"8,192",0.733,0.761,0.098,0.663
    aiosqlite,"8,192",0.564,0.566,0.091,0.475
    aiosqlite uvloop,"8,192",0.542,0.543,0.076,0.467
    apsw AsyncIO,"65,536",0.676,0.686,0.066,0.621
    apsw AsyncIO uvloop,"65,536",0.658,0.659,0.051,0.608
    apsw Trio,"65,536",0.700,0.713,0.093,0.620
    apsw AnyIO asyncio,"65,536",0.724,0.738,0.090,0.648
    apsw AnyIO asyncio uvloop,"65,536",0.698,0.706,0.072,0.634
    apsw AnyIO trio,"65,536",0.743,0.769,0.115,0.654
    aiosqlite,"65,536",0.560,0.562,0.092,0.471
    aiosqlite uvloop,"65,536",0.542,0.543,0.073,0.470