Notes on *Fluent Python: Clear, Concise, and Effective Programming* by Luciano Ramalho
Chapter 21. Asynchronous Programming
Table of Contents
- async constructs and the objects supporting them
  - includes other constructs enabled by the `async`/`await` keywords: async generator functions, async comprehensions, async genexps; these aren't tied to `asyncio`!
- async libraries like `asyncio`
What’s New in This Chapter #
A Few Definitions #
native coroutines
- defined only with `async def`
- delegation from coroutine to coroutine is done only with `await`; it is not required that a coroutine MUST delegate

classic coroutines
- actually a generator function that consumes data (data that is sent to it via `my_coro.send(data)` calls)
- can delegate to other classic coroutines using `yield from` (ref "Meaning of yield from")
- no longer supported by `asyncio`, and doesn't support the `await` keyword

generator-based coroutines
- a generator function decorated with `@types.coroutine`, which makes the generator compatible with the `await` keyword
- NOT supported by `asyncio`, but used in low-level code in other frameworks like Curio and Trio

async generator (function)
- a generator function defined with `async def` that uses `yield` in its body
- returns an async generator object that provides `__anext__`, a coroutine method to retrieve the next item
An asyncio Example: Probing Domains #
- async operations are interleaved \(\implies\) the total time is practically the same as the time for the single slowest DNS response, instead of the sum of the times of all responses.
- `loop.getaddrinfo()` is the async version of `socket.getaddrinfo()`; it returns a 5-part tuple of parameters to connect to the given address using a socket
- `asyncio.get_running_loop()` is designed to be used from within coroutines. If there is no running event loop, it raises a `RuntimeError`; the event loop should already have been started before execution reaches that point.
- `for coro in asyncio.as_completed(coros):` the `asyncio.as_completed(coros)` generator yields coroutines that return the results of the coros passed to it, in the order they are completed (not the order of submission), similar to `futures.as_completed`
- `await coro` is non-blocking because it's guarded by the `as_completed` above; if the coro raised an exception, it gets re-raised here
event loop:

- started using `asyncio.run()`
- IDIOM: for scripts, the common pattern is to make the `main` function a coroutine as well; the main coroutine is then driven with `asyncio.run()`
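The flow above can be sketched as a minimal, runnable script (the domain names and structure here are illustrative, not the book's exact listing):

```python
import asyncio
import socket

async def probe(domain: str) -> tuple[str, bool]:
    loop = asyncio.get_running_loop()  # only valid inside a running coroutine
    try:
        await loop.getaddrinfo(domain, None)  # async version of socket.getaddrinfo
    except socket.gaierror:
        return (domain, False)
    return (domain, True)

async def main() -> None:
    coros = [probe(name) for name in ('python.org', 'no-such-host.invalid')]
    for coro in asyncio.as_completed(coros):  # yields in completion order
        domain, found = await coro            # does not block: coro is done
        print(f'{"+" if found else " "} {domain}')

if __name__ == '__main__':
    asyncio.run(main())  # the main coroutine driven by asyncio.run
```

Total runtime is roughly the time of the single slowest DNS response, since the lookups are interleaved.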
Guido’s Trick to Read Asynchronous Code #
- squint and pretend the async and await keywords are not there. If you do that, you’ll realize that coroutines read like plain old sequential functions.
New Concept: Awaitable #
`await` expression:

- uses the `yield from` implementation with an extra step of validating its argument
- only accepts an awaitable
- `for` \(\rightarrow\) iterables, `await` \(\rightarrow\) awaitables

from asyncio, we typically work with these awaitables:

- a native coroutine object that we get by calling a native coroutine function, e.g. `coro()` where `coro` is the coroutine function
- an `asyncio.Task`, which we get when we pass a coroutine object to `asyncio.create_task()`
  - remember that `coro_obj = coro()`, so the overall call is usually `asyncio.create_task(one_coro())`; note the invocation of the native coroutine function
  - whether to keep a handle to the task depends on whether we need to use it (e.g. to cancel the task or wait for it)
- lower-level awaitables (something we might encounter if we work with lower-level abstractions):
  - an object with an `__await__` method that returns an iterator (e.g. `asyncio.Future`; by the way, `asyncio.Task` <: `asyncio.Future`)
  - objects written in other languages that use the Python/C API with a `tp_as_async.am_await` function, returning an iterator (similar to the `__await__` method)
  - soon to be deprecated: generator-based coroutine objects
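A minimal sketch of the two common awaitables, a coroutine object and a `Task` (`one_coro` is an illustrative name):

```python
import asyncio

async def one_coro() -> str:
    await asyncio.sleep(0)  # suspend once, yielding to the event loop
    return 'done'

async def main() -> None:
    # awaiting a native coroutine object directly
    print(await one_coro())
    # create_task takes the coroutine OBJECT returned by calling one_coro();
    # keep the handle if you may need to cancel the task or await it later
    task = asyncio.create_task(one_coro())
    print(await task)  # a Task is also awaitable

asyncio.run(main())
```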
Downloading with asyncio and HTTPX #
- `asyncio` directly supports TCP and UDP, without relying on external packages
- `res = await asyncio.gather(*to_do)`: here we pass the awaitables so they can be gathered after completion, giving us a list of results; the results come back in the order of submission of the coros
- `AsyncClient` is the async context manager used here: a context manager with async setup and teardown methods (KIV)
- in this snippet of the `get_flag` coroutine:

```python
async def get_flag(client: AsyncClient, cc: str) -> bytes:  # <4> needs the client to make the HTTP request
    url = f'{BASE_URL}/{cc}/{cc}.gif'.lower()
    resp = await client.get(url, timeout=6.1, follow_redirects=True)  # <5> get also returns a ClientResponse that is an async context manager; the network I/O is driven asynchronously by the asyncio event loop
    return resp.read()  # <6> the body is lazily fetched from the response object; this fully consumes the response body into memory
```

implicit delegation of coroutines via async context managers:
- the `get` method of an `httpx.AsyncClient` instance returns a `ClientResponse` object that is also an asynchronous context manager
- this is an awaitable that returns a `Response`
- by the way, a `Response` can also be used as a context manager when streaming! If it were, then `resp.read()` would be an I/O operation that might yield to the event loop again while draining the response body stream from the socket
- the `await` yields control flow to the event loop while the network I/O happens (DNS resolution, TCP connect, handshake, waiting for response headers); during that suspension, other tasks can run
- so by the end of point 5, `resp` is a proper `Response` object and not a coroutine; the connection is ready
- LANG_LIMITATION: however, asyncio does not provide an asynchronous filesystem API at this time, the way Node.js does
  - there is OS-level support for it (`io_uring` on Linux), but nothing in Python's stdlib/asyncio exposes it
The Secret of Native Coroutines: Humble Generators #
- classic vs native coroutines: the native ones don't rely on a visible `.send()` call or `yield` expressions
- mechanistic model for async programs and how they drive async libraries. Here we see how, in an async program:
  - a user's function starts the event loop, scheduling an initial coroutine with `asyncio.run`
  - each user coroutine drives the next with an `await` expression, which is when control flow is yielded to the next coroutine; this forms a channel that enables communication between a library like HTTPX and the event loop
  - the `await` chain eventually reaches a low-level awaitable, which returns a generator that the event loop can drive in response to events such as timers or network I/O. The low-level awaitables and generators at the end of these await chains are implemented deep in the libraries, are not part of their APIs, and may be Python/C extensions.
- `await` borrows most of its implementation from `yield from` (classic coroutines), which also makes `.send` calls to drive coroutines
- with functions like `asyncio.gather` and `asyncio.create_task`, you can start multiple concurrent `await` channels, enabling concurrent execution of multiple I/O operations driven by a single event loop, in a single thread
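A hand-driven sketch of the mechanism (the `Ready` class and values are illustrative): a low-level awaitable exposes `__await__` returning an iterator, and a native coroutine awaiting it can be driven with the same `.send()` calls an event loop would use.

```python
class Ready:
    """A toy low-level awaitable: __await__ returns an iterator."""
    def __init__(self, value):
        self.value = value
    def __await__(self):
        yield self          # suspend once, handing control to whoever drives us
        return self.value   # value delivered to the awaiting coroutine

async def compute():
    result = await Ready(42)  # delegates down the await chain to Ready.__await__
    return result + 1

# Drive the native coroutine by hand, the way an event loop would:
coro = compute()
suspended = coro.send(None)        # runs until the yield inside __await__
assert isinstance(suspended, Ready)
try:
    coro.send(None)                # resume; the coroutine finishes
except StopIteration as exc:
    print(exc.value)               # → 43
```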
The All-or-Nothing Problem #
- we had to replace I/O functions with their async versions so that they could be activated with `await` or `asyncio.create_task`
- if there is no choice, delegate to a separate thread/process: if you can't rewrite a blocking function as a coroutine, you should run it in a separate thread or process
Asynchronous Context Managers via async with #
- asynchronous context managers: objects implementing the `__aenter__` and `__aexit__` methods as coroutines
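A minimal class-based async context manager, just to show the protocol (the class name and returned value are illustrative):

```python
import asyncio

class AsyncResource:
    """Minimal asynchronous context manager."""
    async def __aenter__(self):
        await asyncio.sleep(0)   # stand-in for async setup I/O
        return 'resource'
    async def __aexit__(self, exc_type, exc, tb):
        await asyncio.sleep(0)   # stand-in for async teardown I/O
        return False             # don't swallow exceptions

async def main():
    async with AsyncResource() as res:
        print(res)

asyncio.run(main())
```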
Enhancing the asyncio Downloader #
- caution: `asyncio` vs `threading`: asyncio can send requests faster, so it is more likely to be suspected of a DDoS attack by the HTTP server
Using asyncio.as_completed and a Thread #
- the `asyncio.Semaphore` is used as an asynchronous context manager so that the program as a whole is not blocked; only this coroutine is suspended when the semaphore counter is zero
- notice how we delegate the file I/O in point 5 to a thread pool provided by `asyncio` via `asyncio.to_thread`; we just `await` it and yield control flow so other threads can carry on
Throttling Requests with a Semaphore #
throwback to the OS module in school: a semaphore is a numbered "mutex" \(\implies\) more flexibility than just a binary mutex lock.
we can share the semaphore between multiple coroutines with a configured max number in order to throttle our Network I/O
why? because we should avoid spamming a server with too many concurrent requests \(\implies\) we need to throttle the Network I/O
previously, we did the throttling in a coarse manner by setting `max_workers` for `download_many` in the demo code
Python’s Semaphores #
- all 3 concurrency packages (`threading`, `multiprocessing`, `asyncio`) have their own semaphore classes
- the initial value is set at the point of creating the semaphore, and the semaphore is passed to every coroutine that needs to rely on it to synchronize: `semaphore = asyncio.Semaphore(concur_req)`
- the semaphore counter decrements when we `await` the `.acquire()` coroutine, and increments when we call the `release()` method (non-blocking, not a coroutine)
- if not ready (count == 0), `.acquire()` suspends the awaiting coroutine until some other coroutine calls `.release()` on the same semaphore, thus incrementing the counter
- `asyncio.Semaphore` used as an async context manager: instead of calling `semaphore.acquire()` and `semaphore.release()` directly, we can rely on the async context manager to acquire (the `Semaphore.__aenter__` coroutine method awaits `.acquire()`) and release (the `Semaphore.__aexit__` coroutine method calls `.release()`)
- this guarantees that no more than `concur_req` instances of the `get_flag` coroutine will be active at any time
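A small self-contained sketch of semaphore throttling (the counters are instrumentation added here to make the limit observable; `fetch` stands in for a network call):

```python
import asyncio

async def fetch(sem: asyncio.Semaphore, counters: dict) -> None:
    async with sem:                        # __aenter__ awaits .acquire()
        counters['active'] += 1
        counters['peak'] = max(counters['peak'], counters['active'])
        await asyncio.sleep(0.01)          # simulate network I/O while holding a slot
        counters['active'] -= 1
        # leaving the block: __aexit__ calls .release()

async def main() -> int:
    sem = asyncio.Semaphore(3)             # at most 3 coroutines inside the block at once
    counters = {'active': 0, 'peak': 0}
    await asyncio.gather(*(fetch(sem, counters) for _ in range(10)))
    return counters['peak']

print(asyncio.run(main()))                 # → 3, never more than the initial value
```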
Making Multiple Requests for Each Download #
our objective now is to make 2 requests per country. In a sequential pattern, we would just make one call after the other; the async version isn't directly the same.
We can drive the asynchronous requests one after the other, sharing the local scope of the driving coroutine.
here’s the v3 using asyncio
some changes:

- `get_country` is a new coroutine for the .json fetch
- in `download_one`, we now use `await` to delegate to `get_flag` and the new `get_country`
- NOTE: points 1 & 2 in `download_one`: it's good practice to hold semaphores and locks for the shortest possible time
- one challenge is knowing when you have to use `await` and when you can't use it. The answer in principle is easy: you await coroutines and other awaitables, such as `asyncio.Task` instances. In reality, the APIs can be confusingly named, e.g. `StreamWriter`.
Delegating Tasks to Executors #
- problem: unlike Node.js, where ALL I/O has async APIs, Python doesn't have async APIs for all I/O. Notably, file I/O is NOT async.
- this means that in our async code, file I/O can severely bottleneck performance if the main thread is blocked
- delegating to an executor is a good idea then
- we can use `asyncio.to_thread`, e.g. `await asyncio.to_thread(save_flag, image, f'{cc}.gif')`
- under the hood, it uses `loop.run_in_executor`, so the equivalent of the statement above would be:

```python
loop = asyncio.get_running_loop()  # get a reference to the running event loop
loop.run_in_executor(None, save_flag, image, f'{cc}.gif')
```

- when using `run_in_executor`, the 1st argument is the Executor to use; `None` \(\implies\) default \(\implies\) `ThreadPoolExecutor` (always available in the asyncio event loop)
- CAUTION: this accepts positional args only; you have to use `functools.partial` if you wish to use kwargs, or just use the newer `asyncio.to_thread`, which accepts kwargs
- IDIOM: a common pattern in async APIs is to wrap blocking calls that are implementation details in coroutines using run_in_executor internally. That way, you provide a consistent interface of coroutines to be driven with await, and hide the threads you need to use for pragmatic reasons.
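A runnable sketch of this pattern (`slow_io` is a hypothetical stand-in for a blocking function such as `save_flag`):

```python
import asyncio
import time

def slow_io(path: str) -> str:
    """Blocking stand-in for file I/O."""
    time.sleep(0.05)
    return f'saved {path}'

async def save(path: str) -> str:
    # Delegate the blocking call to the default ThreadPoolExecutor;
    # equivalent to loop.run_in_executor(None, slow_io, path)
    return await asyncio.to_thread(slow_io, path)

async def main() -> list[str]:
    # the two blocking calls run concurrently in worker threads
    return await asyncio.gather(save('a.gif'), save('b.gif'))

print(asyncio.run(main()))  # → ['saved a.gif', 'saved b.gif']
```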
- `loop.run_in_executor`'s explicit `Executor` argument allows us to use a process-based approach for CPU-intensive tasks, so the work runs in a different Python process and we avoid GIL contention
- TRICK / IDIOM: prime the `ProcessPoolExecutor` in the `supervisor` and then pass it to the coroutines that need it, to reduce the effect of the high startup costs
- WARNING / LANG_LIMITATION: coroutines that use executors give only the pretense of cancellation, because the underlying thread/process has no cancellation mechanism
  - using `run_in_executor` can produce hard-to-debug problems, since cancellation doesn't work the way one might expect
  - for example, a long-lived thread created inside a `run_in_executor` call may prevent your asyncio program from shutting down cleanly: `asyncio.run` will wait for the executor to fully shut down before returning, and it will wait forever if the executor jobs don't stop somehow on their own
  - "My greybeard inclination is to want that function to be named `run_in_executor_uncancellable`."
Writing asyncio Servers #
A FastAPI Web Service #
- endpoint handlers can be coroutines or plain functions, as we see here
- there's no `main` function; the app is loaded and driven by the ASGI server (uvicorn)
- we don't have return type hints here because we let the pydantic schema do the job
  - this is like schema casting when defining changesets in Elixir
- the model is declared in the `response_model` parameter instead of as a function return type annotation, because the path function may not actually return that response model, but rather a dict, a database object, or some other model, and then use the `response_model` to perform the field limiting and serialization
- `response_model` in FastAPI + Pydantic plays the role of both serialization and field whitelisting: it takes arbitrary Python objects/dicts and produces clean, predictable outputs according to the model definition
- by the way, the demo also implements an inverted index for its lookups
An asyncio TCP Server (no deps, just asyncio streams) #
- this demo is one where we use plain TCP to communicate with a telnet/netcat client, using `asyncio` directly without any external dependencies!
- IDIOM at `finder`, point number 2: use `functools.partial` to bind that parameter and obtain a callable that takes the reader and writer. Adapting user functions to callback APIs is the most common use case for `functools.partial`.
- how multiple clients can be served at once: while the event loop is alive, a new instance of the `finder` coroutine is started for each client that connects to the server
- how the keyboard interrupt works:
  - the interrupt signal causes a `KeyboardInterrupt` exception to be raised from within `supervisor::server.serve_forever`; the event loop dies as well
  - this propagates out into the `main` function that had been driving the event loop
- GOTCHA: `StreamWriter.write` is not a coroutine; `StreamWriter.drain` is a coroutine
  - some of the I/O methods are coroutines and must be driven with await, while others are simple functions. For example, `StreamWriter.write` is a plain function, because it writes to a buffer. On the other hand, `StreamWriter.drain`, which flushes the buffer and performs the network I/O, is a coroutine, as is `StreamReader.readline`; but NOT `StreamWriter.writelines`!
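A self-contained sketch of the streams API showing the write/drain split (not the book's listing; the server here just uppercases one line, and acts as its own client so it can run end to end):

```python
import asyncio

async def handle(reader: asyncio.StreamReader,
                 writer: asyncio.StreamWriter) -> None:
    data = await reader.readline()  # StreamReader.readline is a coroutine
    writer.write(data.upper())      # write is a plain function: it only buffers
    await writer.drain()            # drain is a coroutine: flushes, does the I/O
    writer.close()
    await writer.wait_closed()

async def main() -> bytes:
    # one handle() task is spawned per client connection
    server = await asyncio.start_server(handle, '127.0.0.1', 0)  # port 0: any free port
    addr = server.sockets[0].getsockname()
    reader, writer = await asyncio.open_connection(*addr)  # act as our own client
    writer.write(b'hello\n')
    await writer.drain()
    reply = await reader.readline()
    writer.close()
    await writer.wait_closed()
    server.close()
    await server.wait_closed()
    return reply

print(asyncio.run(main()))  # → b'HELLO\n'
```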
Asynchronous Iteration and Asynchronous Iterables and using async for #
- `async with` \(\implies\) works with async context managers
- `async for` \(\implies\) works with asynchronous iterables: objects providing `__aiter__`, which returns an async iterator. BUT `__aiter__` is NOT a coroutine method; it's a regular method.
- an async iterator provides an `__anext__` coroutine method that returns an awaitable, usually a coroutine object. Just like its sync counterpart, it is expected to implement `__aiter__`, which trivially returns `self`.
- remember the same point about NOT mixing up iterables and iterators
example: `aiopg`, an async PostgreSQL driver:

```python
async def go():
    pool = await aiopg.create_pool(dsn)
    async with pool.acquire() as conn:
        async with conn.cursor() as cur:  # the cursor is the async iterator here
            await cur.execute("SELECT 1")
            ret = []
            async for row in cur:  # important NOT to block the event loop while the cursor may be waiting for additional rows
                ret.append(row)
            assert ret == [(1,)]
```

- by implementing the cursor as an asynchronous iterator, `aiopg` may yield to the event loop at each `__anext__` call, and resume later when more rows arrive from PostgreSQL
Asynchronous Generator Functions #
Implementing and Using an async generator #
Implementing an Async Iterator
- class implementation for an async iterator: implement a class with `__anext__` and `__aiter__`
- simpler way to implement an async iterator: as a generator function that is async \(\implies\) an async generator
  - write a function declared with `async def` and use `yield` in its body. This parallels how generator functions simplify the classic Iterator pattern.

Usage of async generators:

- async generators are driven by `async for`:
  - as a block statement
  - as async comprehensions
- we can't use typical for loops because async generators implement `__aiter__` and NOT `__iter__`
Demo example
```python
import asyncio
import socket
from collections.abc import Iterable, AsyncIterator
from typing import NamedTuple, Optional

class Result(NamedTuple):  # <1> convenience: easier to read and debug
    domain: str
    found: bool

OptionalLoop = Optional[asyncio.AbstractEventLoop]  # <2> type alias to clean up the hinting below

async def probe(domain: str, loop: OptionalLoop = None) -> Result:  # <3>
    if loop is None:  # no current event loop handle in scope
        loop = asyncio.get_running_loop()
    try:
        await loop.getaddrinfo(domain, None)
    except socket.gaierror:
        return Result(domain, False)
    return Result(domain, True)

async def multi_probe(domains: Iterable[str]) -> AsyncIterator[Result]:  # <4> an async generator function returns an async generator object, hence this return type
    loop = asyncio.get_running_loop()
    coros = [probe(domain, loop) for domain in domains]  # <5> list of probe coros
    for coro in asyncio.as_completed(coros):  # <6> a classic generator, so we drive it with `for`, not `async for`
        result = await coro  # <7> guarded by as_completed, so no worry that it will actually block
        yield result  # <8> this yield is what makes multi_probe an async generator
```
- the result is yielded by `multi_probe`, which is what makes `multi_probe` an async generator
- shortcut for the loop body:

```python
for coro in asyncio.as_completed(coros):
    yield await coro
```

- TRICK: the `.invalid` top-level domain is reserved for testing; see elaboration:
- `.invalid` is a special-use TLD reserved by the IETF in RFC 2606 (alongside `.test`, `.example`, and `.localhost`); it can never be delegated in the global DNS root zone, so names under it are guaranteed not to resolve
- this makes `.invalid` useful in tests and documentation for constructing domain names that are obviously and reliably invalid
Using the async generator:

```python
#!/usr/bin/env python3
import asyncio
import sys
from keyword import kwlist

from domainlib import multi_probe

async def main(tld: str) -> None:
    tld = tld.strip('.')
    names = (kw for kw in kwlist if len(kw) <= 4)  # <1>
    domains = (f'{name}.{tld}'.lower() for name in names)  # <2>
    print('FOUND\t\tNOT FOUND')  # <3>
    print('=====\t\t=========')
    async for domain, found in multi_probe(domains):  # <4> async iterate over the async generator
        indent = '' if found else '\t\t'  # <5>
        print(f'{indent}{domain}')

if __name__ == '__main__':
    if len(sys.argv) == 2:
        asyncio.run(main(sys.argv[1]))  # <6>
    else:
        print('Please provide a TLD.', f'Example: {sys.argv[0]} COM.BR')
```
Async generators as context managers #
Generators (sync and async versions) have one extra use unrelated to iteration: they can be made into context managers.
We can use the `@asynccontextmanager` decorator from the `contextlib` module, similar to its sync counterpart `@contextmanager`:

```python
import asyncio
from contextlib import asynccontextmanager

@asynccontextmanager
async def web_page(url):  # the decorated function has to be an async generator
    loop = asyncio.get_running_loop()
    data = await loop.run_in_executor(
        None, download_webpage, url)  # run in a separate thread in case this is a blocking function; keeps our event loop unblocked
    yield data  # this yield makes it an async generator
    await loop.run_in_executor(None, update_stats, url)

async with web_page('google.com') as data:
    process(data)
```

Outcome:
- similar to the sync version, all lines before the `yield` become the entry code, the `__aenter__` coroutine method of the async context manager built by the decorator; when control flow comes back, the value of `data` is bound to the `data` target variable associated with the context manager below
- all lines after the `yield` become the `__aexit__` coroutine method; another possibly blocking call is delegated to the thread executor
Asynchronous generators versus native coroutines #
Similarities
`async def` for both
Differences
- an async generator has a `yield` in its body; a native coroutine does not
- an async generator can ONLY have empty `return` statements, BUT a native coro may return a value other than `None`
- async generators are NOT awaitable; they are iterables, so they are driven by `async for` or async comprehensions
- meanwhile, native coros are awaitable, and therefore:
  - can be driven by `await` expressions
  - can be passed to `asyncio` functions that consume awaitables (e.g. `create_task`)
Async Comprehensions and Async Generator Expressions #
Async generator expressions #
Here’s how we can define and use one:
- an asynchronous generator expression can be defined anywhere in your program, but it can only be consumed inside a native coroutine or asynchronous generator function.
Async comprehensions #
- we can have the usual kinds of comprehensions done async! We just need to make sure we're within an async context, i.e. within an `async def` body or an async REPL console
- async listcomps: `result = [i async for i in aiter() if i % 2]`, which is actually similar to `asyncio.gather()`, just a little less flexible; the gather function allows us to do better exception handling
- async dictcomps: `{name: found async for name, found in multi_probe(names)}`
- async setcomps: `{name for name in names if (await probe(name)).found}`
  - the extra parentheses are needed because the attribute-access operator `.` has higher precedence than `await`
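A runnable sketch of an async listcomp over a toy async generator (`aiter_range` is an illustrative name, not a stdlib function):

```python
import asyncio
from collections.abc import AsyncIterator

async def aiter_range(n: int) -> AsyncIterator[int]:
    for i in range(n):
        await asyncio.sleep(0)   # yield to the event loop between items
        yield i

async def main() -> list[int]:
    # async listcomp: only usable inside a coroutine or async generator
    return [i async for i in aiter_range(6) if i % 2]

print(asyncio.run(main()))   # → [1, 3, 5]
```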
async Beyond asyncio: Curio #
- `async`/`await` constructs are library agnostic
- curio blogdom demo example:

```python
#!/usr/bin/env python3
from curio import run, TaskGroup
import curio.socket as socket
from keyword import kwlist

MAX_KEYWORD_LEN = 4

async def probe(domain: str) -> tuple[str, bool]:  # <1> no need to receive the event loop
    try:
        await socket.getaddrinfo(domain, None)  # <2> getaddrinfo is a top-level function of curio.socket, not a method of a loop object as in asyncio
    except socket.gaierror:
        return (domain, False)
    return (domain, True)

async def main() -> None:
    names = (kw for kw in kwlist if len(kw) <= MAX_KEYWORD_LEN)
    domains = (f'{name}.dev'.lower() for name in names)
    async with TaskGroup() as group:  # <3> core concept in curio: monitors and controls a group of tasks (coros)
        for domain in domains:
            await group.spawn(probe, domain)  # <4> spawn starts a coro, wrapped in a Task managed by this TaskGroup instance
        async for task in group:  # <5> yields tasks as they are completed, like as_completed
            domain, found = task.result
            mark = '+' if found else ' '
            print(f'{mark} {domain}')

if __name__ == '__main__':
    run(main())  # <6> sensible syntax
```

`TaskGroup`:
- Curio's `TaskGroup` is an asynchronous context manager that replaces several ad hoc APIs and coding patterns in `asyncio`
- above we saw how we can simply drive the group and get results in order of completion, analogous to `asyncio.as_completed`
- we can also gather them all easily:

```python
async with TaskGroup(wait=all) as g:
    await g.spawn(coro1)
    await g.spawn(coro2)
```

- `TaskGroup` as support for structured concurrency:
  - adds a constraint to concurrent programming: a group of async tasks should have a single entry point and a single exit point
  - as an asynchronous context manager, a `TaskGroup` ensures that all tasks spawned inside it are completed or cancelled, and any exceptions raised, upon exiting the enclosing block
  - analogous to how structured programming advised against the use of GOTO statements
- seems like asyncio has partial support for structured concurrency since 3.11, e.g. with TaskGroups…
Curio also provides a UniversalQueue that can be used to coordinate the work among threads, Curio coroutines, and asyncio coroutines.
Type Hinting Asynchronous Objects #
- the return type of a native coroutine == the type of the result it produces when you await it
  - different from annotations for classic coroutines, where it's the 3-parameter `Generator` type
- points about typing async objects:
  - all the async objects are covariant on the first type parameter, which is the type of the items yielded from these objects; this aligns with "producer"/output types being covariant
  - `AsyncGenerator` and `Coroutine` are contravariant on the second-to-last parameter: that parameter is the type of the values the caller *sends in*, and input types are contravariant
  - `AsyncGenerator` has no return type
    - when we saw `typing.Generator`, we realised we could return values by hacking `StopIteration(value)`; that's how generator-based classic coroutines were hacked out, which is why we could make generators operate as classic coroutines and support `yield from`
    - there is no such thing for `AsyncGenerator`: `AsyncGenerator` objects don't return values, and are completely separate from native coroutine objects, which are annotated with `typing.Coroutine`
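A small sketch of the two annotation shapes (`countdown` and `total` are illustrative names):

```python
import asyncio
from collections.abc import AsyncGenerator, Coroutine
from typing import Any

async def countdown(n: int) -> AsyncGenerator[int, None]:
    # AsyncGenerator[YieldType, SendType]: note there is NO return type parameter
    while n:
        yield n
        n -= 1

async def total() -> int:
    # a native coroutine is annotated with the type of its awaited result;
    # as an object, it is a Coroutine[Any, Any, int]
    return sum([n async for n in countdown(3)])

coro: Coroutine[Any, Any, int] = total()
print(asyncio.run(coro))   # → 6
```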
How Async Works and How It Doesn’t #
Running Circles Around Blocking Calls #
- I/O is damn slow compared to computation; if we do async in a disciplined manner, our servers can be high-performance
The Myth of I/O-Bound Systems #
there are “I/O bound functions” but no “I/O bound systems”
any nontrivial system will have CPU-bound functions, dealing with them is the key to success in async programming
Avoiding CPU-Bound Traps #
- should have performance regression tests
- important with async code, but also relevant to threaded Python code because of the GIL
- we should not wait to OBSERVE a slowdown (by then it's too late): the direct performance hit of bad patterns is often not humanly observable until it's too late
What to do if we see a CPU-hogging bottleneck: #
- delegate task to a python proc pool
- delegate task to external task queue
- avoid GIL constraints, rewrite code in Cython, C, Rust – anything that interfaces with the Python/C API
- choose to do nothing
Chapter Summary #
- don't block the event loop; delegate to a different processing unit (thread, process, task queue)