Procedures, or subroutines, always start the execution at the beginning and at some point return. Procedures can call other procedures, leading to a nested calling structure. In the diagram to the right, vertical lines show the lifetime of a procedure invocation: bars mean that the procedure is executing.
Coroutines generalize procedures by allowing control to be transferred out of a coroutine at some point and then resumed again at that point. Coroutines, like procedures, can have local variables. These are preserved when control is transferred out of a coroutine. When a coroutine is created, it starts execution at the beginning.
The transfer from one coroutine to another can take different forms:
- Symmetric: In coroutine
A
, statementtransfer B
will transfer control coroutineB
; all coroutines are equal,B
can transfer back toA
or transfer toC
, etc. The transfer structure is arbitrary. - Asymmetric: Coroutine
A
calls coroutineB
initially; inB
statementsuspend
will return control toA
; inA
, statementresume B
resumes execution inB
at the point it suspended; coroutineA
is like a caller andB
like a callee, except thatB
will resume where it left and its state is preserved when suspending.
Coroutines run concurrently but are scheduled cooperatively, meaning that there are explicit points when control is transferred. Only one coroutine is executed at a time. By contrast, threads (and processes) are scheduled preemptively, meaning that after a certain time, control is transferred. Threads (and processes) can run in parallel.
Python Generators¶
In Python, asymmetric coroutines are used as generators, which are functions that when suspending, additionally yield a value. The keyword yield
is used for suspending and the function next
is used for resuming and obtaining the next yielded value:
def fib():
a, b = 0, 1
while True:
yield a
a, b = b, a + b
f = fib()
f
The generator f
has it own state and runs concurrently with the main program:
next(f), next(f), next(f), next(f), next(f), next(f)
Generators can be used in for
statements for iterating over all generated elements. As fib()
generates arbitrarily many elements, this allows in principle iteration over an infinite sequence:
for x in fib():
print(x) # caution
Exercise: Modify fib
to take an additional integer parameter: fib(n)
yields only numbers that are less than n
and then terminates. This allows fib(n)
to be used like range(n)
in loops.
Generators can be used to avoid intermediate data structures to be constructed. Consider functions
unique(iterable)
, which takes aniterable
(list, tuple) and returns the elements in same order but without duplicates,filter(fn, iterable)
, which takes argumentfn
, a predicate, and returns the elements of the second argument,iterable
, that satisfyfn
.
def unique(iterable):
seen = set()
for e in iterable:
if e not in seen:
yield e
seen.add(e)
list(unique([1, 3, 4, 2, 1, 3]))
def filter(fn, iterable):
for e in iterable:
if fn(e):
yield e
def even(x):
return x % 2 == 0
list(filter(even, [1, 2, 3, 4, 5, 6]))
list(unique(filter(even, [0, 1, 0, 1, 2, 2, 0, 1])))
While we can think of filter
returning a list of even numbers, no such list is constructed in memory. Generator filter
and unique
are coroutines that run concurrently with the main program. (Note that Python has a built-in function filter
with similar functionality.)
Exercise: Implement unique
and filter
in Go as goroutines and compose them to achieve the same functionality as above. Note that while all even numbers are being transmitted over a channel, a sequence of all even numbers is not constructed in memory.
Goroutines¶
Goroutines are similar to symmetric coroutines in that control is transferred explicitly when sending and receiving. However, goroutines can be scheduled preemptively; if the goroutines do not access global variables, this is not observable. The current Go implementation avoids time slices when scheduling preemptively, instead inserts transfer
instructions at places where the code might take long, like in loops. Thus even if this appears to be preemptive scheduling, the Go implementation uses cooperative scheduling for efficiency.
Python async / await¶
Threads (and processes) are suitable for CPU-bound programs, as the computation can be spread among processors (cores). Coroutines are suitable for I/O-bound programs: as the programs mainly waits for different I/O actions, it can quickly switch among those. In Python
- a coroutine is declared with
async def
, await c
tranfers control to coroutinec
and resumes when control is transferred back,asyncio.run(c)
initiates coroutinec
,asyncio.gather(c0, c1, ...)
starts coroutinesc0, c1, ...
and waits for them to terminate.
Note: some features are only available in Python 3.7 or 3.8. Check with python3 -V
.
%%writefile asyncrequest.py
import asyncio
import time
async def request(i):
print("Request " + str(i))
await asyncio.sleep(1) # this could be an I/O operation
print("Done " + str(i))
async def main():
await asyncio.gather(request(0), request(1), request(2))
start = time.perf_counter()
asyncio.run(main())
elapsed = time.perf_counter() - start
print(f"executed in {elapsed:0.2f} seconds.")
!python3 asyncrequest.py
The asyncio
library has a class Queue
for communication between coroutines. This can be used to as a buffer between producers and consumers:
%%writefile asyncpc.py
import asyncio
import random
import time
async def makeitem(p: int) -> str:
return str(random.randint(0, 10)) + ' by ' + str(p)
async def produce(p: int, q: asyncio.Queue):
for _ in range(random.randint(0, 10)):
print("Producer " + str(p) + " sleeping")
await asyncio.sleep(random.randint(0, 3))
i = await makeitem(p)
await q.put(i)
print(i + " added")
async def consume(c: int, q: asyncio.Queue):
while True:
print("Consumer " + str(c) + " sleeping")
await asyncio.sleep(random.randint(0, 3))
i = await q.get()
print(i + " removed by " + str(c))
q.task_done()
async def main(np: int, nc: int): # number of producers, consumers
q = asyncio.Queue()
producers = [asyncio.create_task(produce(n, q)) for n in range(np)]
consumers = [asyncio.create_task(consume(n, q)) for n in range(nc)]
await asyncio.gather(*producers)
await q.join() # blocks until q is empty
for c in consumers:
c.cancel()
asyncio.run(main(3, 5))
!python3 asyncpc.py
Futures¶
A future a is the result of a function call that is not available when the function returns, but only later when it is needed. Thus execution of the function can be delayed; this is useful if the function is I/O bound and the result is not immediately needed.
While less common, futures can be expressed in Python with the concurrent.futures
library. Python allows to check if the function has termianted and the result is available:
from concurrent.futures import ThreadPoolExecutor
from time import sleep
def return_after_5_secs(message):
sleep(5)
return message
pool = ThreadPoolExecutor(3)
future = pool.submit(return_after_5_secs, ('hello'))
print(future.done())
sleep(5)
print(future.done())
print(future.result())