COROUTINE is a C++ library for coroutine sequencing, which Keld Helsgaun published in May 1999.
Coroutine primitives & state
The facilities of the library are based on the coroutine primitives provided by the programming language SIMULA.
Resume(Coroutine *Next): resume the execution of a coroutine
Call(Coroutine *Next): start the execution of a coroutine
The execution procedure of
Callare the same: 1. suspending the current thread, 2. executing(start/resume) the target thread. The only difference between them is that the Call will Establish a relationship between caller and callee, whereas the Resume doesn’t.
Detach(): suspending/terminating the current coroutine and resume the caller of current coroutine
State transitions of a coroutine:
Two different implementations of the library are provided. The first implementation is based on the copying of stacks in and out of C++’s runtime stack. In the second implementation, all stacks of coroutines reside in the runtime stack (i.e., no stacks are copied).
All the following sections are based on the assumptions of the stack growing downwards.
StackBuffer on the heap to store/resume stack content and registers of a coroutine so that a coroutine could be suspended/continued by
Explain in Detail (Assuming stack grows down)
The struct of a coroutine struct (assuming stack grows down):
Procedure of coroutine switch (step 1,2 is suspending current coroutine, step 3,4 is resuming target coroutine):
What would happen if size of
StackBufferof the target coroutine is large than stack current used?
An undefined behavior would occur if the
memcpywere carried out without checking the size of stack current used. The program may jump to another statement far away; it may also crash with a segmentation fault; anything could happen.
To avoid it, we allocating memory on stack continually until the stack is large enough to hold
StackBufferof the target coroutine.
BufferSizealways equals to
High - Lowfor a coroutine?
BufferSize >= High - Low.
BufferSizeis the length of
StackBuffernever shrink, the
BufferSizestands for the maximum size of stack content the buffer ever hold. This is also the reason of using
High - Lowinstead of
BufferSizewhen the stack restore was carrying out.
In order to reduce the frequency of memory allocation & free.
Let all coroutine stacks share C++’s runtime stack, and jump between to achieve context switch.
Explain in Detail
Data structures (assuming stack grows down):
pred, suc : predecessor and successor in a doubly linked list of unused tasks
prev, next: pointers to the two adjacent tasks
StackSize: size of available stack memory, pre-allocated
Procedure of coroutine switch:
What if the stack requested at the coroutine initialization was insufficient?
Stack of the program will be disrupted, anything horrible could happen.
Can we prevent the coroutine from stack overflow?
As far as I know, we can’t.
How does we put the struct
taskto the start of the corresponding coroutine’s stack?
We doesn’t put the
taskstruct to the start of a coroutine’s stack. All we do it make sure there are at least 1 unclaimed
taskstruct on the stack all the time. When a new coroutine was called, the
taskwould assigned to it and the coroutine would running on the stack space just behind it.
prev/nextpointers are provided at the same time, whether one of the two groups could be removed?
prev/nextpointers are necessary which maintain the memory layout of our program, including:
- allocation new task
- merge free memory blocks on the stack
- split memory block on necessary.
pred/sucpointers are optional, they just speed up the procedure in searching for unused task.
What’s the usage of
*ToBeResumedintends to indicate a coroutine calling a specific coroutine on ending instead of returning directly, but not used for the library currently.
Why the state of coroutine (value of registers, stored in
jmp_buf) not recovered to initial state when a coroutine terminated?
There is no need to recover
jmp_bufafter the coroutine is terminated since the
Taskwould be marked as free. When the next new coroutine is fitted in this free
Task, the new state will be stored to
jmp_buf, execution of the program would not bother with the obsoleted state.
Comparison of the two implementations
(Arranged from the original report)
|memory overhead on AMD64 (per coroutine)
|256 bytes (size of Coroutine) + the largest size of stack the coroutine stored ever
|256 bytes (size of Task) + 48 bytes (size of Coroutine) + stack memory allocated but not used
|context switch overhead
|memory copying for the stack content + registers storing + registers restoring
|registers storing + registers restoring
|ease of use
|user does not need to bother about stack sizes for coroutines
|requires the maximum stack size of each coroutine to be specified
|restrictions in use
|on stack overflow, the program will crush or produce meaningless results without any notification
|each suspended coroutine must keep a copy of its stack content on the heap
|if maximum stack size specified is way too large, a lot of unnecessary memory is wasted
|way too complex (a bunch of setjmp and longjmp )
- COROUTINE (Keld Helsgaun)
- setjmp.h source code [glibc/setjmp/setjmp.h] - Woboq Code Browser
- setjmp.h source code [glibc/sysdeps/x86/bits/setjmp.h] - Woboq Code Browser
- setjmp.S source code [glibc/sysdeps/x86_64/setjmp.S] - Woboq Code Browser
- longjmp.c source code [glibc/setjmp/longjmp.c] - Woboq Code Browser
- __longjmp.S source code [glibc/sysdeps/x86_64/__longjmp.S] - Woboq Code Browser