Coroutines (C++20)

Coroutines, in general, are functions that can be interrupted and resumed. They can greatly simplify writing event-driven programs, they are almost unavoidable for work-stealing thread pools, and they make writing asynchronous I/O and other asynchronous code much easier.

The Foundations of Coroutines

Coroutines are functions that can suspend and resume their execution while keeping their state.

Regular functions are so-called subroutines. They are a special case of coroutines, namely coroutines that never suspend:

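The figure shows that the coroutine itself decides when it suspends. Because control changes hands only at these explicit points, coroutines that share a thread cannot preempt one another, so we don’t need locks between them. This makes coroutines a good fit for cooperative multitasking.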

There are two styles of coroutines:

Stackful coroutines, aka fibers:
- Similar to functions in that their state is allocated on a stack (each coroutine gets its own).
- More powerful and flexible than stackless coroutines.
- Can be suspended anywhere, at an arbitrary depth of function calls. But this flexibility has a high cost in memory and especially runtime.
Stackless coroutines (the kind C++20 supports):
- No dedicated stack; the state that must survive a suspension is stored on the heap.
- More efficient and simpler than stackful coroutines.
- Can be suspended only from the body of the coroutine function itself, not from inside a function it calls.

Stack Frame

A regular C++ function always has a corresponding stack frame. This stack frame exists for as long as the function is running, and that is where all the local variables and other state are stored. For example:

void g() {/* ... */}
void f() {
    // ...
    g();
    // ...
}

Stack frames of regular functions. (© The Art of Writing Efficient Programs, Fedor G. Pikus.)

Coroutine Runtime Steps

In contrast, the state of a stackless coroutine is not stored on the stack but on the heap, in an allocation called the activation frame (also known as the coroutine frame). The frame is reached through a coroutine handle, an object that acts like a pointer to it. When a coroutine suspends itself, the parts of its state that are necessary to resume it are stored in the activation frame.

We use a simple example to illustrate the concept:

void g() {/* ... */}
void h(H) {
    // ...
    H.resume(); // Not the real syntax
}

void coro() { // coroutine
    // ...
    g();
    // ...
}

void f() {
    // ...
    std::coroutine_handle<???> H; // Not the real syntax
    coro();
    h(H); // Called after coro() is suspended    
    // ...
}

Coroutine call. (© The Art of Writing Efficient Programs, Fedor G. Pikus.)

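The function f() creates a coroutine handle object, which refers to the activation frame. The activation frame persists until it is destroyed through the handle; std::coroutine_handle itself is a non-owning pointer, so a wrapper type usually takes care of the destruction.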
f() calls coro(). The coroutine stores on the stack the address where it would return if it is suspended.
coro() calls another function g(), which allocates the stack frame of g() on the stack.

NOTE

At this point, while g() is running, the coroutine cannot suspend itself. This is the key difference between stackful and stackless coroutines.
Function g() runs and eventually returns, which destroys its stack frame. Now the coroutine can suspend itself again.

Let us assume that the coroutine suspends after g() returns. The caller continues its execution and may call other functions, such as h(). The memory allocations now look as follows:

Coroutine is suspended, execution continues. (© The Art of Writing Efficient Programs, Fedor G. Pikus.)

Several things happen during this period:

Parts of the state that are necessary to resume it are stored in the activation frame.
The stack frame of the coroutine is then destroyed, and the control returns to the caller, to the point where the coroutine was called.

Later, h() resumes coro() through the coroutine handle H:

Coroutine is resumed from a different function.

Summary

Here is a summary of what is important to know about C++20 coroutines:

Coroutines are functions that can suspend themselves. This is different from the OS suspending a thread: suspending a coroutine is done explicitly by the programmer (cooperative multitasking).
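Unlike regular functions, which are associated with stack frames, coroutines are reached through handle objects. Coroutine state persists until it is destroyed through the handle (or until the coroutine runs to completion without suspending at its final suspension point).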
After the coroutine is suspended, the control is returned to the caller, which continues to run the same way as if the coroutine had completed.
The coroutine can be resumed from any location; it does not have to be the caller itself. Furthermore, the coroutine can even be resumed from a different thread (we will see an example later in this section). The coroutine is resumed from the point of suspension and continues to run as if nothing happened (but may be running on a different thread).

Syntax in C++

There is no special syntax for declaring a coroutine (it’s not an object but rather a language feature). What makes a function a coroutine is the use of one of the keywords co_await, co_yield, or co_return in its body.

Keyword	Action	State
`co_yield`	Output or Input	Suspended
`co_return`	Output	Ended
`co_await`	Output or Input	Suspended

TIP

Constructing coroutine directly is very verbose, repetitive, and contains a lot of boilerplate code. In practice, everybody who uses coroutines should use one of several available coroutine libraries.

`ReturnType`

We have to implement the return type first.

//------------------------------------------------------------------------------
struct MyReturnType
{
    struct promise_type {};     //* The compiler looks for promise_type in MyReturnType.
};

//--------------------------------------Or--------------------------------------
struct MyReturnType {};

template <typename... Args>
struct std::coroutine_traits<MyReturnType, Args...>
{
    struct promise_type {};
};

//------------------------------------------------------------------------------
//* Use in user application.
MyReturnType HelloCoroutine()
{
    co_return;      //* This makes it different from normal function.
}

🈯Often, you can make promise_type a type alias for a type with whatever name you like:

template <typename T> struct generator {
    struct something_else { /*...*/ };
    using promise_type = something_else;
};

ReturnType. (©Andreas Weis)

`promise_type`

If we see this name for the first time, we might think of another C++ feature: futures and promises in the threading library. The name is similar, and both act as a channel through which a result is handed over, but promise_type has nothing to do with std::promise and works differently.

struct promise_type
{
    //* {Required!} Called by the compiler to create the object that the coroutine returns to its caller.
    //* Typically it constructs the wrapper from a handle that is, in turn, constructed from the promise object.
    MyReturnType get_return_object() { return {};}

    //* {Required if the coroutine can finish} return_void() is called for `co_return;` or when execution
    //* flows off the end of the body; return_value(T) is called for `co_return expr;`.
    //* Either of the two needs to be provided, not both.
    void return_void()  /   void return_value(T);

    //* {Required!} It is called if the coroutine throws an exception that escapes from its body.
    void unhandled_exception() {}

    //* {Required!} It is awaited before the coroutine body starts to execute:
    //* suspend_always starts the coroutine suspended, suspend_never runs it immediately.
    std::suspend_{always/never} initial_suspend() noexcept {return {};}
    
    //* {Required!} It is awaited after the body has finished (including after `co_return`).
    //* A coroutine suspended here is done; it can only be destroyed, not resumed.
    std::suspend_{always/never} final_suspend() noexcept {return {};}
};

ReturnType. (©Andreas Weis)

`coroutine_handle<>`

struct std::coroutine_handle<promise_type>
{
    //* Resumes the execution of the coroutine to which *this refers,
    //* or does nothing if the coroutine is a no-op coroutine.
    void resume() const;


    //* Destroys the coroutine state of the coroutine to which *this refers, 
    //* or does nothing if the coroutine is a no-op coroutine.
    void destroy() const;


    //* Creates a coroutine_handle from the promise object of a coroutine.
    static coroutine_handle from_promise(promise_type&);


    //* Returns a reference to the promise object of the coroutine.
    promise_type& promise() const;
};

Coroutine Handle. (©Andreas Weis)

`Awaitable`

The easiest way to think of an Awaitable is as a type that can be used as the operand of co_await.


struct MyAwaitable
{
    //* It is called first, before the coroutine is suspended.
    //* If it returns true, then the result is already available, and the coroutine is not suspended at all.
    //* In practice, it almost always returns false, which leads to suspension of the coroutine.
    bool await_ready() {return false;}

    //* It is called once the state of the coroutine has been saved, just before control leaves the coroutine.
    //* It can have several different return types and values. If it returns void, the coroutine stays suspended, and the control is returned to the caller or resumer.
    void await_suspend(std::coroutine_handle<>) {}

    //* It is called when the coroutine is resumed, before the body continues; its return value is the result of the co_await expression.
    void await_resume() {}
};

Coroutine diagram. (©Andreas Weis)

Workflow Illustration

Getting data out of a coroutine

We will now illustrate the workflow in which a coroutine produces some data and the caller then accesses that data.

The only thing the coroutine has direct access to is the awaitable. This is what the code looks like:

MyReturnType f1() {
    //...
    co_await TheAnswer{42};
}

TheAnswer::TheAnswer(int v) : value_(v) {}

Getting data out of a coroutine. Step 1 (©Andreas Weis)

Through the await_suspend call, the awaitable receives the coroutine_handle and, through it, reaches the promise:

struct SomePromise {
    int value;
};

void TheAnswer::await_suspend(std::coroutine_handle<SomePromise> h)
{
    h.promise().value = value_;
}

Getting data out of a coroutine. Step 2 (©Andreas Weis)

Now that the data lives in the promise, we can access it from the caller side through the ReturnType.

struct MyReturnType {
    //...
    std::coroutine_handle<SomePromise> handle_;   //* the template argument is the promise type
    int getAnswer () {
        return handle_.promise().value ;
    }
};

int main() {
    MyReturnType c1 = f1();
    std::cout << "The answer is " << c1.getAnswer();
}

Getting data out of a coroutine. Step 3 (©Andreas Weis)

Getting data into a coroutine

Now the coroutine goes to sleep at some point. Then the caller provides some data. Once the coroutine wakes up, it can access the data.

The ReturnType has been created, and eventually the coroutine suspends and control returns to main. Then the caller provides the value 42 as data.

void MyReturnType::provide(int the_answer) {
    handle_.promise().value = the_answer ;
    handle_.resume();
}

MyReturnType f1 () {
    int the_answer = co_await OutsideAnswer {};     //* Just put the coroutine to sleep.
}

int main() {
    MyReturnType c1 = f1();
    c1.provide(42);
}

Getting data into a coroutine. Step 1 (©Andreas Weis)

Through the await_suspend call, the awaitable receives the coroutine_handle and, through it, reaches the promise:

struct OutsideAnswer {

    bool await_ready() {return false;}

    //* we just store the handle.
    void await_suspend (std::coroutine_handle<promise> h) {
        handle_ = h;
    }

    //* we fetch the data just before the coroutine continues execution.
    int await_resume () {
        return handle_.promise().value;
    }

    std::coroutine_handle<promise> handle_;
};

Getting data into a coroutine. Step 2 (©Andreas Weis)

Yielding values

Once we know how awaitables work, co_yield is little more than syntactic sugar that saves us from implementing a dedicated awaitable type.

Getting data into a coroutine. Step 1 (©Andreas Weis)

//--------------------------------------------------------------------
FiboGenerator makeFiboGenerator () {
    int i1 = 1;
    int i2 = 1;
    while (true) {
        co_await NewNumberAwaitable {i1};
        i1 = std::exchange(i2 , i1 + i2);
    }
}
//------------------------------Or------------------------------------
FiboGenerator makeFiboGenerator () {
    int i1 = 1;
    int i2 = 1;
    while (true) {
        co_yield i1;
        i1 = std::exchange(i2 , i1 + i2);
    }
}

struct promise_type {
    // ...
    int value;
    std::suspend_always yield_value(int i) {
        value = i;
        return {};
    }
};
//--------------------------------------------------------------------

int main()
{
    FiboGenerator fibo = makeFiboGenerator ();
}

Examples

For educational purposes, we would like you to understand coroutine code at the C++ level rather than at the level of the abstractions presented by a particular library. Here are examples written in bare C++.

Hello World

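#include <coroutine>
#include <iostream>
#include <string>
#include <utility>

//* Wrapper type Chat containing the promise type.
//* Its definition (shown further down) must precede ForFun() in a real source file.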
Chat ForFun()
{
    //* Calls promise_type.yield_value()
    co_yield std::string("Hello!\n");

    //* Calls promise_type.await_transform
    std::cout << co_await std::string{};

    //* Calls promise_type.return_value
    co_return std::string("Here!\n");
}


int main()
{   
    //* Create the coroutine
    Chat chat = ForFun();

    //* Trigger the machine
    std::cout << chat.listen();

    //* Send data into the machine
    chat.answer("Where r u?\n");

    //* Wait for more data from the coroutine.
    std::cout << chat.listen();
}

struct Chat
{
    struct promise_type
    {
        std::string msg_in_{}, msg_out_{};
        
        Chat get_return_object() {return Chat(this);}
        std::suspend_always initial_suspend() noexcept {return {};}
        std::suspend_always final_suspend() noexcept {return {};} 
        void unhandled_exception() noexcept {}

        std::suspend_always yield_value(std::string msg) noexcept
        {
            msg_out_ = std::move(msg);
            return {};
        }

        auto await_transform(std::string) noexcept
        {
            struct awaiter {
                promise_type& pt;
                constexpr bool await_ready() const noexcept {return true;}
                std::string await_resume() const noexcept {return std::move(pt.msg_in_);}
                void await_suspend(std::coroutine_handle<>) const noexcept {}
            };
        
            return awaiter{*this};
        }

        void return_value(std::string msg) noexcept {msg_out_ = std::move(msg);}

    };

    std::coroutine_handle<promise_type> coro_handle_;

    explicit Chat(promise_type* p) : coro_handle_{std::coroutine_handle<promise_type>::from_promise(*p)} {}

    Chat(Chat&& rhs) : coro_handle_ {std::exchange(rhs.coro_handle_, nullptr)} {}

    ~Chat() {
        if(coro_handle_) {coro_handle_.destroy();}
    }

    std::string listen() {
        if(not coro_handle_.done()) {coro_handle_.resume();}
        return std::move(coro_handle_.promise().msg_out_);
    }

    void answer(std::string msg){
        coro_handle_.promise().msg_in_ = msg;
        if(not coro_handle_.done()) { coro_handle_.resume();}
    }

};

About this diagram:

On the left side, we have user code. This is how we use the coroutine in an application.
In the middle, there is a wrapper type. It contains the promise type.
1. The promise type may or may not contain additional variables.
2. The member functions listen and answer are also optional.
3. Storing a std::coroutine_handle in the wrapper is also optional. In our case, we use it for data in/out.
On the right side, there is the compiler's implementation of the coroutine. The coroutine frame, which lives on the heap, stores values and status. In our case, the coroutine has no local variables to store.
The promise_type must be a nested type of Chat (or, in general, of any type returned by the coroutine), unless it is supplied through std::coroutine_traits. It must be named promise_type; otherwise, the program will not compile.

🈯Often, you can make promise_type a type alias for a type with whatever name you like:
```
template <typename T> struct generator {
    struct something_else { /*...*/ };
    using promise_type = something_else;
};
```

INFO

yield_value(): It is invoked every time the operator co_yield is executed; its argument is the yielded value. Storing the value in the promise object is how the coroutine usually passes results to the caller.

There are more optional customization points for coroutines. See cppreference.

Interleaving two `std::vector`

//* The wrapper class.
struct Generator
{
    //* The promise type
    struct promise_type {
        int val_{};

        Generator get_return_object() {return Generator(this);}
        std::suspend_never initial_suspend() noexcept {return {};}
        std::suspend_always final_suspend() noexcept {return {};}
        std::suspend_always yield_value(int v) {
            val_ = v;
            return {};
        }
        
        void return_void() {}
        void unhandled_exception() {}
    };

    std::coroutine_handle<promise_type> coro_handle_;

    explicit Generator(promise_type* p) : coro_handle_{std::coroutine_handle<promise_type>::from_promise(*p)} {}

    Generator(Generator&& rhs) : coro_handle_ {std::exchange(rhs.coro_handle_, nullptr)} {}

    ~Generator() {
        if(coro_handle_) {coro_handle_.destroy();}
    }

    int value() const {return coro_handle_.promise().val_;}

    bool finished() const {return coro_handle_.done();}

    void resume(){
        if(not finished()) { coro_handle_.resume();}
    }
};


Generator Interleaved(std::vector<int> vec_a, std::vector<int> vec_b)
{
    auto lambda = [](std::vector<int>& v) -> Generator {
        for(const auto& e : v) {co_yield e;}
    };

    auto x = lambda(vec_a);
    auto y = lambda(vec_b);

    while(not x.finished() or not y.finished())
    {
        if(not x.finished()) {
            co_yield x.value();
            x.resume();
        }

        if(not y.finished()) {
            co_yield y.value();
            y.resume();
        }
    }
}

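// Note: in a single source file, these headers must come first, before the definition of Generator.
#include <coroutine>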
#include <utility>
#include <vector>
#include <iostream>

int main()
{
    std::vector vec_a{2, 4, 6, 8};
    std::vector vec_b{1, 3, 5, 7, 9, 11};

    Generator g{Interleaved(std::move(vec_a), std::move(vec_b))};

    while(not g.finished()) {
        std::cout << g.value() << " ";
        g.resume();
    }
}

Lazy Generator

A lazy generator is a generator that computes elements on demand, as it is called.

#include <coroutine>
#include <iostream>

template <typename T> struct generator 
{
    // Nested promise type (not the coroutine handle, which is h_ below). It must be called promise_type,
    // otherwise, the program will not compile.
    struct promise_type   //! This has absolutely nothing to do with C++11 std::promise
    {
        T value_ = -1;

        generator get_return_object() {
            using handle_type=std::coroutine_handle<promise_type>;
            return generator{handle_type::from_promise(*this)};
        }
        
        std::suspend_never initial_suspend() { return {}; }
        std::suspend_never final_suspend() noexcept { return {}; }
        
        void unhandled_exception() {}
        
        std::suspend_always yield_value(T value) {
            std::cout << "suspend " << value << " was " << value_ << std::endl;
            value_ = value;
            return {};
        }

        void return_void() noexcept {}
    };

    std::coroutine_handle<promise_type> h_;
};

generator<int> coro()
{
    for (int i = 0;; ++i) {
        co_yield i;       // co_yield i => co_await promise.yield_value(i)
    }
}

int main()
{
    /*
    Steps:
    1. The coroutine handle is created.
    2. The coroutine runs until it hits co_yield and suspends itself. The control is returned 
    to the caller while the value passed to co_yield is captured in the promise.
    3. The calling program retrieves this value and resumes the coroutine by invoking the handle.
    4. The coroutine picks up from the point where it was suspended and runs until the next co_yield.
    */

    auto h = coro().h_;
    
    for (int i = 0; i < 3; ++i) {
        std::cout << "counter: " << h.promise().value_ << std::endl;
        h();
    }
    
    h.destroy();
}


/* Output:
suspend 0 was -1
counter: 0
suspend 1 was 0
counter: 1
suspend 2 was 1
counter: 2
suspend 3 was 2
*/

About this code:

The coroutine coro() looks like any other function, except for the co_yield operator. This operator suspends the coroutine and passes the yielded value i to the caller. Because the coroutine is suspended, not terminated, the operator can be executed multiple times.
The return type of the coroutine, generator<int>, is a special type that we have to define ourselves. It has a lot of requirements on it, which results in lengthy boilerplate code.
The handle is a callable object. Calling it resumes the coroutine, which generates the next value and promptly suspends itself again because the co_yield operator is in the loop.
The entire system is bound by convention: there are expectations on all types involved in this process, and something somewhere will not compile if these expectations are not met.

Work Stealing

In this case, the coroutine starts to execute on one thread, is suspended, and then runs the rest of its code on another thread. Let us see an example:

First, we change the way the task is suspended compared with the previous example: we do it with co_await instead of co_yield. Operator co_await is actually the most general way to suspend a coroutine. Its argument is an awaiter object with very general functionality. There are, again, specific requirements on the type of this object.

Indeed, the co_yield x operator is equivalent to a particular invocation of co_await:

co_await promise.yield_value(x);

The required interface of an awaitable is the three methods:

await_ready(): It is called before the coroutine is suspended. If it returns true, then the result is already available, and the coroutine is not suspended at all. In practice, it almost always returns false, which leads to suspension of the coroutine.
await_resume(): It is called just before the coroutine continues to execute after it is resumed.
await_suspend(): It is called with the handle of the current coroutine when this coroutine is suspended and can have several different return types and values. If it returns void, the coroutine is suspended, and the control is returned to the caller or resumer.

#include <coroutine>
#include <iostream>
#include <thread>

struct awaitable {
    std::jthread& t3;
    bool await_ready() { return false; }

    // It does not resume the coroutine. Instead, it creates a new thread that will
    // execute a callable object, and it is this object that resumes the coroutine.
    void await_suspend(std::coroutine_handle<> h) {
        std::jthread& out = t3;
        out = std::jthread([h] { h.resume(); });
        std::cout << "New thread ID: " << out.get_id() << '\n';
    }
    void await_resume() {}
    ~awaitable() {
        std::cout << "Avaitable destroyed on thread: " << std::this_thread::get_id() << " with thread " << t3.get_id() << '\n';
    }
    awaitable(std::jthread& t) : t3(t) {
        std::cout << "Avaitable constructed on thread: " << std::this_thread::get_id() << '\n';
    }
};

Now let us build the smallest possible return type of a coroutine that contains all the required boilerplate and nothing else:

struct task{
    struct promise_type {
        task get_return_object() { return {}; }
        std::suspend_never initial_suspend() { return {}; }
        std::suspend_never final_suspend() noexcept { return {}; }
        void return_void() {}
        void unhandled_exception() {}
    };
};

Then, we move to the coroutine itself:

Note: we use co_await instead of co_yield

task coro(std::jthread& t1, std::jthread& t2, int i) {
    std::cout << "Coroutine started on thread: " << std::this_thread::get_id() << " i=" << i << '\n';
    co_await awaitable{t1};
    std::cout << "Coroutine resumed on thread: " << std::this_thread::get_id() << " i=" << i << '\n';
    
    std::cout << "Coroutine done on thread: " << std::this_thread::get_id() << " i=" << i << '\n';
}

Putting it all together, here is the program and its output:

int main() {
    std::cout << "Main thread: " << std::this_thread::get_id() << '\n';
    {
        std::jthread t1, t2;
        coro(t1, t2, 42);
        std::cout << "Main thread done: " << std::this_thread::get_id() << std::endl;
    }
    std::cout << "Main thread really done: " << std::this_thread::get_id() << std::endl;
}

/* Output:
Main thread: 16340
Coroutine started on thread: 16340 i=42
Avaitable constructed on thread: 16340
New thread ID: 19492
Main thread done: 16340
Avaitable destroyed on thread: 19492 with thread 19492
Coroutine resumed on thread: 19492 i=42
Coroutine done on thread: 19492 i=42
Main thread really done: 16340
*/

The steps are as follows:

The main thread calls a coroutine.
The coroutine is suspended by operator co_await. This process involves several calls to the member functions of the awaitable object, one of which creates a new thread whose payload resumes the coroutine (the new thread is move-assigned into a jthread owned by main so that it is joined in the main program, which avoids some nasty race conditions).
Control is returned to the caller of the coroutine, so the main thread continues to run from the line after the coroutine call. It will block in the destructor of the thread object t1 if it gets there before the coroutine completes.
The coroutine is resumed by the new thread and continues to execute on that thread from the line after co_await. The awaitable object that was constructed by co_await is destroyed. The coroutine runs to the end, all on the second thread. Reaching the end of the coroutine means it’s done, just like any other function. The thread that runs the coroutine now can be joined. If the main thread was waiting for the destructor of thread t1 to complete, it now unblocks and joins the thread (if the main thread has not yet reached the destructor, it won’t block when it does).
