Kotlin `suspend` Internals: From CPS Transformation to State-Machine Bytecode

After using coroutines for two years, I eventually wanted to answer a very concrete question: what does a suspend fun fetchUser() actually look like on the JVM? In the IDE it looks like a normal function, but it can suspend, resume, and hop across threads. What exactly did the compiler generate?

I decompiled a set of coroutine examples with javap and turned the full reasoning path into this article.

Start with CPS Transformation

The implementation behind a suspend function is based on continuation-passing style, or CPS. The idea comes from functional programming: pass “what to do next” explicitly as a parameter, instead of relying on the call stack to carry it implicitly.

For every suspend fun, the Kotlin compiler first appends a Continuation parameter to the end of the parameter list:

// What you write
suspend fun fetchUser(id: String): User

// What the compiler sees
fun fetchUser(id: String, continuation: Continuation<User>): Any?

The return type changes from User to Any?. That Any? can be the real User object when the function completes synchronously, or it can be the marker value COROUTINE_SUSPENDED when execution needs to suspend. The caller checks the return value to decide whether it should keep running or give up the current thread.

The Continuation<T> interface is minimal:

interface Continuation<in T> {
    val context: CoroutineContext
    fun resumeWith(result: Result<T>)
}

It is essentially a type-safe callback abstraction, except the compiler generates and passes that callback for you.
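
To make the two return paths concrete, here is a minimal sketch that uses only the standard library. The names valueOrSuspend, parked, and trace are hypothetical and exist only for this illustration. It relies on the low-level suspendCoroutineUninterceptedOrReturn intrinsic, which exposes the hidden Continuation parameter and lets the function either return a value directly or return COROUTINE_SUSPENDED.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.intrinsics.COROUTINE_SUSPENDED
import kotlin.coroutines.intrinsics.suspendCoroutineUninterceptedOrReturn
import kotlin.coroutines.resume
import kotlin.coroutines.startCoroutine

// Hypothetical parking spot for a continuation that is waiting to be resumed.
var parked: Continuation<Int>? = null

// Returns 42 directly when ready, otherwise stores the continuation and suspends.
suspend fun valueOrSuspend(ready: Boolean): Int =
    suspendCoroutineUninterceptedOrReturn { cont ->
        if (ready) {
            42                      // fast path: the value itself is the return value
        } else {
            parked = cont           // slow path: remember "what to do next"
            COROUTINE_SUSPENDED     // and tell the caller to give up the thread
        }
    }

// Starts the suspend function with a hand-written callback and records the order of events.
fun trace(ready: Boolean): List<String> {
    val log = mutableListOf<String>()
    val completion = Continuation<Int>(EmptyCoroutineContext) { result ->
        log += "completed with ${result.getOrThrow()}"
    }
    suspend { valueOrSuspend(ready) }.startCoroutine(completion)
    log += "startCoroutine returned"
    // If the function suspended, resume it now with a value of our choosing.
    parked?.let { cont ->
        parked = null
        cont.resume(7)
    }
    return log
}
```

With these definitions, trace(true) records the completion before startCoroutine returns, while trace(false) records it only after the parked continuation is resumed with 7.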

Multiple Suspension Points: The State Machine

With only one suspension point, CPS transformation is still fairly easy to visualize. Real code often looks more like this:

suspend fun loadProfile(userId: String): Profile {
    val user = fetchUser(userId)      // Suspension point 1
    val avatar = fetchAvatar(user.id) // Suspension point 2
    return Profile(user, avatar)
}

Two suspend calls mean two possible suspensions and resumptions. The compiler cannot simply nest two CPS transformations; that would collapse into callback hell. Instead, it compiles the function body into a finite state machine.

Each segment between suspension points maps to one state. For this function, the compiler generates a synthetic continuation class (a ContinuationImpl subclass) containing:

  • a label field that records the current state
  • fields that store local variables which must survive across suspension points
  • an invokeSuspend method that stores the incoming result and re-invokes the function, whose body holds the state machine's switch (for a suspend lambda, invokeSuspend holds the switch itself)

After decompilation, the rough shape looks like this simplified pseudocode:

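// Synthetic continuation class generated by the compiler (name simplified)
final class LoadProfileContinuation extends ContinuationImpl {
    int label = 0;
    Object result;

    // Local variables that survive across suspension points
    User user;

    LoadProfileContinuation(Continuation completion) {
        super(completion);
    }

    @Override
    public Object invokeSuspend(Object result) {
        this.result = result;
        this.label |= Integer.MIN_VALUE; // Mark this call as a resumption
        return loadProfile(null, this);  // The original arguments are no longer needed
    }
}

// Transformed loadProfile body
static Object loadProfile(String userId, Continuation cont) {
    LoadProfileContinuation sm;
    if (cont instanceof LoadProfileContinuation
            && (((LoadProfileContinuation) cont).label & Integer.MIN_VALUE) != 0) {
        // Resumption: reuse the existing state machine and clear the marker bit
        sm = (LoadProfileContinuation) cont;
        sm.label -= Integer.MIN_VALUE;
    } else {
        // First call: wrap the caller's continuation in a fresh state machine
        sm = new LoadProfileContinuation(cont);
    }

    switch (sm.label) {
        case 0:
            // userId is not read after this call, so it is not saved in a field
            sm.label = 1;
            Object r1 = fetchUser(userId, sm); // Call the suspending function
            if (r1 == COROUTINE_SUSPENDED) return COROUTINE_SUSPENDED;
            // If it does not suspend, fall through directly to case 1
            sm.result = r1;
        case 1:
            User user = (User) sm.result;
            sm.user = user;
            sm.label = 2;
            Object r2 = fetchAvatar(user.id, sm);
            if (r2 == COROUTINE_SUSPENDED) return COROUTINE_SUSPENDED;
            sm.result = r2;
        case 2:
            Avatar avatar = (Avatar) sm.result;
            return new Profile(sm.user, avatar);
        default:
            throw new IllegalStateException("call to 'resume' before 'invoke' with coroutine");
    }
}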

label is the current state of the state machine. Every time the coroutine resumes, invokeSuspend is called, the function re-enters the switch, and execution jumps straight to the point after the previous suspension. On the JVM, the whole function is just an ordinary state machine. There is no magic, only a plain switch-case.
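
The same shape can be written by hand in Kotlin and actually run. The sketch below is a toy under stated assumptions, not the compiler's output: fetchUserCps, fetchAvatarCps, LoadProfileMachine, loadProfileCps, pending, and runDemo are all hypothetical names, users and avatars are plain strings, and the sign-bit check on label that the real generated code uses to detect re-entry is left out. Because Kotlin's when does not fall through, a loop plays the role of the switch fall-through.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.CoroutineContext
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.intrinsics.COROUTINE_SUSPENDED
import kotlin.coroutines.resume

// Toy event loop: callbacks queued here stand in for I/O that completes later.
val pending = ArrayDeque<() -> Unit>()

// Hypothetical CPS-style callee that always suspends and resumes later.
fun fetchUserCps(id: String, cont: Continuation<Any?>): Any? {
    pending.addLast({ cont.resume("user-$id") })
    return COROUTINE_SUSPENDED
}

// Hypothetical CPS-style callee that completes synchronously.
fun fetchAvatarCps(user: String, cont: Continuation<Any?>): Any? = "avatar-of-$user"

// Hand-written stand-in for the generated continuation class.
class LoadProfileMachine(private val completion: Continuation<String>) : Continuation<Any?> {
    var label = 0
    var result: Any? = null
    var user: String? = null        // the only local that crosses a suspension point

    override val context: CoroutineContext get() = completion.context

    override fun resumeWith(result: Result<Any?>) {
        this.result = result.getOrThrow()
        val outcome = loadProfileCps(null, this)   // re-enter the function body
        if (outcome !== COROUTINE_SUSPENDED) completion.resume(outcome as String)
    }
}

@Suppress("UNCHECKED_CAST")
fun loadProfileCps(userId: String?, cont: Continuation<*>): Any? {
    // First call: wrap the caller's continuation. Re-entry: reuse the machine.
    val sm = cont as? LoadProfileMachine ?: LoadProfileMachine(cont as Continuation<String>)
    while (true) {
        when (sm.label) {
            0 -> {
                sm.label = 1
                val r = fetchUserCps(userId!!, sm)
                if (r === COROUTINE_SUSPENDED) return COROUTINE_SUSPENDED
                sm.result = r
            }
            1 -> {
                sm.user = sm.result as String
                sm.label = 2
                val r = fetchAvatarCps(sm.user!!, sm)
                if (r === COROUTINE_SUSPENDED) return COROUTINE_SUSPENDED
                sm.result = r
            }
            2 -> return "Profile(${sm.user}, ${sm.result})"
            else -> error("unexpected label ${sm.label}")
        }
    }
}

// Drives the machine once and records what the caller observes.
fun runDemo(): List<String> {
    val log = mutableListOf<String>()
    val done = Continuation<String>(EmptyCoroutineContext) { log += it.getOrThrow() }
    val first = loadProfileCps("42", done)
    log += if (first === COROUTINE_SUSPENDED) "suspended" else first as String
    while (pending.isNotEmpty()) pending.removeFirst().invoke()
    return log
}
```

Calling runDemo() shows both paths in one run: the first call returns the suspended marker at state 0, and after the queued callback fires, the machine re-enters at state 1, passes through the synchronous avatar call, and finishes at state 2.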

Local Variable Lifetimes

Not every local variable is stored in the state-machine object. This detail is easy to miss.

The compiler performs liveness analysis. Only variables that are still live across a suspension point are lifted into fields on the state machine:

suspend fun example() {
    val a = compute()        // a is consumed before the first suspension point
    val b = a * 2
    delay(100)               // Suspension point
    val c = fetchData()      // c crosses the second suspension point
    delay(50)                // Suspension point
    println(c)
}

Variables a and b are no longer used after delay(100), so the compiler does not store them in the state-machine object and the GC can reclaim them normally. Variable c is used after the second delay, so it is promoted to a field.

Here is a real pitfall: if a reference to a large object is still live across a suspension point, meaning it is read again after the coroutine resumes, the state machine keeps the object reachable for the entire suspended period.

suspend fun processLargeData() {
    val data = loadHugeBitmap()  // Large object
    process(data)
    delay(5000)                  // Suspends for 5 seconds; data is held by the state machine
    println("done: ${data.width} px wide")  // This late read keeps data live (width is an illustrative property)
}

The fix is to pull out the small values you need before the suspension point, null out a var reference once it is no longer needed, or move the large object into a helper function so it never crosses that boundary.
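
One way to apply that refactoring is sketched below. Bitmap, loadHugeBitmap, process, and loadAndProcess are hypothetical stand-ins, and the pause parameter replaces delay so the sketch depends only on the standard library. The point is that the large object lives entirely inside a non-suspending helper, so only an Int is live across the suspension point.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.startCoroutine

// Hypothetical stand-ins for the large object and its processing step.
class Bitmap(val width: Int, val pixels: IntArray)
fun loadHugeBitmap(): Bitmap = Bitmap(width = 4, pixels = IntArray(16))
fun process(data: Bitmap) { data.pixels.fill(0) }

// Non-suspending helper: the bitmap is created and dropped inside this frame.
fun loadAndProcess(): Int {
    val data = loadHugeBitmap()
    process(data)
    return data.width               // keep only the small value needed later
}

// Only `width` (an Int) is live across the suspension point.
suspend fun processLargeData(pause: suspend () -> Unit): String {
    val width = loadAndProcess()
    pause()                         // stands in for delay(5000)
    return "done: $width px wide"
}

// Runs the function with a pause that returns immediately, for demonstration.
fun runOnce(): String? {
    var out: String? = null
    suspend { processLargeData { } }.startCoroutine(
        Continuation<String>(EmptyCoroutineContext) { out = it.getOrThrow() }
    )
    return out
}
```

Here runOnce() returns "done: 4 px wide". The sketch checks behavior only; it does not measure what the garbage collector reclaims.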

Exceptions and Result Wrapping

resumeWith(result: Result<T>) uses Result for a reason. Coroutine resumption has two paths: normal resumption with a value, and exceptional resumption with a Throwable.

Inside the state machine, each case entry first checks result:

case 1: {
    // On every resume, first check whether the result is an exception
    ResultKt.throwOnFailure(sm.result);
    User user = (User) sm.result;
    // ...
}

throwOnFailure rethrows the exception if result represents a failure. This is why exceptions in coroutines can be caught with try-catch as if the code were synchronous: the compiler encodes the exception propagation path into the state machine’s control flow.

If you write a try-catch block that crosses a suspension point:

suspend fun safeLoad(): User? {
    return try {
        fetchUser("123")  // Suspension point
    } catch (e: IOException) {
        null
    }
}

The compiler inserts the failure check at the resumption point and splits the try region around the suspension point, so the same catch handler covers both the code that runs before the call and the code that runs after resumption. The generated bytecode is more complex than an ordinary try-catch, but it is completely transparent to the caller.
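
The effect can be observed without reading bytecode. In this minimal sketch, fetchUser is a hypothetical stand-in that parks its continuation, and waiting and resumeSafeLoadWith are illustration-only names. Resuming the parked continuation with a failed Result makes the exception surface inside safeLoad's try block, exactly where the suspended call sits.

```kotlin
import java.io.IOException
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.startCoroutine
import kotlin.coroutines.suspendCoroutine

// Hypothetical stand-in: parks the caller until someone resumes it.
var waiting: Continuation<String>? = null
suspend fun fetchUser(id: String): String = suspendCoroutine { cont -> waiting = cont }

// Same shape as the example above, with String standing in for User.
suspend fun safeLoad(): String? =
    try {
        fetchUser("123")            // suspension point inside the try block
    } catch (e: IOException) {
        null
    }

// Starts safeLoad, then resumes the parked continuation with the given outcome.
fun resumeSafeLoadWith(outcome: Result<String>): String {
    var report = "not completed"
    val completion = Continuation<String?>(EmptyCoroutineContext) { result ->
        report = "completed with ${result.getOrThrow()}"
    }
    suspend { safeLoad() }.startCoroutine(completion)
    waiting!!.resumeWith(outcome)   // success or failure travels through Result
    return report
}
```

Resuming with Result.success("alice") completes with alice, while resuming with Result.failure(IOException("boom")) is caught by the catch block and completes with null.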

Verify It Yourself with javap

You do not have to trust the explanation. You can decompile it yourself:

# Compile the Kotlin file
kotlinc Example.kt -include-runtime -d example.jar

# Inspect generated class files, especially the inner class containing $
jar tf example.jar | grep "\.class"

# Disassemble the state-machine class straight from the jar
javap -c -p -cp example.jar 'ExampleKt$loadProfile$1'

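You will see a class that extends ContinuationImpl (for a named suspend function) or SuspendLambda (for a suspend lambda). For a lambda, invokeSuspend is the state machine. For a named function, invokeSuspend only re-invokes the function, and the switch lives in the function body. Field names are not obfuscated, but they are synthetic: spilled locals appear as L$0, L$1 (references) or I$0 (ints), next to the label and result fields, which are easy to identify.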

An easier option is IntelliJ’s Tools -> Kotlin -> Show Kotlin Bytecode, followed by Decompile. The resulting Java-like code is often more readable.

Performance Impact and Practical Guidance

Once you understand the state-machine implementation, several common performance issues become easier to explain.

Coroutines are lightweight, but state-machine objects still allocate. A suspend fun with at least one non-tail suspension point allocates its continuation object on the heap at the start of every call, whether or not that call ends up suspending. Only functions whose suspend calls are all in tail position skip the state machine. If a hot path makes many such calls, GC pressure can grow. For paths that you know rarely suspend, consider one withContext boundary at the outer level that switches context once, instead of building deeply nested state machines.

Manage local variables that cross suspension points deliberately. Keeping unnecessary large-object references across suspension points is a common leak path. If a LeakCanary report shows a coroutine-related reference chain, this is often the reason.

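An inline suspend fun is inlined into its caller, so its suspension points become states in the caller's state machine instead of a nested one. Library functions such as withContext and coroutineScope come from kotlinx.coroutines, not the standard library. They look like ordinary suspend functions at the call site, but they are built on the low-level suspendCoroutineUninterceptedOrReturn intrinsic rather than on a compiler-generated state machine. Understanding that helps when bytecode analysis seems to show tangled nested state machines.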

I prefer to think of coroutines as “callbacks plus a scheduler, written by the compiler” rather than “lightweight threads.” The lightweight-thread analogy is useful for explaining suspension and resumption, but it can hide the allocation cost of the state machine. The callback-plus-scheduler mental model is closer to what actually happens on the JVM, and it is more useful during performance analysis.

Further Reading