Exceptions in v8

Written by Michael Stanton

How do exceptions work in v8? Let’s see.

I’ll use a little program to motivate our discussion:

function foo(a) {
  if (a === true) throw new Error("Oh no.");
}

function bar(a) {
  try {
    foo(a);
  } catch(e) {
    print(e);
  }
}

bar(true);

The bytecode for foo looks like this (I cleaned up the output from --print-bytecode slightly):

Generated bytecode for function: foo
Parameter count 2
Register count 2
Frame size 8
0            StackCheck
1            LdaTrue
2            TestEqualStrict a0, [0]
5            JumpIfFalse [19] (@ 24)
7            LdaGlobal [0], [1]
10           Star r0
12           LdaConstant [1]
14           Star r1
16           Ldar r0
18           Construct r0, r1-r1, [3]
23           Throw
24           LdaUndefined
25           Return
Constant pool (size = 2)
       0: <String[#5]: Error>
       1: <String[#6]: Oh no.>
Handler Table (size = 0)

The important thing here is that if the JumpIfFalse isn’t taken, then we create an Error object, leaving it in the accumulator for the Throw bytecode. We don’t expect to ever return from that bytecode. How strange and interesting…

We’ll get right into that, but first let me show you the bytecode for bar as well, because it has a catch handler that’ll be a point of interest for us:

Generated bytecode for function: bar
Parameter count 2
Register count 4
Frame size 16
0            StackCheck
1            Mov <context>, r0
4            LdaGlobal [0], [0]
7            Star r1
9            CallUndefinedReceiver1 r1, a0, [2]
13           Jump [30] (@ 43)
15           Star r1
17           CreateCatchContext r1, [1]
20           Star r0
22           LdaTheHole
23           SetPendingMessage
24           Ldar r0
26           PushContext r1
28           LdaGlobal [2], [4]
31           Star r2
33           LdaImmutableCurrentContextSlot [2]
35           Star r3
37           CallUndefinedReceiver1 r2, r3, [6]
41           PopContext r1
43           LdaUndefined
44           Return
Constant pool (size = 3)
       0: <String[#3]: foo>
       1: <ScopeInfo CATCH_SCOPE [5]>
       2: <String[#5]: print>
Handler Table (size = 16)
  from   to       hdlr (prediction,   data)
 (   4,  13)  ->    15 (prediction=1, data=0)

Here you can see the call to foo at offset 9, then a jump down to offset 43 where we return undefined (how boring). Offsets 15 to 41 are the catch handler. The Handler Table at the bottom lays this out. It shows the try region goes from offsets 4 to 13, and the handler that corresponds to it starts at offset 15.

Very nice.
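To make the table concrete, here is a minimal sketch of a range lookup over bar's handler table. The numbers come from the listing above; the function name, the object layout, and the half-open range check are my assumptions for illustration, and V8's exact boundary convention may differ.

```javascript
// Hypothetical model of bar's handler table, using the numbers printed above.
const barHandlerTable = [
  { from: 4, to: 13, handler: 15 },
];

// Returns the handler offset for the first range covering `offset`,
// or -1 if no range covers it.
// Assumption: ranges are half-open, [from, to). Nested ranges are ignored.
function lookupRange(table, offset) {
  for (const entry of table) {
    if (offset >= entry.from && offset < entry.to) {
      return entry.handler;
    }
  }
  return -1;
}

console.log(lookupRange(barHandlerTable, 9));  // 15 (the call to foo)
console.log(lookupRange(barHandlerTable, 28)); // -1 (inside the catch block)
```

An exception thrown while we sit at offset 9 lands in the handler at 15, while one thrown from inside the catch block itself finds nothing in this function and would have to be handled further down the stack.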

So what does the Throw bytecode do? Like all bytecodes, the implementation is provided in file interpreter-generator.cc:

// Throws the exception in the accumulator.
IGNITION_HANDLER(Throw, InterpreterAssembler) {
  TNode<Object> exception = GetAccumulator();
  TNode<Context> context = GetContext();
  CallRuntime(Runtime::kThrow, context, exception);
  // We shouldn't ever return from a throw.
  Abort(AbortReason::kUnexpectedReturnFromThrow);
  Unreachable();
}

Okay, that’s easy, we just call a runtime function. It’s nice to see the abort in here that documents our expectations. The runtime function (in runtime-internal.cc) is also simple, just returning the result of Isolate::Throw() back from C++ to the world of generated code:

RUNTIME_FUNCTION(Runtime_Throw) {
  HandleScope scope(isolate);
  DCHECK_EQ(1, args.length());
  return isolate->Throw(args[0]);
}

You’d never suspect that we just skipped over something absolutely breathtaking, but we did. Let’s return to it after a look at the end of the road, Isolate::Throw:


Object Isolate::Throw(Object raw_exception, MessageLocation* location) {
  DCHECK(!has_pending_exception());

  HandleScope scope(this);
  Handle<Object> exception(raw_exception, this);

  ...
  ... skipping over some interesting stuff to focus on the "bones" of
  ... the function.
  ...

  // Set the exception being thrown.
  set_pending_exception(*exception);
  return ReadOnlyRoots(heap()).exception();
}

The main thing that happens here is that we save the exception object in the isolate, and return a curious sentinel value from the global root set, exception(). All we do is record that an exception is pending and hand back this value; we don’t do anything exotic. The crazy stuff is yet to come, awakened into hideous life by that innocuous sentinel value.
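The pattern is small enough to model in a few lines. This is only a sketch: the isolate object, its field names and the Symbol sentinel are hypothetical stand-ins, not V8's data structures.

```javascript
// Hypothetical stand-ins: none of these names are V8's.
const EXCEPTION_SENTINEL = Symbol("exception");

const isolate = {
  pendingException: undefined,
  hasPendingException: false,
  throwException(exception) {
    // Mirrors the DCHECK: only one exception may be pending at a time.
    if (this.hasPendingException) throw new Error("exception already pending");
    this.pendingException = exception;
    this.hasPendingException = true;
    // The caller learns about the throw only through this return value.
    return EXCEPTION_SENTINEL;
  },
};

// A "runtime function" just forwards whatever the isolate returns.
function runtimeThrow(exception) {
  return isolate.throwException(exception);
}

const result = runtimeThrow(new Error("Oh no."));
console.log(result === EXCEPTION_SENTINEL);    // true
console.log(isolate.pendingException.message); // Oh no.
```

The separate flag is there because in JavaScript you can throw anything, including undefined, so the stored value alone can't tell us whether something is pending.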

Here’s what happens now. When the runtime function returns this value to generated code, it comes into some tricky platform-dependent code we call CEntryStub. This code is responsible for building a frame and calling into C++. It also recognizes pending exceptions and (gasp!) drops frames from the stack in order to call the topmost handler. Let’s have a look, now in builtins-ia32.cc, method Builtins::Generate_CEntry:

  ...
  __ call(kRuntimeCallFunctionRegister);

  // Result is in eax or edx:eax - do not destroy these registers!

  // Check result for exception sentinel.
  Label exception_returned;
  __ CompareRoot(eax, RootIndex::kException);
  __ j(equal, &exception_returned);
  ...
  // Exit the JavaScript to C++ exit frame.
  __ LeaveExitFrame(save_doubles == kSaveFPRegs, argv_mode == kArgvOnStack);
  __ ret(0);
  ...

  // Handling of exception.
  __ bind(&exception_returned);

The code above is dutifully calling the runtime function requested. In our case, it was Runtime_Throw, but this body of code is used for the many dozens of runtime functions we have. After the call, you can see we are comparing the return value with that magic exception sentinel. (Here it has a different name, RootIndex::kException. We’ve often got a few ways to refer to something, depending on what kind of code you’re in. All part of the fun…) If we don’t see it, we can return to the bytecode handler, builtin or stub, getting somewhat closer to user code. Otherwise, we do interesting things:

  // Ask the runtime for help to determine the handler. This will set eax to
  // contain the current pending exception, don't clobber it.
  ExternalReference find_handler =
      ExternalReference::Create(Runtime::kUnwindAndFindExceptionHandler);
  {
    FrameScope scope(masm, StackFrame::MANUAL);
    __ PrepareCallCFunction(3, eax);
    __ mov(Operand(esp, 0 * kSystemPointerSize), Immediate(0));  // argc.
    __ mov(Operand(esp, 1 * kSystemPointerSize), Immediate(0));  // argv.
    __ Move(esi,
            Immediate(ExternalReference::isolate_address(masm->isolate())));
    __ mov(Operand(esp, 2 * kSystemPointerSize), esi);
    __ CallCFunction(find_handler, 3);
  }

  // Retrieve the handler context, SP and FP.
  __ mov(esp, __ ExternalReferenceAsOperand(pending_handler_sp_address, esi));
  __ mov(ebp, __ ExternalReferenceAsOperand(pending_handler_fp_address, esi));
  __ mov(esi,
         __ ExternalReferenceAsOperand(pending_handler_context_address, esi));

First we call back into C++, searching for an exception handler. We’ll definitely have one. I skipped over a section of code in Isolate::Throw that would abort execution if there is no handler. I really enjoy what’s next: we simply set the stack pointer and the frame pointer to the appropriate values for a handler somewhere below us on the stack. What about dutifully returning from all the functions we may have between here and there?

Nope!

A few lines later we begin executing that handler like this:

  // Compute the handler entry address and jump to it.
  __ mov(edi, __ ExternalReferenceAsOperand(pending_handler_entrypoint_address,
                                            edi));
  __ jmp(edi);
}

That final curly brace is just showing that these are the last lines of the CEntry stub. What a way to go, am I right?

Now is a good time to remember that we have a catch handler. It’s in function bar(), and its address is at offset 15 in the bytecode. So whatever Runtime_UnwindAndFindExceptionHandler does, it had better return precisely that information to the CEntryStub so that the stack, frame and instruction pointer can be set appropriately.
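Here is a toy model of that "no returns, just reset and jump" step. The stack is an array of frames with the innermost last; the frame fields and function names are invented for illustration and don't correspond to V8's frame layout.

```javascript
// Hypothetical model: innermost frame last. `handler` is the offset of a
// catch handler covering the frame's current position, or null if none.
const stack = [
  { fn: "bar", offset: 9, handler: 15 },
  { fn: "foo", offset: 23, handler: null },
  { fn: "Runtime_Throw", offset: 0, handler: null },
];

// Walk from the innermost frame outward and report where to resume.
function findHandler(frames) {
  for (let i = frames.length - 1; i >= 0; i--) {
    if (frames[i].handler !== null) {
      return { frameIndex: i, resumeAt: frames[i].handler };
    }
  }
  return null; // per the text above, we never get this far without a handler
}

// The CEntry-like step: nobody returns, we just cut the stack and jump.
function unwindTo(frames, target) {
  frames.length = target.frameIndex + 1;              // like overwriting esp/ebp
  frames[target.frameIndex].offset = target.resumeAt; // like the final jmp
  return frames[target.frameIndex];
}

const target = findHandler(stack);
const resumed = unwindTo(stack, target);
console.log(resumed.fn, resumed.offset); // bar 15
console.log(stack.length);               // 1
```

The frames for foo and Runtime_Throw simply vanish when the array is truncated, which is the toy equivalent of overwriting the stack pointer.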

Unwinding and finding stuff

The entirety of this work is in Isolate::UnwindAndFindHandler:

Object Isolate::UnwindAndFindHandler() {
  Object exception = pending_exception();
  ...
  // Special handling of termination exceptions, uncatchable by JavaScript and
  // Wasm code, we unwind the handlers until the top ENTRY handler is found.
  bool catchable_by_js = is_catchable_by_javascript(exception);
  ...

It begins innocently enough, picking up the pending exception object we saved earlier in the isolate. We also learn that some exceptions can’t be caught by JavaScript. In practice this is only a termination exception. No stopping that train…

Now we walk the stack looking for handlers among the different kinds of frames we might have. I’ll just focus on interpreted and optimized JavaScript code:

  ...
  // Compute handler and stack unwinding information by performing a full walk
  // over the stack and dispatching according to the frame type.
  for (StackFrameIterator iter(this);; iter.Advance()) {
    // Handler must exist.
    DCHECK(!iter.done());

    StackFrame* frame = iter.frame();

    switch (frame->type()) {
      case StackFrame::ENTRY:
      case StackFrame::CONSTRUCT_ENTRY: ...
      case StackFrame::C_WASM_ENTRY: ...
      case StackFrame::WASM_COMPILED: ...
      case StackFrame::WASM_COMPILE_LAZY: ...
      case StackFrame::WASM_INTERPRETER_ENTRY: ...
      case StackFrame::STUB: ...
      case StackFrame::BUILTIN: ...
      case StackFrame::JAVA_SCRIPT_BUILTIN_CONTINUATION_WITH_CATCH: ...
      case StackFrame::OPTIMIZED: ...
      case StackFrame::INTERPRETED: ...
      default:
        // All other types can not handle exception.
        break;
  }

Here’s what we do for an interpreted frame. As you might suspect, the verbosity of comments indicates “hairiness”:

  case StackFrame::INTERPRETED: {
    // For interpreted frame we perform a range lookup in the handler table.
    if (!catchable_by_js) break;
    InterpretedFrame* js_frame = static_cast<InterpretedFrame*>(frame);
    int register_slots = InterpreterFrameConstants::RegisterStackSlotCount(
        js_frame->GetBytecodeArray().register_count());
    int context_reg = 0;  // Will contain register index holding context.

    int offset =
        js_frame->LookupExceptionHandlerInTable(&context_reg, nullptr);
    if (offset < 0) break;

    // Compute the stack pointer from the frame pointer. This ensures that
    // argument slots on the stack are dropped as returning would.
    // Note: This is only needed for interpreted frames that have been
    //       materialized by the deoptimizer. If there is a handler frame
    //       in between then {frame->sp()} would already be correct.
    Address return_sp = frame->fp() -
                        InterpreterFrameConstants::kFixedFrameSizeFromFp -
                        register_slots * kSystemPointerSize;

    // Patch the bytecode offset in the interpreted frame to reflect the
    // position of the exception handler. The special builtin below will
    // take care of continuing to dispatch at that position. Also restore
    // the correct context for the handler from the interpreter register.
    Context context =
        Context::cast(js_frame->ReadInterpreterRegister(context_reg));
    js_frame->PatchBytecodeOffset(static_cast<int>(offset));

    Code code =
        builtins()->builtin(Builtins::kInterpreterEnterBytecodeDispatch);
    return FoundHandler(context, code.InstructionStart(), 0,
                        code.constant_pool(), return_sp, frame->fp());
  }

What first popped out of the code for me was the call to the frame to look up the exception handler in the table. That’s where our offset 15 should appear as we search for a handler. The next interesting thing is the line js_frame->PatchBytecodeOffset(offset). Since we are dealing with bytecode we don’t have a machine instruction where our handler starts. Instead, we’ll be running the interpreter, which gets the offset from the frame.
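The idea of patching the frame and letting the interpreter carry on can be sketched with a tiny dispatch loop. Everything here is hypothetical: the program map is an abridged stand-in for bar's bytecode, and patchBytecodeOffset only mimics the role of the real PatchBytecodeOffset.

```javascript
// Hypothetical mini-interpreter. Each "bytecode" is just a name plus the
// offset that follows it; the handler body (offsets 20 to 41) is abridged.
const program = new Map([
  [9, { name: "CallUndefinedReceiver1", next: 13 }],
  [13, { name: "Jump", next: 43 }],
  [15, { name: "Star", next: 17 }],
  [17, { name: "CreateCatchContext", next: 43 }],
  [43, { name: "LdaUndefined", next: 44 }],
  [44, { name: "Return", next: null }],
]);

// The dispatch loop re-reads frame.offset on every step, so whoever patches
// that field decides what runs next.
function dispatch(frame, log) {
  while (frame.offset !== null) {
    const op = program.get(frame.offset);
    log.push(op.name);
    frame.offset = op.next;
  }
  return log;
}

// Stand-in for PatchBytecodeOffset: overwrite the offset saved in the frame.
function patchBytecodeOffset(frame, offset) {
  frame.offset = offset;
}

const frame = { offset: 9 };    // suspended at the call to foo
patchBytecodeOffset(frame, 15); // the unwinder found the handler at 15
const trace = dispatch(frame, []);
console.log(trace.join(" "));
// Star CreateCatchContext LdaUndefined Return
```

Note that the Jump at offset 13 never runs. The frame was suspended at the call, and when dispatch resumes it starts straight from the handler.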

The FoundHandler call is a helper executed on our way out, which sets those all-important return values somewhere that the CEntryStub can pick them up. Here it is:

  auto FoundHandler = [&](Context context, Address instruction_start,
                          intptr_t handler_offset,
                          Address constant_pool_address, Address handler_sp,
                          Address handler_fp) {
    // Store information to be consumed by the CEntry.
    thread_local_top()->pending_handler_context_ = context;
    thread_local_top()->pending_handler_entrypoint_ =
        instruction_start + handler_offset;
    thread_local_top()->pending_handler_constant_pool_ = constant_pool_address;
    thread_local_top()->pending_handler_fp_ = handler_fp;
    thread_local_top()->pending_handler_sp_ = handler_sp;

    // Return and clear pending exception. The contract is that:
    // (1) the pending exception is stored in one place (no duplication), and
    // (2) within generated-code land, that one place is the return register.
    // If/when we unwind back into C++ (returning to the JSEntry stub,
    // or to Execution::CallWasm), the returned exception will be sent
    // back to isolate->set_pending_exception(...).
    clear_pending_exception();
    return exception;
  };

You can also see that the exception is no longer pending. It is real now, on its way, speeding like a bullet to the chosen handler.
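A small model of that hand-off, with loud caveats: threadLocalTop and its fields are hypothetical stand-ins, and the start address is only inferred from the optimized-code disassembly of bar later in this article (address 0x36f8277c printed next to offset 7c).

```javascript
// Hypothetical stand-in for the per-thread state; field names are invented.
const threadLocalTop = {
  pendingException: new Error("Oh no."),
  pendingHandlerEntrypoint: null,
};

function foundHandler(top, instructionStart, handlerOffset) {
  // Publish where to resume, for the CEntry-like caller to read.
  top.pendingHandlerEntrypoint = instructionStart + handlerOffset;
  // Move the exception out of the pending slot so it lives in one place only:
  // from here on it travels as the return value.
  const exception = top.pendingException;
  top.pendingException = null;
  return exception;
}

// Assumed code start 0x36f82700, handler offset 0xae (the optimized bar).
const inFlight = foundHandler(threadLocalTop, 0x36f82700, 0xae);
console.log(inFlight.message);                                     // Oh no.
console.log(threadLocalTop.pendingException);                      // null
console.log(threadLocalTop.pendingHandlerEntrypoint.toString(16)); // 36f827ae
```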

Let’s have a quick look at how we deal with handlers in optimized code:

  case StackFrame::OPTIMIZED: {
    // For optimized frames we perform a lookup in the handler table.
    if (!catchable_by_js) break;
    OptimizedFrame* js_frame = static_cast<OptimizedFrame*>(frame);
    Code code = frame->LookupCode();

    int offset = js_frame->LookupExceptionHandlerInTable(nullptr, nullptr);
    if (offset < 0) break;

    // Compute the stack pointer from the frame pointer. This ensures
    // that argument slots on the stack are dropped as returning would.
    Address return_sp = frame->fp() +
                        StandardFrameConstants::kFixedFrameSizeAboveFp -
                        code.stack_slots() * kSystemPointerSize;

    ...

    return FoundHandler(Context(), code.InstructionStart(), offset,
                        code.constant_pool(), return_sp, frame->fp());
  }

The table used by the frame is the same “in spirit” as the table for an interpreted version of the function, only its offsets point into machine code rather than bytecode. Cool.

I made sure function bar got optimized, then had a look at the code. Its handler table looks like this:

Handler Table (size = 1)
  offset   handler
      84  ->    ae

Let’s see if that makes sense to us. Hmm:

0x36f8277c    7c  mov ecx,[edi+0x17]
0x36f8277f    7f  add ecx,0x3f
0x36f82782    82  call ecx
0x36f82784    84  mov eax,0x5ae80279 ;; object: <undefined>
0x36f82789    89  mov esp,ebp
0x36f8278b    8b  pop ebp
0x36f8278c    8c  ret 0x8

0x36f8278f    8f  mov eax,0xb8
0x36f82794    94  push eax
0x36f82796    96  mov eax,0x1
0x36f8279b    9b  mov edx,0xf7071690 ;; Runtime::StackGuardWithGap
0x36f827a0    a0  mov esi,0x2d8c0c61 ;; object: <NativeContext[261]>
0x36f827a5    a5  mov ecx,eax
0x36f827a7    a7  call 0xf580b760  (CEntry)
0x36f827ac    ac  jmp 0x36f8274a  <+0x4a>

0x36f827ae    ae  mov ecx,0x5ae802f5 ;; object: <the_hole>
0x36f827b3    b3  mov [ebx+0x18d8],ecx
0x36f827b9    b9  mov edi,0x2a9c0c21 ;; object: <JSGlobal Object>
0x36f827be    be  push edi
0x36f827c0    c0  push eax
0x36f827c2    c2  mov [ebp-0x10],eax
0x36f827c5    c5  mov edx,0x566ed920 ;; external reference (<unknown>)
0x36f827ca    ca  mov esi,0x2d8c0c61 ;; object: <NativeContext[261]>
0x36f827cf    cf  mov ecx,0x1
0x36f827d4    d4  mov eax,0x5ae80279 ;; object: <undefined>
0x36f827d9    d9  call 0xf50af0a0  (CallApiCallback)
0x36f827de    de  jmp 0x36f82784  <+0x84>

Okay, offset 84 is actually right after the call to foo. I’d feel like I knew what was going on if the offset started at 82, but on the other hand, the truth is that nothing else in here can throw. While we are inside the call to foo, the return address sitting on the stack for this frame is offset 84, so conceptually that is where we are, and I guess the lookup of the handler should succeed.
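To convince myself, here is the arithmetic as a sketch. I am reading the table as keyed by return address, which is my interpretation of the listing rather than something the printout states; the helper name is made up.

```javascript
// Optimized-code handler table from the listing: offset -> handler.
const optimizedHandlers = new Map([[0x84, 0xae]]);

// An x86 call pushes the address of the instruction that follows it.
// Hypothetical helper: compute that return offset from the call's position.
function returnOffsetOf(callOffset, callLength) {
  return callOffset + callLength;
}

// "call ecx" sits at 0x82 and the next instruction starts at 0x84,
// so the call is 2 bytes long.
const ret = returnOffsetOf(0x82, 2);
const handler = optimizedHandlers.get(ret);
console.log(ret.toString(16), handler.toString(16)); // 84 ae

// Looking up the call's own offset finds nothing in this model.
console.log(optimizedHandlers.has(0x82)); // false
```

Under that reading, 84 is the only offset at which this frame can be observed while foo is running, so a single table entry is enough.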

If an exception doesn’t happen, we’ll tear down the frame and exit the function at offset 8c. Offsets 8f to ac are unrelated to us; they deal with the stack guard and can be ignored (though Jakob Gruber is doing interesting work right now to repair holes in our stack guard logic, which protects against unexpected stack overflow during optimized function deoptimization!).

Offset ae does appear to be our catch handler, though it’s a bit disguised. It’s making an API call, after which it joins the normal exit of the function. The API call must be my print(e);. Neat!

There is more to say. For one thing, what is a scheduled exception? It has to do with API boundaries. More next time, thanks for joining me!