

Exception Handling

Note, exception handling in g++ is still under development.

This section describes how C++ exceptions in the C++ front-end are mapped onto the back-end exception handling framework.

The basic mechanism of exception handling in the back-end is unwind-protect, a la elisp. This is a general, robust, and language-independent representation for exceptions.

C++ exceptions are mapped into the unwind-protect semantics by the C++ front-end. The mapping is described below.

When -frtti is used, rtti is used to do exception object type checking; when it isn't used, the encoded name for the type of the object being thrown is used instead. All code that originates exceptions (even code that throws exceptions as a side effect, like dynamic casting) and all code that catches exceptions must be compiled consistently with either -frtti or -fno-rtti. It is not possible to mix rtti-based exception handling objects with code that doesn't use rtti. The exceptions to this are code that doesn't catch or throw exceptions, catch (...), and code that just rethrows an exception.
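As an illustration of code that throws "as a side effect", the following minimal sketch shows a reference dynamic_cast raising std::bad_cast when the cast fails. The class names and the helper cast_throws are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <typeinfo>

struct Base { virtual ~Base() {} };
struct Derived : Base {};
struct Other : Base {};

// Returns true if the reference cast failed and raised std::bad_cast.
bool cast_throws(Base& b) {
    try {
        Derived& d = dynamic_cast<Derived&>(b);  // throws on a type mismatch
        (void)d;
        return false;
    } catch (const std::bad_cast&) {
        return true;
    }
}
```

Calling cast_throws on an Other object reports a throw, while calling it on a Derived object does not; no throw-expression appears in the source, which is why such code still counts as originating exceptions.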

Currently we use the normal mangling used in building function names (int is "i", const char * is "PCc") to build the non-rtti type descriptors for exception handling. These descriptors are just plain null-terminated strings, and internally they are passed around as char *.
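Because the non-rtti descriptors are plain strings, an exact type match can be pictured as a string comparison. The sketch below is only an illustration under that assumption; descriptors_match is a hypothetical name, not a function in g++ or its runtime.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: two encoded type descriptors match only when the
// strings are identical (exact type matching, no conversions).
bool descriptors_match(const char* thrown, const char* handler) {
    assert(thrown != nullptr && handler != nullptr);
    return std::strcmp(thrown, handler) == 0;
}
```

With the encodings quoted above, descriptors_match("i", "i") succeeds, while descriptors_match("PCc", "i") fails because a const char * is not an int.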

In C++, all cleanups should be protected by exception regions. The region starts just after the reason why the cleanup is created has ended. For example, with an automatic variable that has a constructor, it would be right after the constructor is run. The region ends just before the finalization is expanded. The backend may expand the cleanup multiple times along different paths: once for the normal end of the region, once for non-local gotos, once for returns, and so on. It must therefore take special care to protect the finalization expansion if the expansion is for any reason other than normal region end and it is `inline' (that is, inside the exception region). The backend can either move such expansions out of line, or it can create an exception region over the finalization to protect it; the handler associated with that region would not run the finalization as it otherwise would have, but would rather just rethrow to the outer handler, careful to skip the normal handler for the original region.
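Seen from the source level, the cleanup region for an automatic variable begins once its constructor has finished. The following sketch (all names are hypothetical) records construction and destruction order to show the cleanup running both on the normal path and during unwinding.

```cpp
#include <cassert>
#include <string>

std::string trace;  // records the order of events

struct Guard {
    char id;
    explicit Guard(char c) : id(c) { trace += id; }     // lower case: constructed
    ~Guard() { trace += static_cast<char>(id - 32); }   // upper case: destroyed
};

void body(bool do_throw) {
    Guard a('a');            // the cleanup region for a starts after this line
    if (do_throw) throw 1;   // unwinding runs the cleanup for a only
    Guard b('b');            // never constructed on the throwing path
}                            // normal exit: b is destroyed, then a

std::string run(bool do_throw) {
    trace.clear();
    try { body(do_throw); } catch (int) { trace += '!'; }
    return trace;
}
```

On the throwing path only a is cleaned up, because b's region was never entered; on the normal path both cleanups run in reverse order of construction.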

In Ada, they will use the more runtime-intensive approach of having fewer regions, at the cost of additional work at run time to keep a list of things that need cleanups. When a variable has finished construction, they add the cleanup to the list; when they come to the end of the lifetime of the variable, they run the list down. If they take a hit before the section finishes normally, they examine the list for actions to perform. I hope they add this logic into the back-end, as it would be nice to get that alternative approach in C++.
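The list-based approach can be sketched in C++ as follows. This is only an illustration of the idea of a run-time cleanup list; CleanupList and simulate are hypothetical names and do not reflect how any Ada compiler implements it.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// A run-time list of pending cleanups, run down in reverse order.
class CleanupList {
    std::vector<std::function<void()>> actions_;
public:
    void add(std::function<void()> f) { actions_.push_back(std::move(f)); }
    void run_down() {
        while (!actions_.empty()) {
            std::function<void()> f = std::move(actions_.back());
            actions_.pop_back();
            f();
        }
    }
    std::size_t pending() const { return actions_.size(); }
};

// Construct x, and y unless we "take a hit" first; then run the list down.
std::string simulate(bool fail_midway) {
    std::string log;
    CleanupList list;
    log += "x"; list.add([&log] { log += "X"; });
    if (!fail_midway) {
        log += "y"; list.add([&log] { log += "Y"; });
    }
    list.run_down();
    assert(list.pending() == 0);
    return log;
}
```

Only the cleanups registered so far are performed, so a failure before y is constructed runs the cleanup for x alone.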

On an rs6000, xlC stores exception objects on the stack, under the try block. When it unwinds down into a handler, the frame pointer is adjusted back to the normal value for the frame in which the handler resides, and the stack pointer is left unchanged from the time at which the object was thrown. This is so that there is always someplace for the exception object, and nothing can overwrite it once we start throwing. The only bad part is that the stack remains large.

The following points out some things that work in g++'s exception handling.

All completely constructed temps and local variables are cleaned up in all unwound scopes.

Completely constructed parts of partially constructed objects are cleaned up. This includes partially built arrays.

Exception specifications are now handled.

Thrown objects are now cleaned up all the time.

We can now tell if we have an active exception being thrown or not (__eh_type != 0). We use this to call terminate if someone does a throw; without there being an active exception object.

uncaught_exception () works.

Exception handling should work right if you optimize.

Exception handling should work with -fpic or -fPIC.
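The cleanup of partially built arrays can be observed with a small test like the one below. It is a sketch with hypothetical names: the third element's constructor throws, and the two elements that were completely constructed must be destroyed.

```cpp
#include <cassert>
#include <stdexcept>

int live = 0;         // elements currently alive
int constructed = 0;  // successful constructions so far

struct Elem {
    Elem() {
        if (constructed == 2) throw std::runtime_error("third element fails");
        ++constructed;
        ++live;
    }
    ~Elem() { --live; }
};

// Returns false if building the array threw.
bool build_array() {
    try {
        Elem arr[4];   // elements 0 and 1 succeed, element 2 throws
        (void)arr;
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}
```

After build_array returns false, constructed is 2 and live is back to 0, showing that exactly the completed elements were cleaned up.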

The following points out some flaws in g++'s exception handling, as it now stands.

Only exact type matching or reference matching of throw types works when -fno-rtti is used.

Only works on a SPARC (like Suns) (both -mflat and -mno-flat models work), SPARClite, Hitachi SH, i386, arm, rs6000, PowerPC, Alpha, mips, VAX, m68k and z8k machines. SPARC v9 may not work. HPPA is mostly done, but throwing between a shared library and user code doesn't yet work. Some targets have support for data-driven unwinding. Partial support is in for all other machines, but a stack unwinder called __unwind_function has to be written and added to libgcc2 for them. The new EH code doesn't rely upon the __unwind_function for C++ code; instead it creates per-function unwinders right inside the function. Unfortunately, on many platforms the definition of RETURN_ADDR_RTX in the tm.h file for the machine port is wrong. See below for details on __unwind_function.

RTL_EXPRs for EH cond variables for && and || exprs should probably be wrapped in UNSAVE_EXPRs, and RTL_EXPRs tweaked so that they can be unsaved.

We only do pointer conversions on exception matching a la 15.3 p2 case 3: `A handler with type T, const T, T&, or const T& is a match for a throw-expression with an object of type E if [3]T is a pointer type and E is a pointer type that can be converted to T by a standard pointer conversion (_conv.ptr_) not involving conversions to pointers to private or protected base classes.' when -frtti is given.
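The quoted rule can be illustrated with the sketch below, which shows the behavior the standard text describes: a thrown pointer to a derived class matches a handler for a pointer to a public base, but not a handler for a pointer to a private base. The type and function names are hypothetical.

```cpp
#include <cassert>

struct Base { int tag; };
struct Derived : Base {};          // public base: conversion allowed
struct Hidden : private Base {};   // private base: conversion not allowed

// Returns 1 if the Base* handler caught the pointer, 0 if only catch (...) did.
int classify_public() {
    static Derived d;
    try { throw &d; }
    catch (Base* p) { return p == &d ? 1 : 2; }
    catch (...) { return 0; }
}

int classify_private() {
    static Hidden h;
    try { throw &h; }
    catch (Base*) { return 1; }
    catch (...) { return 0; }
}
```

classify_public is caught by the Base* handler through a standard pointer conversion, whereas classify_private falls through to catch (...).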

We don't call delete on new expressions that die because the ctor threw an exception. See except/18 for a test case.
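A test for this case might look like the following sketch, which counts calls to a class-specific operator new and operator delete. It shows the behavior the language calls for (the storage is released when the constructor throws), not the except/18 test itself; all names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

int allocs = 0;  // calls to Fragile::operator new
int frees = 0;   // calls to Fragile::operator delete

struct Fragile {
    Fragile() { throw 42; }   // construction always fails
    static void* operator new(std::size_t n) {
        ++allocs;
        void* p = std::malloc(n);
        if (!p) throw std::bad_alloc();
        return p;
    }
    static void operator delete(void* p) { ++frees; std::free(p); }
};

// Returns false if the new expression died because the ctor threw.
bool try_new() {
    try {
        Fragile* f = new Fragile;
        delete f;
        return true;
    } catch (int) {
        return false;
    }
}
```

With conforming behavior, one allocation is matched by one deallocation even though the delete expression is never reached.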

15.2 para 13: The exception being handled should be rethrown if control reaches the end of a handler of the function-try-block of a constructor or destructor; right now, it is not.

15.2 para 12: If a return statement appears in a handler of a function-try-block of a constructor, the program is ill-formed, but this isn't diagnosed.

15.2 para 11: If the handlers of a function-try-block contain a jump into the body of a constructor or destructor, the program is ill-formed, but this isn't diagnosed.

15.2 para 9: Check that the fully constructed base classes and members of an object are destroyed before entering the handler of a function-try-block of a constructor or destructor for that object.
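The rethrow required by 15.2 para 13 can be checked with a sketch like the one below, which shows the behavior the standard text describes for a constructor's function-try-block. The names are hypothetical.

```cpp
#include <cassert>
#include <stdexcept>

int handler_runs = 0;

struct Member {
    Member() { throw std::runtime_error("member failed"); }
};

struct Widget {
    Member m;
    Widget() try : m() {
    } catch (const std::runtime_error&) {
        ++handler_runs;   // control reaches the end of the handler here,
    }                     // so the exception is rethrown implicitly
};

// Returns false if the exception escaped the constructor.
bool construct() {
    try {
        Widget w;
        (void)w;
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}
```

The handler runs exactly once, and the caller still sees the exception because falling off the end of the handler rethrows it.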

build_exception_variant should sort the incoming list, so that it implements set compares, not exact list equality. Type smashing should smash exception specifications using set union.

Thrown objects are usually allocated on the heap, in the usual way. If one runs out of heap space, throwing an object will probably never work. This could be relaxed some by passing an __in_chrg parameter to track who has control over the exception object. Thrown objects are not allocated on the heap when they are pointer to object types. We should extend it so that all small (<4*sizeof(void*)) objects are stored directly, instead of allocated on the heap.

When the backend returns a value, it can create new exception regions that need protecting. The new region should rethrow the object in context of the last associated cleanup that ran to completion.

The structure of the code that is generated for C++ exception handling code is shown below:

Ln:					throw value;
        copy value onto heap
        jump throw (Ln, id, address of copy of value on heap)

                                        try {
+Lstart:	the start of the main EH region
|...						...
+Lend:		the end of the main EH region
                                        } catch (T o) {
						...1
                                        }
Lresume:
        nop	used to make sure there is something before
                the next region ends, if there is one
...                                     ...

        jump Ldone
[
Lmainhandler:    handler for the region Lstart-Lend
	cleanup
] zero or more, depending upon automatic vars with dtors
+Lpartial:
|        jump Lover
+Lhere:
        rethrow (Lhere, same id, same obj);
Lterm:		handler for the region Lpartial-Lhere
        call terminate
Lover:
[
 [
        call throw_type_match
        if (eq) {
 ] these lines disappear when there is no catch condition
+Lsregion2:
|	...1
|	jump Lresume
|Lhandler:	handler for the region Lsregion2-Leregion2
|	rethrow (Lresume, same id, same obj);
+Leregion2
        }
] there are zero or more of these sections, depending upon how many
  catch clauses there are
----------------------------- expand_end_all_catch --------------------------
                here we have fallen off the end of all catch
                clauses, so we rethrow to outer
        rethrow (Lresume, same id, same obj);
----------------------------- expand_end_all_catch --------------------------
[
L1:     maybe throw routine
] depending upon if we have expanded it or not
Ldone:
        ret

start_all_catch emits labels: Lresume, 
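At the source level, the layout above corresponds to a try block with a chain of catch clauses, where falling off the end of all clauses rethrows to the outer handler. The sketch below (hypothetical names) traces which path is taken for a matching throw, a non-matching throw, and no throw.

```cpp
#include <cassert>
#include <string>

std::string dispatch(int which) {
    std::string path;
    try {
        try {
            if (which == 0) throw 1;      // matches catch (int)
            if (which == 1) throw 2.5;    // matches no inner clause
            path += "no-throw;";
        } catch (int) {
            path += "inner-int;";
        } catch (const char*) {
            path += "inner-cstr;";
        }
        path += "resume;";                // plays the role of Lresume
    } catch (double) {
        path += "outer-double;";          // reached by the rethrow to outer
    }
    return path;
}
```

A throw that matches no inner clause skips the code after the inner try block entirely and lands in the outer handler.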

The __unwind_function takes a pointer to the throw handler, and is expected to pop the stack frame that was built to call it, as well as the frame underneath, and then jump to the throw handler. It must restore all registers to their proper values as well as all other machine state as determined by the context into which we are unwinding. The way I normally start is to compile:

void *g; void foo(void *a) { g = a; }

with -S, and change the thing that alters the PC (return, or ret usually) to not alter the PC, making sure to leave all other semantics (like adjusting the stack pointer, or frame pointers) in. After that, replicate the prologue once more at the end, again, changing the PC altering instructions, and finally, at the very end, jump to `g'.

It takes about a week to write this routine. If someone wants to volunteer to write this routine for any architecture, exception support for that architecture will be added to g++. Please send in those code donations. One other thing that needs to be done is to double check that __builtin_return_address (0) works.
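A first sanity check of __builtin_return_address (0) on a new port could be as small as the sketch below. It relies on the GCC builtin and the noinline attribute (both GCC extensions, also accepted by Clang); the function names are hypothetical, and a non-null result is only a necessary condition, not proof that the value is correct.

```cpp
#include <cassert>

// noinline keeps a real call frame so there is a return address to read.
__attribute__((noinline)) void* caller_address() {
    return __builtin_return_address(0);
}

bool return_address_available() {
    return caller_address() != nullptr;
}
```

A fuller check would compare the returned address against the call site in the generated assembly.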

Specific Targets

For the alpha, the __unwind_function will be something resembling:

void
__unwind_function(void *ptr)
{
  /* First frame */
  asm ("ldq $15, 8($30)"); /* get the saved frame ptr; 15 is fp, 30 is sp */
  asm ("bis $15, $15, $30"); /* reload sp with the fp we found */

  /* Second frame */
  asm ("ldq $15, 8($30)"); /* fp */
  asm ("bis $15, $15, $30"); /* reload sp with the fp we found */

  /* Return */
  asm ("ret $31, ($16), 1"); /* return to PTR, stored in a0 */
}

However, there are a few problems preventing it from working. First of all, the gcc-internal function __builtin_return_address needs to work given an argument of 0 for the alpha. As it stands as of August 30th, 1995, the code for BUILT_IN_RETURN_ADDRESS in `expr.c' will definitely not work on the alpha. Instead, we need to define the macros DYNAMIC_CHAIN_ADDRESS (maybe), RETURN_ADDR_IN_PREVIOUS_FRAME, and definitely need a new definition for RETURN_ADDR_RTX.

In addition (and more importantly), we need a way to reliably find the frame pointer on the alpha. The use of the value 8 above to restore the frame pointer (register 15) is incorrect. On many systems, the frame pointer is consistently offset to a specific point on the stack. On the alpha, however, the frame pointer is pushed last. First the return address is stored, then any other registers are saved (e.g., s0), and finally the frame pointer is put in place. So fp could have an offset of 8, but if the calling function saved any registers at all, they add to the offset.

The only places the frame size is noted are with the `.frame' directive, for use by the debugger and the OSF exception handling model (useless to us), and in the initial computation of the new value for sp, the stack pointer. For example, the function may start with:

lda $30,-32($30)
.frame $15,32,$26,0

The 32 above is exactly the value we need. With this, we can be sure that the frame pointer is stored 8 bytes less--in this case, at 24(sp). The drawback is that there is no way that I (Brendan) have found to let us discover the size of a previous frame inside the definition of __unwind_function.
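The offset arithmetic described here is simple enough to state as a one-line helper. This is a sketch with a hypothetical function name; the constant 8 is the one given in the text above.

```cpp
#include <cassert>

// Offset from sp at which the frame pointer was saved, given the frame
// size that appears in the `.frame' directive.
long saved_fp_offset(long frame_size) {
    assert(frame_size >= 8);
    return frame_size - 8;
}
```

For the 32-byte frame in the example, saved_fp_offset(32) yields 24, matching 24(sp).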

So to accomplish exception handling support on the alpha, we need two things: first, a way to figure out where the frame pointer was stored, and second, a functional __builtin_return_address implementation for except.c to be able to use it.

Or just support DWARF 2 unwind info.

New Backend Exception Support

This subsection discusses various aspects of the design of the data-driven model being implemented for the exception handling backend.

The goal is to generate enough data during the compilation of user code, such that we can dynamically unwind through functions at run time with a single routine (__throw) that lives in libgcc.a, built by the compiler, and dispatch into associated exception handlers.

This information is generated by the DWARF 2 debugging backend, and includes all of the information __throw needs to unwind an arbitrary frame. It specifies where all of the saved registers and the return address can be found at any point in the function.

Major disadvantages when enabling exceptions are:

Backend Exception Support

The backend must be extended to fully support exceptions. Right now there are a few hooks from the backend into the alpha exception handling backend that resides in the C++ frontend; these hooks allow exception handling to work in g++. An exception region is a segment of generated code that has a handler associated with it. The exception regions are represented in the generated code as address ranges, each given by a starting PC value and an ending PC value. Some of the limitations with this scheme are:

The above is not meant to be exhaustive, but does include all things I have thought of so far. I am sure other limitations exist.
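The address-range representation of exception regions described in this subsection can be sketched as a table lookup. The structure layout, the half-open interpretation of the ending PC, the integer handler ids, and the choice of the smallest enclosing region are all assumptions made for illustration; they are not the actual g++ data structures.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Region {
    std::uintptr_t start;  // starting PC of the region
    std::uintptr_t end;    // ending PC, treated as exclusive (assumption)
    int handler;           // hypothetical handler id
};

// Returns the handler of the innermost (smallest) region containing pc,
// or -1 if no region covers pc.
int find_handler(const std::vector<Region>& table, std::uintptr_t pc) {
    int best = -1;
    std::uintptr_t best_size = 0;
    bool found = false;
    for (const Region& r : table) {
        if (pc < r.start || pc >= r.end) continue;
        std::uintptr_t size = r.end - r.start;
        if (!found || size < best_size) {
            best = r.handler;
            best_size = size;
            found = true;
        }
    }
    return best;
}
```

With a region 100-200 handled by 1 and a nested region 120-140 handled by 2, a PC of 130 selects handler 2 and a PC of 150 selects handler 1.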

Below are some notes on the migration of the exception handling code backend from the C++ frontend to the backend.

NOTEs are to be used to denote the start of an exception region, and the end of the region. I presume that the interface used to generate these notes in the backend would be two functions, start_exception_region and end_exception_region (or something like that). The frontends are required to call them in pairs. When marking the end of a region, an argument can be passed to indicate the handler for the marked region. This can be passed in many ways; currently a tree is used. Another possibility would be insns for the handler, or a label that denotes a handler. I have a feeling insns might be the best way to pass it. The semantics are: if an exception is thrown inside the region, control is transferred unconditionally to the handler. If control passes through the handler, then the backend is to rethrow the exception, in the context of the end of the original region. The handler is protected by the conventional mechanisms; it is the frontend's responsibility to protect the handler, if special semantics are required.
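The region semantics just described can be modeled at the source level as follows. This is a sketch only: run_region and demo are hypothetical names and are not the start_exception_region / end_exception_region interface, which operates on the backend's internal representation rather than on C++ source.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Models one exception region with an associated handler.
void run_region(const std::function<void()>& body,
                const std::function<void()>& handler) {
    try {
        body();
    } catch (...) {
        handler();  // control is transferred unconditionally to the handler
        throw;      // control passed through the handler: rethrow outward
    }
}

std::string demo(bool do_throw) {
    std::string log;
    try {
        run_region([&] { log += "body;"; if (do_throw) throw 7; },
                   [&] { log += "cleanup;"; });
        log += "end;";
    } catch (int) {
        log += "outer;";
    }
    return log;
}
```

When the body throws, the handler runs and the exception then continues to the outer handler; when it does not, the handler is never entered.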

This is a very low level view, and it would be nice if the backend supported a somewhat higher level view in addition to this view. This higher level could include source line number, name of the source file, name of the language that threw the exception and possibly the name of the exception. Kenner may want to rope you into doing more than just the basics required by C++. You will have to resolve this. He may want you to do support for non-local gotos: first scan for an exception handler and, if none is found, allow the debugger to be entered without any cleanups being done. To do this, the backend would have to know the difference between a cleanup-rethrower and a real handler; it would also have to have a way to know if a handler `matches' a thrown exception, and this is frontend specific.

The stack unwinder is one of the hardest parts to do. It is highly machine dependent. The form that Kenner seems to like was a couple of macros that would do the machine dependent grunt work. One preexisting function that might be of some use is __builtin_return_address (). One macro he seemed to want was __builtin_return_address, and the other would do the hard work of fixing up the registers, adjusting the stack pointer, frame pointer, arg pointer and so on.

