Tuesday, May 5, 2009

MultiJava: Support of "Multiple Dispatch" in Java

Multiple dispatch allows the code invoked by a method call to depend on the run-time type of all the arguments, instead of just the receiver. This is useful for event handlers and for binary methods, like equals in Java.
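For contrast, plain Java selects among overloaded methods using only the static (compile-time) types of the arguments. The following small sketch illustrates this; the class and method names are made up for the illustration and are not part of MultiJava.

```java
// Plain Java: the overload is chosen at compile time from the static argument type.
class OverloadDemo {
    static String describe(Object o) { return "Object"; }
    static String describe(String s) { return "String"; }

    public static void main(String[] args) {
        Object o = "hello";                     // static type Object, run-time type String
        System.out.println(describe(o));        // prints "Object"
        System.out.println(describe("hello"));  // prints "String"
    }
}
```

With multiple dispatch, the first call would instead be routed according to the run-time type of the argument.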

MultiJava introduces a new keyword, resend, which allows a (multi)method to invoke the directly overridden (multi)method. In this way, resends generalize Java's super construct to handle multimethods. For backward compatibility with Java, super in MultiJava has the same semantics as in Java. This document describes MultiJava's resend facility and compares it with super. Familiarity with other aspects of MultiJava is assumed. Note that this document supersedes the discussion of super and "upcalls" in our OOPSLA 2000 paper and in Curtis Clifton's master's thesis.

We begin with a simple example.

class C {
    void m(Object o) {
        System.out.println("got a C and an Object");
    }
}
class D extends C {
    void m(final Object o) {
        System.out.println("got a D and an Object");
        super.m(o);
        this.resend(o);
    }
}

Now consider the following client code.

D d = new D();
d.m("hello");

When this code is run, the output looks as follows:

got a D and an Object
got a C and an Object
got a C and an Object

In other words, the super invocation and the resend both cause the overridden m method in C to be invoked.

The difference between super and resend becomes apparent in the presence of multimethods. Suppose class D is modified as follows:

class D extends C {
    void m(Object o) {
        System.out.println("got a D and an Object");
    }
    void m(final Object@String s) {
        System.out.println("got a D and a String");
        super.m(s);
        this.resend(s);
    }
}

On the same client code as used in the original example, we now get the following output:

got a D and a String
got a C and an Object
got a D and an Object

A super send always invokes a method in some superclass of the enclosing class. More specifically, a super send will invoke the most-specific method for the given actual arguments that appears in some (possibly transitive) superclass of the enclosing class. On the other hand, resend simply invokes the most-specific overridden method of the enclosing method, even if that overridden method exists in the same class.
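One way to picture this difference is to emulate the modified class D by hand in plain Java. The sketch below is only an illustration under our own assumptions: the helper names mObject and mString and the log parameter are hypothetical, and this is not meant to show the code that mjc generates.

```java
// Hand-written emulation of the super/resend example in plain Java.
class ResendSketch {
    static class C {
        void m(Object o, StringBuilder log) {
            log.append("got a C and an Object\n");
        }
    }

    static class D extends C {
        @Override
        void m(Object o, StringBuilder log) {
            // Manual dispatch on the run-time type of the argument.
            if (o instanceof String) {
                mString((String) o, log);
            } else {
                mObject(o, log);
            }
        }

        // Plays the role of D's m(Object o).
        private void mObject(Object o, StringBuilder log) {
            log.append("got a D and an Object\n");
        }

        // Plays the role of D's m(Object@String s).
        private void mString(String s, StringBuilder log) {
            log.append("got a D and a String\n");
            super.m(s, log);  // like super.m(s): jumps straight to the superclass C
            mObject(s, log);  // like resend(s): the overridden method, here in D itself
        }
    }

    static String run() {
        StringBuilder log = new StringBuilder();
        new D().m("hello", log);
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.print(run());
    }
}
```

Running the sketch produces the same three lines as the MultiJava version: the super-like call skips D's Object case, while the resend-like call reaches it.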

In situations when super and resend would have different behavior, as in the above example, we strongly recommend using resend. Using super may have the unintended consequence of skipping over some relevant methods. In these cases, the mjc compiler issues a caution, which kindly suggests that a super send be replaced with a resend.

Here's one last example.

class C {
    void m(Object o) {
        System.out.println("got a C and an Object");
    }
    void m(Object@String s) {
        System.out.println("got a C and a String");
    }
}
class D extends C {
    void m(Object o) {
        System.out.println("got a D and an Object");
    }
    void m(final Object@String s) {
        System.out.println("got a D and a String");
        super.m(s);
        this.resend(s);
    }
}

Now what will be output when our sample code is executed? Of course, first we'll have

got a D and a String

From the super invocation, we'll then have

got a C and a String

since the second m method in C is the most-specific method for the actual arguments that is declared in some superclass of D.

What will happen when the resend is executed? Actually, the resend (and all other statements) will never be executed, because a compile-time error will occur when D is compiled. The problem is that there is no unique most-specific overridden method for the resend in the second m method: both the first method in D and the second method in C are overridden, but neither is more specific than the other. In general, a resend will only typecheck successfully if a unique most-specific overridden method is found at compile-time.

The resend mechanism is less flexible than super in certain ways. In particular, Java allows super to be used to invoke methods that do not override the current one. For example, super can be used to invoke a completely different operation from the enclosing one. On the other hand, resend will always invoke a method that is overridden by the current one. To ensure this is the case, the arguments to a resend invocation must obey some rules.

  • The receiver argument to a resend must be this. Because of this restriction, as a convenience we allow the leading this. of a resend to be omitted. For example, the resend in the last version of class D above could be written resend(s); instead. This abbreviated form is also necessary when employing resend within a static method, which has no associated this instance.
  • The ith actual argument to the resend must be exactly the ith formal name of the enclosing method. (In the future, this restriction will be relaxed to allow arbitrary actual arguments at unspecialized argument positions.) To ensure that the method invoked by the resend is the next most-specific method, we need to prevent assignment to the formals. We achieve this by requiring the formals to be declared final in the enclosing method, as illustrated in our earlier examples. A compile-time error results if a formal is used in a resend but is not declared final (even if the code does not actually change it).

Value Dispatching

Value dispatching allows values of primitive types to be dynamically dispatched upon, providing a form of functional-style pattern matching. It is best illustrated with an example.

class C {
    void m(int i) {
        System.out.println("got an integer");
    }
    void m(int@@3 i) {
        System.out.println("got 3");
    }

    public static void main(String[] args) {
        C c = new C();
        c.m(4);
        c.m(3);
    }
}

The first m method in C accepts any integer as an argument. The second m method overrides the first one and will be invoked if the actual argument dynamically has the value 3. Note that value dispatching is signified by @@, as opposed to the single @ used for ordinary multimethod dispatching.

When class C is compiled and executed, the output is as follows:

got an integer
got 3
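For comparison, the behavior of the two m methods corresponds roughly to the following hand-written test in plain Java. This is only a sketch of the observable behavior; the class name is hypothetical and the methods return strings instead of printing so that the result is easy to check.

```java
// Plain-Java equivalent of dispatching on the value 3.
class ValueDispatchSketch {
    static String m(int i) {
        if (i == 3) {
            return "got 3";        // role of m(int@@3 i)
        }
        return "got an integer";   // role of m(int i)
    }

    public static void main(String[] args) {
        System.out.println(m(4));  // prints "got an integer"
        System.out.println(m(3));  // prints "got 3"
    }
}
```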

As with ordinary MultiJava multimethods, the compiler statically checks for potential incompletenesses and ambiguities. For example, an incompleteness error would be signaled at compile-time if the first m method above were removed, since integers other than 3 would not be handled. Similarly, if another m method dispatching on 3 were added to C, an ambiguity error would be signaled at compile-time.

Aside from using a literal (like 3) after the @@, in general any compile-time constant expression (as defined by the Java language) of the appropriate type may be used. A common idiom is to employ static, final fields. Here's an example:

class C {
    static final int INITIALIZED = 0;
    static final int RUNNING = 1;
    static final int STOPPED = 2;
    void m(int i) {
        // the default method
    }
    void m(int@@INITIALIZED i) {
        // handle the case when we're in the initialized "state"
    }
    void m(int@@RUNNING i) {
        // handle the case when we're in the running "state"
    }
    void m(int@@STOPPED i) {
        // handle the case when we're in the stopped "state"
    }
}

Value dispatching is supported for all of Java's primitive types, as well as for java.lang.String (e.g. String@@"hi"). And of course, multiple arguments in a method may use value dispatching, and value dispatching may be used in the same method with ordinary (class) dispatching.
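Readers who know Java's switch statement may find the analogy helpful: its case labels must likewise be compile-time constants, and it also accepts Strings. The sketch below writes the constant-based dispatch from the earlier example as a switch in plain Java; the class name, the method names, and the returned strings are our own placeholders.

```java
// Plain-Java analogue of value dispatching on constants and on Strings.
class StateSwitchSketch {
    static final int INITIALIZED = 0;
    static final int RUNNING = 1;
    static final int STOPPED = 2;

    static String m(int state) {
        switch (state) {
            case INITIALIZED: return "initialized";  // role of m(int@@INITIALIZED i)
            case RUNNING:     return "running";      // role of m(int@@RUNNING i)
            case STOPPED:     return "stopped";      // role of m(int@@STOPPED i)
            default:          return "default";      // role of m(int i)
        }
    }

    static String greet(String s) {
        switch (s) {
            case "hi": return "matched hi";          // role of a String@@"hi" method
            default:   return "some other string";
        }
    }

    public static void main(String[] args) {
        System.out.println(m(RUNNING));
        System.out.println(m(7));
        System.out.println(greet("hi"));
    }
}
```

Unlike the switch, the MultiJava methods remain separate declarations that can be overridden and inherited individually.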

Event Dispatching

This mini-tutorial shows how MultiJava's support for multiple dispatch can be used to make event handling code easier to write, understand, and maintain.

In a typical event-based system, clients register in some fashion to be notified when events occur. In an object-oriented implementation, event notification consists of invoking a designated method of the client. To make this work, there is some abstract class or interface that all clients wishing to act as event handlers must implement. In our running example, we use the following class for that purpose:

public abstract class AbstractHandler {
    public abstract void handleEvent(Event e);
}

Clients subclass from AbstractHandler, providing a concrete implementation of handleEvent, which is invoked when an event occurs, passing the event as an argument. This event is an instance of some subclass of the top-level Event class (whose definition is not shown). To determine how to react to the event, clients first have to figure out which kind of event was received. In Java, this tends to look (in the simplest case) as follows:

public class MyJavaHandler extends AbstractHandler {
    public void handleEvent(Event e) {
        if (e instanceof Event1) {
            Event1 e1 = (Event1) e;
            // handle Event1
        } else if (e instanceof Event2) {
            Event2 e2 = (Event2) e;
            // handle Event2
        ...
        } else if (e instanceof EventN) {
            EventN eN = (EventN) e;
            // handle EventN
        } else {
            // handle the case when an unexpected event was sent
        }
    }
}

This way of implementing event handlers is undesirable for several reasons. The logic for dispatching to the appropriate "handler" for a particular kind of event must be hand-coded using a monolithic if statement. If the various Event subclasses form a hierarchy, the implementer must ensure that the cases of the if are in the right order, such that the most-specific handler for each event will be chosen. The use of run-time type testing (instanceof) and casting is tedious to implement and error-prone.
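The ordering problem is easy to reproduce. In the following sketch, SpecialEvent1 is a hypothetical subclass of Event1 that we introduce only for illustration; because the more general test comes first, the more specific branch can never be taken, and the compiler does not complain.

```java
// Demonstrates how a misordered instanceof chain silently picks the wrong handler.
class OrderingPitfall {
    static class Event {}
    static class Event1 extends Event {}
    static class SpecialEvent1 extends Event1 {}  // hypothetical, more specific event

    static String handle(Event e) {
        if (e instanceof Event1) {
            return "handled as Event1";
        } else if (e instanceof SpecialEvent1) {
            // Never reached: every SpecialEvent1 is also an Event1.
            return "handled as SpecialEvent1";
        } else {
            return "unexpected event";
        }
    }

    public static void main(String[] args) {
        System.out.println(handle(new SpecialEvent1()));  // prints "handled as Event1"
    }
}
```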

Here's how the same event handler can be written in MultiJava:

public class MyMultiJavaHandler extends AbstractHandler {
    public void handleEvent(Event@Event1 e1) {
        // handle Event1
    }
    public void handleEvent(Event@Event2 e2) {
        // handle Event2
    }
    ...
    public void handleEvent(Event@EventN eN) {
        // handle EventN
    }
    public void handleEvent(Event e) {
        // handles the case when an unexpected event was sent
    }
}

The only extension to Java used above is the Type1@Type2 syntax for an argument type. An argument with such a type is called a specialized argument, and the associated method is called a multimethod. Consider the first multimethod in the example above. In the specializer Event@Event1, class Event denotes the static type of the argument. All four methods above have the same static argument type, meaning that the methods are all part of the same family of methods. The class Event1 in the first multimethod above denotes the dynamic type of the argument: at run-time, the method will only be invoked if the argument is an instance of Event1 or a subclass.

MultiJava resolves the deficiencies in MyJavaHandler described earlier. Each "handler" is naturally and declaratively implemented as its own method, rather than with instanceofs and casts. Unlike with an if statement, the order of these methods has no effect. Instead, at run-time the unique most-specific method in the current family for the given actual arguments will be invoked. For example, when an Event1 instance is sent to an instance of the above class, both the first and the last methods are applicable, but the first one will be invoked because it is more specific.
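To make "order has no effect" and "most-specific method wins" concrete, here is a rough plain-Java approximation that looks up a handler by walking from the event's run-time class up through its superclasses. It is a sketch under our own assumptions, not a description of how MultiJava is implemented; the class name, the handler table, and Event3 are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Order-independent, most-specific-first handler lookup in plain Java.
class MostSpecificSketch {
    static class Event {}
    static class Event1 extends Event {}
    static class Event2 extends Event {}
    static class Event3 extends Event {}  // hypothetical event with no dedicated handler

    private final Map<Class<?>, Function<Event, String>> handlers = new HashMap<>();

    MostSpecificSketch() {
        // Registration order is deliberately arbitrary; it does not affect the lookup.
        handlers.put(Event.class, e -> "default handler");
        handlers.put(Event2.class, e -> "Event2 handler");
        handlers.put(Event1.class, e -> "Event1 handler");
    }

    String handleEvent(Event e) {
        // Start at the run-time class and move up until a handler is found.
        for (Class<?> c = e.getClass(); c != null; c = c.getSuperclass()) {
            Function<Event, String> h = handlers.get(c);
            if (h != null) {
                return h.apply(e);
            }
        }
        throw new IllegalStateException("no applicable handler");
    }

    public static void main(String[] args) {
        MostSpecificSketch handler = new MostSpecificSketch();
        System.out.println(handler.handleEvent(new Event1()));  // prints "Event1 handler"
        System.out.println(handler.handleEvent(new Event3()));  // prints "default handler"
    }
}
```

Note one important difference: in this approximation a missing handler is only discovered at run time, when the exception is thrown.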

Further, MultiJava provides compile-time assurance that such a unique most-specific method will exist for any possible argument. For example, if the last handleEvent method above, which is the default handler, were omitted, the compiler would issue an incompleteness error, because it would be possible for some kinds of events to have no applicable handler. As another example, if there were a second handleEvent method handling Event2, the compiler would issue an ambiguity error. (Subtler kinds of ambiguity are also possible.)

There are some additional benefits due to the fact that handlers are implemented as separate methods. First, such methods can be inherited, just like ordinary methods. As a simple example, we could modify the AbstractHandler class to perform some default behavior for events that are unexpected by clients:

public abstract class AbstractHandler {
    public void handleEvent(Event e) {
        // handle the case when an unexpected event occurs
    }
}

This method is then inherited by all subclasses, which now need only provide handlers for the particular kinds of events they expect. For example, MyMultiJavaHandler can safely remove its default method: all unexpected events will be dispatched to the implementation in AbstractHandler.

More generally, MultiJava opens up the possibility of having hierarchies of event handlers, which would be quite unnatural in ordinary Java. For example, the following subclass of MyMultiJavaHandler overrides the behavior for handling Event2, while inheriting the functionality for handling all other kinds of events.

public class MySpecialMultiJavaHandler extends MyMultiJavaHandler {
    public void handleEvent(Event@Event2 e2) {
        // handle Event2 specially
    }
}
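To appreciate what this saves, consider how one might approximate such a handler hierarchy in plain Java: the base class has to route each event to a per-event hook method by hand, and subclasses override individual hooks. The sketch below shows that workaround; all class names and the onEvent1, onEvent2, and onOther hooks are hypothetical.

```java
// Plain-Java workaround: manual routing to overridable per-event hook methods.
class HandlerHierarchySketch {
    static class Event {}
    static class Event1 extends Event {}
    static class Event2 extends Event {}

    static class BaseHandler {
        // The routing logic must be written and kept up to date by hand.
        final String handleEvent(Event e) {
            if (e instanceof Event1) {
                return onEvent1((Event1) e);
            }
            if (e instanceof Event2) {
                return onEvent2((Event2) e);
            }
            return onOther(e);
        }

        String onEvent1(Event1 e) { return "base: Event1"; }
        String onEvent2(Event2 e) { return "base: Event2"; }
        String onOther(Event e)   { return "base: unexpected event"; }
    }

    static class SpecialHandler extends BaseHandler {
        // Overrides only the Event2 behavior; everything else is inherited.
        @Override
        String onEvent2(Event2 e) { return "special: Event2"; }
    }

    public static void main(String[] args) {
        BaseHandler h = new SpecialHandler();
        System.out.println(h.handleEvent(new Event1()));  // prints "base: Event1"
        System.out.println(h.handleEvent(new Event2()));  // prints "special: Event2"
    }
}
```

In MultiJava the routing method and the separately named hooks disappear, since each handleEvent multimethod is itself the overridable unit.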

On a related note, see MultiJava's resend construct, which generalizes Java's super to allow multimethods to invoke their overridden multimethods.

Another benefit of handlers being separate methods is that they can be documented separately, which would be less likely to happen in a monolithic implementation. We provide tool support for such documentation with mjdoc, which generates javadoc-style documentation for MultiJava source code.

Multimethods are more general than their use in the above examples. First, any subset of a method's arguments may be specialized. Second, MultiJava supports value dispatching, which allows the values of primitive arguments (e.g. integers, booleans) and Strings to be employed in determining which method to invoke. This can be useful in the context of event handling if event kinds are distinguished by (often integer or String) "tags" rather than by separate subclasses of Event. It can be similarly useful if the handler's functionality partially depends on its current "state" (e.g. initialized, running, stopped), in addition to the event type.
