Conman Laboratories

Better living through software …

Component Object Models

Mark Grosberg

Component object models (COM) are among the most powerful and important technologies in software engineering. Object orientation is a great starting point, but without a common model, object-oriented code often gets written as if by many feuding lords.

The worst problem caused by the lack of a consistent object model is duplication of code. While object-oriented techniques (through inheritance) were supposed to prevent this kind of duplication, they often don't. The problem arises because software developers always face the same initial problems when starting a new project.

This isn't a major problem in itself. However, in large software systems, where large portions of the system are developed by many teams with many programmers, these common problems end up being solved multiple times and, worse, often in different ways.

Having a single, standard component object model can alleviate many problems in large software, code duplication being only one of them. Software without a common model often has a “designed by committee” feel to it. The different features don't seem to mesh well.

What Constitutes a Component Object Model?

A COM is really a common environment for the objects of a program to live in. The concept dates back to the earliest days of object orientation. Frameworks and class libraries can be considered primitive COMs.

Most high-level languages (often called scripting languages) have these facilities built in. Of course, the minute objects in one language need to communicate with objects in another, the same problem rears its ugly head. For the concept to work, a COM must be embedded in the lowest levels of the system (typically the operating system). Looked at in that light, we can almost say that a component object model is the runtime of a scripting language without the scripting language.

A more modern component object model has several key features, notably:

Object Management

There is nothing more irritating to a programmer who is trying to reuse code than different models of resource management. This is a vivid issue to anybody who has ever had to use two or more libraries with different philosophies in the same piece of software.

This often results in quite a bit of unnecessary “glue code” to keep the two libraries from fighting. All of this glue, and the bugs that come with it, wouldn't be necessary if all libraries shared a common framework for managing the lifetime of objects.

While garbage collection is straightforward, the real power of a component object model is factories. The trick is designing a way to address the different classes of objects easily and uniformly. This goal is important because code may be creating objects of classes that didn't even exist when that code was written.

A good example of this is the classic “component document” problem. Suppose the user is writing a document and wishes to include a graph, URL, or animation. It is the responsibility of the document editor to create the requested object and then embed it in the document. Of course, the list of embeddable objects is open-ended, so some simple schemes (such as a fixed list of types built into the editor) break down.

The most flexible solution to this problem is to treat classes of objects as objects too. The benefit of this solution is that there only needs to be one way to manipulate both classes and objects.
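
As a rough illustration of treating classes as objects, the Python sketch below keeps class objects in a registry and creates instances through a single factory function. All of the names (Component, REGISTRY, register, create) are hypothetical and chosen only for this example; they are not drawn from any particular object model.

```python
# Hypothetical sketch: Component, REGISTRY, register and create are
# invented for illustration and do not belong to any real object model.

class Component:
    """Minimal common base that every embeddable object shares."""

    def render(self):
        raise NotImplementedError


# The registry maps a class identifier to the class object itself.
# Because classes are ordinary objects, they can be stored and looked
# up exactly like instances.
REGISTRY = {}


def register(class_id):
    """Class decorator that publishes a class under a string identifier."""
    def decorator(cls):
        REGISTRY[class_id] = cls
        return cls
    return decorator


def create(class_id, **properties):
    """Factory: the caller needs only the identifier, never the class name."""
    cls = REGISTRY[class_id]
    return cls(**properties)


@register("graph")
class Graph(Component):
    def __init__(self, points):
        self.points = points

    def render(self):
        return f"<graph with {len(self.points)} points>"


# A component registered after create() was written needs no changes to it.
@register("url")
class Url(Component):
    def __init__(self, href):
        self.href = href

    def render(self):
        return f"<link to {self.href}>"


document = [
    create("graph", points=[1, 4, 9]),
    create("url", href="http://example.org/"),
]
rendered = [part.render() for part in document]
```

The document editor in this sketch never mentions Graph or Url by name; it only asks the factory for an identifier, which is what lets the set of embeddable classes stay open-ended.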

Dynamic Data Representation

Seasoned C++ programmers know all too well why lack of a common data representation is a problem. During the early years, C++ provided no pre-canned string class. To work around this deficiency, programmers often rolled their own. It is not uncommon for large C++ software to have three or more different (and incompatible) string classes.

The problem goes far beyond strings: containers (lists, indexes), wrapper classes for scalar objects, and specialized data structures all suffer from it. Sharing a common library of basic objects allows different objects to easily call upon each other and exchange values without the evils of glue code.

In fact, for containers it is often beneficial to have a common interface for common actions such as iterating over contents. The ideal situation is for an object to be able to work on a set of data regardless of whether that data comes from a real-time device, a SQL database, or a simple array in memory.
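
A minimal sketch of that idea, using Python's built-in iteration protocol as the common interface: the average function below accepts any iterable and never asks where the values come from. The function names and the readings table are assumptions made for this example, and the "device" is only a generator standing in for real hardware.

```python
import sqlite3


def average(readings):
    """Works on any iterable of numbers, whatever produced them."""
    total = 0.0
    count = 0
    for value in readings:
        total += value
        count += 1
    return total / count if count else 0.0


def device_readings():
    """Stand-in for a real-time device: values arrive one at a time."""
    for raw in (10, 20, 30):
        yield raw


def database_readings(connection):
    """Yield the single column of each row returned by a SQL query."""
    query = "SELECT value FROM readings ORDER BY id"
    for (value,) in connection.execute(query):
        yield value


# Source 1: a simple array in memory.
from_memory = average([10, 20, 30])

# Source 2: a generator standing in for a device.
from_device = average(device_readings())

# Source 3: an in-memory SQLite database (hypothetical table layout).
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value REAL)")
connection.executemany(
    "INSERT INTO readings (value) VALUES (?)", [(10,), (20,), (30,)]
)
from_database = average(database_readings(connection))
connection.close()
```

The consumer is written once; supporting a new kind of data source means writing another producer that follows the same iteration interface.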

Event Dispatch

Object-oriented systems really are organized as communities of related objects. It is often considered impolite to burst into a community and forcibly change things. The best approach is to require that the communities cooperate through very strict interfaces (stricter than the internal interfaces between objects in the same community).

This parallels the way human families, towns, and cities work. In addition, it makes things like network transparency easier as objects are not manipulated directly by their memory address. An added bonus of this approach is improved isolation; with hardware memory protection the communities of objects can be protected from external damage easily.

It only makes sense to treat events as another type of object (just like classes). This way events can be managed like any other object. This model of events as first-class objects can be made to do some very powerful things.
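
To make the idea of events as first-class objects concrete, here is a small Python sketch. The Event and Community classes are hypothetical names invented for this example: a community exposes a single strict entry point (post), and because each event is an ordinary object it can be kept in a history and replayed elsewhere.

```python
class Event:
    """An event is an ordinary object: it can be stored, logged, or forwarded."""

    def __init__(self, name, source, **data):
        self.name = name
        self.source = source
        self.data = data


class Community:
    """A group of objects that outsiders reach only by posting events."""

    def __init__(self):
        self._handlers = {}  # event name -> list of callables
        self.history = []    # every event received, kept as plain objects

    def subscribe(self, name, handler):
        self._handlers.setdefault(name, []).append(handler)

    def post(self, event):
        """The one strict entry point; returns how many handlers ran."""
        self.history.append(event)
        handlers = self._handlers.get(event.name, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


titles = []
editor = Community()
editor.subscribe("renamed", lambda event: titles.append(event.data["title"]))

handled = editor.post(Event("renamed", source="toolbar", title="Draft 2"))
ignored = editor.post(Event("closed", source="toolbar"))

# Because events are objects, replaying them into another community is simple.
mirror = Community()
mirror.subscribe(
    "renamed", lambda event: titles.append(event.data["title"].upper())
)
for past_event in editor.history:
    mirror.post(past_event)
```

Nothing outside a community touches its members directly, so the same post call could in principle be carried across a process or network boundary; that part is not shown here.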

Putting it Together

Once the above basic components are realized, several important things become possible. First, software becomes very flexible. New components can be introduced without requiring changes to any of the previous components. Even though specific features of the new components cannot be used by older code, at least the compatibility problems disappear. Going back to the component document example, users can expect graceful degradation when a document with components unknown to their systems is encountered.
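
One way graceful degradation might look in practice is sketched below in Python. The record layout, the KNOWN_COMPONENTS table, and the Placeholder class are all assumptions made for this example: when the loader meets a class identifier it does not recognize, it substitutes a placeholder that keeps the original properties instead of failing.

```python
# Hypothetical sketch: Text, Placeholder, KNOWN_COMPONENTS and load are
# illustrative names, not part of any real document format.

class Text:
    def __init__(self, body):
        self.body = body

    def render(self):
        return self.body


class Placeholder:
    """Stands in for a component type this installation does not know."""

    def __init__(self, class_id, properties):
        self.class_id = class_id
        self.properties = properties  # preserved so re-saving loses nothing

    def render(self):
        return f"[unsupported component: {self.class_id}]"


KNOWN_COMPONENTS = {"text": Text}


def load(records):
    """Build components from saved records, degrading on unknown classes."""
    parts = []
    for record in records:
        cls = KNOWN_COMPONENTS.get(record["class"])
        if cls is None:
            parts.append(Placeholder(record["class"], record["properties"]))
        else:
            parts.append(cls(**record["properties"]))
    return parts


saved_document = [
    {"class": "text", "properties": {"body": "Quarterly results"}},
    {"class": "animation", "properties": {"frames": 24}},
]
loaded = load(saved_document)
output = [part.render() for part in loaded]
```

The older system still opens the document and shows everything it understands, while the unknown animation survives as data for a system that does understand it.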

Second, everything becomes scriptable. Because all objects present a minimal common interface, it becomes possible to automate objects using a generic scripting facility. This is exactly the concept employed by the DOM used by DHTML. In this case, the scripting language (JavaScript) can manipulate document objects without concern for the type of object (image, text, style, plug-in, etc.).
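
The following Python sketch shows the principle in miniature; it is not how the DOM itself is implemented. The Scriptable base class and the run_script function are hypothetical: every object exposes the same get and set operations on named properties, so a generic script runner can drive objects without knowing their types.

```python
class Scriptable:
    """Minimal common interface: named properties, nothing type-specific."""

    def __init__(self, **properties):
        self._properties = dict(properties)

    def get(self, name):
        return self._properties[name]

    def set(self, name, value):
        self._properties[name] = value

    def names(self):
        return sorted(self._properties)


class Image(Scriptable):
    pass


class Paragraph(Scriptable):
    pass


def run_script(objects, script):
    """Apply (target, property, value) steps without inspecting object types."""
    for target, name, value in script:
        objects[target].set(name, value)


page = {
    "logo": Image(src="logo.png", width=64),
    "intro": Paragraph(text="Hello", color="black"),
}
run_script(page, [("logo", "width", 128), ("intro", "color", "red")])
```

The script runner contains no reference to Image or Paragraph, so any future object that offers the same small interface becomes scriptable with no extra work.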

Third, systems become more intuitive for their users. Ideally, common operations should work the same way regardless of the type of object. The component object model enforces this idea.

Problems with Component Object Models

The biggest problem with COM technology is the lack of a single unified standard. Currently the best case is that each operating system supports its own COM; the worst is each application sporting its own COM. In common use today there are a dozen popular models all competing against one another: Microsoft COM & .NET, CORBA, Mozilla's XPCOM, Enterprise JavaBeans, and a whole host of other less well-known implementations.

The true power of component object models can never be realized until a single, unified standard is adopted by vendors. More importantly, for such a standard to flourish it must be open, compact, portable, and flexible. Ideally, a free implementation should be available to promote growth. The closest to the ideal is currently XPCOM, although it is still large and missing a few key features that some groups would need before adopting it.

The reality of multiple standards isn't as bad as it seems at first sight. While not ideal, glue code between different COM implementations is far simpler and smaller than glue code for specific components.