2003 Encapsule Aurora Platform Whitepaper


The Encapsule Aurora Platform whitepaper was written in 2003 to explain the goals, potential applications, and architecture of the platform. The paper was shared with only a few close business partners and never published. Subsequently, I used the paper as my guide to implement a native C++ prototype of the system for the Windows OS. (See also: 2004 Encapsule Prototype Screenshots.)

Please visit the Early Encapsule Project History page for additional historical context.






Aurora Platform White Paper

Extensible Generic Platform for Building Component Software Applications and Services

Christopher D. Russell
January 2003

Copyright © 2003-2012 Encapsule Systems, Inc. All rights reserved.

This document is a preliminary draft copy of the Aurora Platform Architecture White Paper. This document contains private and confidential information that is not intended for public distribution at this time. Please do not duplicate or redistribute.


Encapsule Aurora is a component software engineering platform, developed by Encapsule Systems, Inc., designed to make it possible to easily convey complex software and system integration rules to laypeople. Aurora presents the user with a graphical CAD-like interface that allows them to specify and customize a software application by connecting collections of interlocking polygonal tiles that represent software components and application subsystems.

This paper introduces the reader to the basic capabilities of the Aurora platform, details its target applications, and explains its usage model. The paper concludes with a discussion of how ISVs and OEMs can leverage the platform to expose their hardware, software, and network-based services using Aurora’s Component Software Description Language and extensible component plug-in architecture.


[ under construction – to be written last ]

Aurora Platform Summary

The Encapsule Aurora platform comprises a set of specifications for describing component software applications, and for extending the platform through its plug-in architecture. Support for these specifications is provided by Aurora’s runtime services. The following sections provide a brief explanation of these platform elements.


Component Software Description Language

Aurora defines a new XML-based Component Software Description Language (CSDL) that is used to document:

  • Low-level component software interfaces
  • Mid-level software design patterns expressed in terms of networks of low-level components
  • High-level software application specifications expressed in terms of networks of mid-level software design patterns

CSDL is examined in much greater detail in the Theory of Operation section of this paper.
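Since CSDL’s actual XML schema is documented separately and is not reproduced in this paper, the fragment below is a purely hypothetical sketch (element and attribute names are illustrative only) of how these three levels of description might nest:

```xml
<!-- Hypothetical sketch only; not the actual CSDL schema -->
<csdl>
  <!-- Low-level: declare a binary plug-in component and its pins -->
  <processor uuid="...">
    <pin uuid="..." direction="source" type="..."/>
    <pin uuid="..." direction="sink" type="..."/>
  </processor>

  <!-- Mid-level: a design pattern expressed as a network of processors -->
  <module uuid="...">
    <instance processor="..."/>
    <connection from="..." to="..."/>
  </module>

  <!-- High-level: an application expressed as a network of modules -->
  <circuit uuid="...">
    <instance module="..."/>
  </circuit>
</csdl>
```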

Component Software Development Kit

The Aurora platform is generic in that its specific functionality depends entirely on a library of pre-compiled low-level binary components that are registered with its runtime service using CSDL.

The Component Software Development Kit (CSDK) is a C++ template library that is used to write low-level components and data type extensions compatible with the Aurora runtime service. Note: Plug-in platform extensions can be written in languages other than C++ by embedding an alternate compiler, interpreter, or scripting host in a thin C++ wrapper that marshals data to/from the Aurora runtime.

See the Theory of Operation and Customization sections of this paper for additional discussion about Aurora’s extensible plug-in model.

Runtime Services

At a high level of abstraction, the Aurora platform runtime service is architecturally partitioned into a graphical view coupled with an embedded controller. The view or graphical user interface (GUI) portion of the runtime is responsible for visualization and user interaction. The controller portion of the runtime is responsible for all CSDL processing and code synthesis.

Formally partitioning the runtime service into separate view and controller subsystems makes it possible to easily integrate the controller subsystem into server applications. Another benefit of this partitioning is that it dramatically simplifies the task of creating alternative, application-specific user interfaces.

“Fiat” Graphical User Interface

Aurora’s graphical user interface (GUI) runtime subsystem, codenamed Fiat, implements an interactive CAD-like interface based on a graphical metaphor that employs nesting colored tiles of various shapes to represent pieces of application program logic. By dragging and dropping tiles to form a graphical hierarchy, the user creates a high-level component software specification that is ultimately used to generate application code. This is best illustrated with a simple example:

Figure 1: A hypothetical collection of UI “tiles”

The diagram above shows a hypothetical collection of tiles that comprise a kit from which the user can assemble a software application specification. Figure 2 below shows two of the eight possible specifications that can be defined using the tiles defined in this example.

Figure 2: Software application specifications assembled from UI “tiles”

Aurora’s use of geometric shapes and coloration to represent pieces of application program logic allows a great deal of information to be conveyed in a succinct, consistent fashion to the user. Specifically, the shape of a tile is used to convey compatibility information (i.e. possible container/contained relationships) and color is used to uniquely identify tiles whose shapes are identical.

The design of the Aurora user interface is intended to take advantage of the fact that most people are reasonably proficient solvers of spatial relationship problems – these problems are encountered daily in the physical world: putting a key in a lock, packing a box, helping the kids put together their 10,000 block LEGO™ Moon Base…

By reducing the task of specification to a simple block stacking exercise, Aurora makes it possible to expose extremely complicated software architectures to end users who must learn only a single set of simple tile manipulation rules regardless of problem domain.

“Protein” Generic Application Server

The Aurora platform’s core runtime service, codenamed Protein, is a portable library that implements support for Aurora’s Component Software Description Language (CSDL), and plug-in platform extension specifications. Protein must be linked into a hosting application that makes calls against its exported controller interface in order to access the services it provides.

Protein is called a generic application server because it implements a generalized algorithm for dynamically assembling and sequencing an application from pre-compiled, binary components registered with the runtime via CSDL.

A point of considerable confusion arises when one initially tries to reconcile Protein’s use of binary software component building blocks with the Aurora platform’s stated role as a generative code synthesis architecture that is not tied to a particular target operating system, computing platform, or implementation language.

The subtle yet critical point to understand is that Protein is not a code generator. Rather, it is a runtime environment used to dynamically assemble code generation applications that run within a virtual machine to perform the actual code synthesis operations and produce the final output of the system.

The Theory of Operation, Applications, and Customization sections of this paper provide additional explanation and examples that will help clarify Protein’s central role in the Aurora platform architecture. The subsections below provide an overview of Protein’s constituent subsystems and provide background information relevant to these upcoming sections.

View Controller

Protein exports a controller interface that a hosting application calls to access the services it provides.

Typically, Protein is hosted by (that is, linked to and controlled by) a graphical user interface that issues commands and status queries against its exported controller interface. In response to these commands and queries, Protein issues callbacks that can be handled by the host to implement an interactive graphical view.

In some cases, a hosting application may choose to ignore these callbacks entirely, or handle them for some purpose other than visualization. This is often the case when Protein is hosted by a server application. (See the Applications section for more discussion on embedding Protein in server applications).

Component Software Librarian

Protein’s component software librarian subsystem is a memory-resident, cross-referenced, hierarchical database of low-level plug-in component descriptions, and software design patterns expressed in terms of interconnected networks of these low-level plug-in components.

A hosting application initializes the Protein runtime service by passing in a reference to a CSDL file that contains all the information required to load the component librarian service. The initialized librarian is subsequently accessed by the hosting application’s view (to get visualization data), and accessed by Protein’s integration server subsystem (described in the next subsection).

Integration Server

Protein’s integration server subsystem is responsible for loading low-level binary plug-in components into Protein’s virtual machine and establishing interconnections between the plug-ins. It does this according to topology data that it pulls from the component software librarian subsystem as it parses a high-level, CSDL-format software application specification passed down from the hosting application through Protein’s controller interface.

Virtual Machine

Protein’s virtual machine subsystem is essentially an in-memory data structure that represents interconnection topologies between low-level plug-in components in terms of mathematical graphs.

This low-level modeling of component interconnections provides considerable advantages over more traditional object oriented and component software frameworks that rely on hand-coded control software to instantiate components and establish interconnections between components.

The Protein virtual machine’s use of mathematical graphs to model interconnections eliminates the need to hand-code component instantiation and interconnection control software and allows the structure of the application to be programmatically analyzed to perform optimizations, and to determine sequences of operations that can be performed in parallel on multiple threads, CPU’s, or remote hosts.

Program Sequencer

Protein’s program sequencer subsystem is responsible for analyzing the contents of the virtual machine and determining the order in which the plug-in components loaded by the integration server must be executed in order to satisfy all inter-component data dependencies. This dependency data is used to produce a schedule that the program sequencer uses to dispatch plug-ins resident in the virtual machine in order to execute the program.

This is hugely significant because it eliminates the need for programmers to hand-code the logic that normally controls program flow within an application. Protein’s program sequencer performs the required analysis at runtime freeing the programmer to concentrate on the high-level design of the overall application and on writing small, light-weight plug-ins.
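The sequencing step described above amounts to a topological sort of the inter-component dependency graph. The sketch below is illustrative only (these are not Protein APIs); it shows the core of such a scheduler using Kahn’s algorithm:

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <string>
#include <vector>

// Sketch: derive an execution schedule from inter-component data
// dependencies. Each entry maps a producing component to the list of
// components that consume its output.
std::vector<std::string> schedule(
    const std::map<std::string, std::vector<std::string>>& consumers) {
    // Count unsatisfied inputs (in-degree) for every component.
    std::map<std::string, int> indegree;
    for (const auto& [producer, sinks] : consumers) {
        indegree.emplace(producer, 0);
        for (const auto& s : sinks) ++indegree[s];
    }
    // Components with all inputs satisfied are ready to dispatch.
    std::queue<std::string> ready;
    for (const auto& [node, deg] : indegree)
        if (deg == 0) ready.push(node);
    std::vector<std::string> order;
    while (!ready.empty()) {
        std::string n = ready.front();
        ready.pop();
        order.push_back(n);
        auto it = consumers.find(n);
        if (it == consumers.end()) continue;
        // Satisfying this component's outputs may make consumers ready.
        for (const auto& s : it->second)
            if (--indegree[s] == 0) ready.push(s);
    }
    return order;  // a cyclic graph would yield a partial order
}
```

Components with no unsatisfied inputs dispatch first; as each completes, its consumers become eligible, which is also where the opportunities for parallel dispatch mentioned earlier fall out naturally.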

The runtime overhead of the program sequencer is very small, and in most cases a sequenced application performs nearly as well as highly-optimized, hand-coded C++.

Standard Component Library

[ under construction ]

Theory of Operation

Jacquard Loom Analogy

In 1801 the French inventor Joseph Marie Jacquard introduced one of the world’s first programmable mechanical devices – the Jacquard loom. Jacquard’s loom automated the process of weaving intricate designs in woven textiles by selecting shuttles containing various thread colors according to a sequence of punched holes in an endlessly rotating belt.

Explained in computer science terms, Jacquard’s loom is effectively a synthesizer that combines two independent inputs to produce an output result. The inputs to Jacquard’s system are (a) a program encoded on a rotating belt and (b) a specific set of thread shuttles. The output of Jacquard’s system is a woven textile whose pattern and coloration are a function of these two inputs.

An imperfect but instructive analogy can be drawn between Jacquard’s loom and the Aurora platform by metaphorically equating the Protein runtime service with the physical loom device, the component software plug-ins used by Protein with thread shuttles, and the Component Software Description Language (CSDL) specification with the rules that govern the placement of holes punched in the loom’s rotating belt.

In terms of the above analogy, Fiat, Aurora’s graphical user interface runtime service, can be thought of as a separate system that is used to metaphorically punch holes in belts and select appropriately colored thread shuttles that are subsequently used by the Protein runtime service to weave an integrated software application.

This analogy reasonably captures the relationship between Fiat and Protein, and is an extremely useful vehicle for understanding Protein in terms of its CSDL and plug-in inputs. Using this analogy as a conceptual starting point, a complete understanding of Aurora is gained by noting that Protein’s output, an integrated executable software application that we metaphorically compare with a woven textile, is not the final output result of the Aurora platform. Rather, Protein’s output is an intermediate result that is executed by the platform in order to produce the final result.

Consider that the Aurora platform’s final output is analogous to a gentleman’s dress suit manufactured from a bolt of cloth according to the specifications of a sewing pattern. In effect, Protein’s intermediate program can be metaphorically equated to the bolt of cloth and sewing pattern, and the process of executing this intermediate program is analogous to tailoring the suit.

Generating Code from Visual Models

The Aurora platform is conceptually different from other software modeling strategies by virtue of the fact that it is an architectural fusion of a high-level modeling language (CSDL), a runtime engine specifically designed to generate code directly from CSDL models, and an intuitive graphical interface for visualizing and creating CSDL models.

By comparison, popular modeling languages such as Unified Modeling Language (UML) provide modeling facilities useful for describing the overall functionality of a software application and the interactions between functional subsystems within the application. However useful, UML was designed to capture requirements and automatically produce specifications, not to serve as the basis for code synthesis.

Referring back to the Jacquard loom analogy made in the previous subsection, current UML tools metaphorically automate the production of a crafter’s cross stitch pattern – a template printed on a piece of cheesecloth through which a software developer must manually weave thread in order to realize the design. The problem here is that once the process of manual stitching begins, it becomes increasingly difficult to make changes to the UML model without invalidating the metaphorical cross stitch pattern – a persistent and vexing problem that is unavoidable when designing and producing complex software systems that must satisfy evolving business and technical constraints.

The Aurora platform directly addresses this deficiency by automating both the specification of the cross stitch pattern and the weaving process itself. Whereas software development projects based on UML models must necessarily minimize the likelihood of specification changes through an exhaustive up-front process of requirements gathering, projects modeled with Aurora’s CSDL are inherently mutable. This dramatically reduces the impact of specification changes.

Additionally, because Aurora enforces a regimen where CSDL models and the Protein plug-ins used to realize them must evolve in tandem, at no point in the process do the models and underlying implementation diverge. This is not to imply that specification changes are entirely free of cost but to rather assert that Aurora provides a holistic combination of methodology and runtime service support that significantly mitigates this cost.

Parenthetically, Aurora also allows for possibilities not even envisioned for future UML-based development tools. Namely, Aurora provides for a high degree of logical separation between CSDL and the underlying implementation of the software generators. This allows for a given CSDL specification to be paired with a unique set of Protein plug-in components by superposition in order to produce alternate implementations of a single common high-level specification.

Again referring back to the Jacquard loom analogy, this process of superposition is roughly analogous to replacing the set of thread shuttles installed in a Jacquard loom without replacing the rotating belt. For example, one might employ this technique in order to elegantly produce multiple versions of a software application targeted at different operating systems, computing platforms (e.g. workstation vs. PDA), or to handle localization issues.

CSDL Modeling Language

This section explains the motivations behind the design of Aurora’s Component Software Description Language (CSDL) and introduces the reader to CSDL’s approach to describing and partitioning component software systems into re-usable building blocks using hardware circuit metaphors borrowed from electrical engineering.

CSDL is a standards-compliant application of the W3C XML 1.0 specification, designed to be highly flexible and widely portable. Encapsule Systems, Inc. plans to promote CSDL as an open specification for describing component software applications – the subject of a separate public white paper containing CSDL’s XML schema and a complete discussion of the language’s semantics.

Technical readers interested in examining annotated CSDL code samples in advance of the availability of the CSDL technical white paper are referred to the Further Information section of this introductory white paper for download instructions.

Extensible Markup Language (XML)

CSDL is specified using Extensible Markup Language (XML). The decision to use XML as the basis for CSDL was made for several reasons including:

  • XML is well understood and widely accepted as the standard mechanism for describing and exchanging data between software processes.
  • XML’s inherent extensibility makes it an ideal choice for developing new description languages (as opposed to general purpose languages that require sophisticated expression analysis and the ability to conditionally control the flow of program execution through branches, loops, etc.)
  • Ready availability of high-quality, open source parsers and validation technologies.
  • Compatibility with other third-party XML-based development tool suites.
  • Ready availability of standard XML technologies such as XML-RPC and SOAP that allow CSDL component models and application specifications to be easily transmitted via HTTP over the established IP network infrastructure.

CSDL is used to describe interconnection relationships between software components and is based on a hierarchical, highly cross-referenced underlying data model. Care has been taken to specify CSDL using only a small subset of the XML language features defined in the W3C’s XML 1.0 specification – a deliberate design decision made to:

  • Keep CSDL’s XML schema very simple.
  • Minimize reliance on complex validating XML parsers and memory-intensive DOM frameworks that are ill-suited for light-weight, high-speed server applications.
  • Minimize the difficulty of aggregating multiple CSDL data streams into a single CSDL data stream.
  • Facilitate high-speed retrieval of CSDL-format component models and application specification data from XML-native-format databases, and directory server technologies such as LDAP.

Electrical Circuit Metaphor

The specification of CSDL, in fact the entire design philosophy upon which the Aurora platform is based, is motivated by the observed success of the hardware industry’s use of advanced Computer Aided Design (CAD) tools, standard cell libraries, circuit characterization, analysis, simulation, and verification technologies.

VLSI design is very complicated, and the cost of making mistakes is extremely high both in terms of dollars and time. Consider that the production of a single wafer that yields functional die suitable for test and characterization requires a capital outlay of tens of millions of dollars, and that fabrication of a new semiconductor device takes many months (barring process problems). Because of this, VLSI designers leave little to chance, investing heavily in tools and technologies to make sure a design is correct before tape-out.

By contrast, the production of software has historically suffered from a lack of such formalism due in no small part to the perceived ease of re-implementing and redistributing software that doesn’t work correctly. There was a time when the cost of formally engineering software projects could simply not be justified given the relative simplicity of software systems and the ease and speed with which they could be iteratively refined.

As hardware innovation continues to track Moore’s Law, software lags behind, growing more and more expensive to produce, test, and maintain. It is estimated that fewer than 10% of professional software developers use any formal design modeling tools at all; the remaining 90% rely on whiteboards and cocktail napkins. Given this statistic, it is little wonder that the software systems we demand so much of are full of bugs and cost hundreds of dollars per line of code deployed.

Aurora’s CSDL is used to describe software components, and software applications defined in terms of interconnected networks of these components. It is an attempt to emulate the rigorous specification methodologies employed so successfully by VLSI and board-level hardware design engineers and borrows its nomenclature directly from the hardware world.

The following sections introduce CSDL’s nomenclature starting with the least abstract unit of description, the software component, and working up the abstraction hierarchy to explain how CSDL is used to encode specifications for complete software applications.

Component Specification Elements

As explained in the previous sections, the Aurora platform’s Protein runtime parses CSDL in order to dynamically assemble an executable application from pre-defined, low-level, binary plug-in software components. These plug-in components are called processors. This section discusses processors and explains how CSDL is used to describe them.


Processors

At the bottom of CSDL’s description hierarchy is the processor specification element. CSDL processor specifications serve to declare the set of pre-compiled, binary software plug-in components available to Protein at runtime.

Processors are immutable and opaque and are specified in terms of a Universally Unique Identifier (UUID) code (analogous to the part number branded on the top of a hardware chip), and a set of public interfaces that they implement in order to exchange data with other processors.

CSDL places no restrictions on what can be declared as a processor beyond mandating that the binary plug-in described provides implementation for a base set of interfaces required by the Protein runtime service (these interfaces are declared in the Aurora Component Software Development Kit (CSDK)).

CSDL’s use of a UUID to identify a processor inherently decouples the processor specification from the actual binary plug-in. This independence between CSDL processor specification and the binary plug-in that implements the processing algorithm allows for the substitution of alternate, pin-compatible plug-ins – a hugely successful technology re-use strategy adopted by the hardware industry decades ago.
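The decoupling described above can be illustrated with a small registry sketch. The class and interface names here are hypothetical, not part of the actual CSDK; the point is only that the CSDL-declared UUID, not a compile-time type, selects the binary plug-in:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical base interface that all processor plug-ins implement.
struct IProcessor {
    virtual ~IProcessor() = default;
    virtual std::string name() const = 0;
};

// Illustrative registry: look up a plug-in factory by the UUID declared
// in CSDL. Re-registering a different factory under the same UUID
// substitutes a pin-compatible implementation without touching the CSDL.
class PluginRegistry {
public:
    using Factory = std::function<std::unique_ptr<IProcessor>()>;
    void declare(const std::string& uuid, Factory f) {
        factories_[uuid] = std::move(f);
    }
    std::unique_ptr<IProcessor> instantiate(const std::string& uuid) const {
        auto it = factories_.find(uuid);
        if (it == factories_.end()) return nullptr;  // unknown UUID
        return it->second();
    }
private:
    std::map<std::string, Factory> factories_;
};

// Sample concrete processor for demonstration purposes only.
struct EchoProcessor : IProcessor {
    std::string name() const override { return "echo"; }
};
```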


Pins

Pins are CSDL specification elements used to demarcate input and output connections. Pins are specified in terms of a UUID, a direction (a pin can function as a data source, or sink – never both), and a data type binding.

In order to expose a pin, a binary plug-in must provide implementation for a pin interface (defined in the CSDK). It is a common point of confusion to equate CSDL pin descriptions directly with what object oriented programmers normally think of as interfaces.

Pins are not interfaces in the object oriented programming sense of the word. Rather, they represent a possible point of connection to another pin. When pins are connected, the actual semantics of the resultant connection are determined by the data type of both pins. Pins can only be connected if their CSDL-declared data types are identical, and no more than one output pin (signal source) is driving a given set of input pins (signal sinks).
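The two connection rules above can be sketched as a small validity check. The Pin structure here is illustrative only, not a CSDK type:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Illustrative pin record: a direction plus a UUID naming the bound
// data type (stands in for the CSDL data type binding).
struct Pin {
    enum Direction { Source, Sink } direction;
    std::string typeUuid;
};

// A proposed connection is legal only if every pin shares one data type
// and exactly one pin in the set is a source (the signal driver).
bool canConnect(const std::vector<Pin>& pins) {
    if (pins.empty()) return false;
    int sources = 0;
    for (const auto& p : pins) {
        if (p.typeUuid != pins.front().typeUuid) return false;  // type mismatch
        if (p.direction == Pin::Source) ++sources;
    }
    return sources == 1;  // one driver, any number of sinks
}
```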

CSDL’s use of a UUID to identify the data type of a pin inherently decouples a pin specification from the data marshaling semantics implemented by the pin’s associated type handler. This contributes to CSDL’s portability and is discussed in greater detail in the next section.

Data Types

Data types are CSDL specification elements that serve to declare type handlers – a special type of binary plug-in component used by the Protein runtime to marshal data across pin connections.

CSDL places no restriction on the actual data that a type handler plug-in marshals beyond mandating that a type handler provide implementation for a base set of interfaces required by the Protein runtime (these interfaces are defined in the Aurora CSDK).

CSDL has no facility whatsoever for describing the actual data marshaled by a plug-in type handler. Rather, CSDL is used only to enumerate the available type handlers and associate each with a UUID that is used to bind pins to data types. This abstract decoupling of CSDL data types from the binary plug-in type handlers contributes to CSDL’s portability and provides for extreme flexibility.

Cell Specification Elements

The previous section introduced CSDL’s processor specification element, a mechanism for registering binary plug-in processors with the Protein runtime service. Protein does not place any restrictions on the algorithm(s) that a processor is allowed to implement – virtually any algorithm can be shoe-horned into a processor and registered with Protein via a CSDL processor specification. Typically though, processors implement rather simple algorithms that can be re-used in many different contexts.

As alluded to in the previous Electrical Circuit Metaphor section of this paper, CSDL attempts to emulate successful techniques used by hardware design engineers. Of particular interest are the standard cell libraries employed by VLSI designers. In VLSI CAD tools, cells are re-usable models that describe interconnected networks of low-level components. Standard cell libraries are collections of these re-usable models that VLSI designers use as building blocks when designing new chips. Semiconductor companies invest a lot of time and money building and maintaining their cell libraries, which have proven to be a useful and cost-effective way to capture and re-use design knowledge.

CSDL’s analog to the VLSI cell is called a module, the topic of the next subsection.


Modules

CSDL provides facility to describe logical entities called modules. Modules play a central role in the Aurora platform: they are the recombinant building blocks from which applications are built, and they are the entities that are visualized by the Fiat graphical user interface runtime.

Outwardly, modules are conceptually similar to processors by virtue of the fact that they expose pins (typically) and are branded with a UUID. Unlike processors however, modules do not correspond directly to binary plug-in components. Rather, they are logical constructions defined in CSDL that are used to specify re-usable, interconnected networks of processors in a manner analogous to the way VLSI cell models are used to specify re-usable assemblages of low-level circuit logic blocks.

Generally speaking, modules can be conceptualized in terms of object oriented programming nomenclature; a module’s pin-out is roughly analogous to an object’s interface, and a module’s contained network of interconnected processors is roughly analogous to an object’s implementation. These comparisons are useful but do not tell the entire story, however.

Unlike objects in object oriented programming languages, modules need not have interfaces – at least not in the canonical sense of the word. Stated alternately, modules may be defined that expose no pins. This notion may seem strange to object oriented programmers, but modules are not objects – they are interconnection patterns. A module that exposes no pins is simply a self-contained interconnection pattern that neither relies on external input nor produces external output. The use of such a construction is made clearer later when we discuss how modules are connected together to build an application.

Another initially counterintuitive property of modules is that their internal interconnection networks do not necessarily have to contain processors. A module that does not interconnect processors is essentially a wiring harness that provides some mapping between its input pins and its output pins. Parenthetically, it is meaningless to define a module that has no exposed pins, and which does not interconnect processor(s) – such a construction would in essence define a useless wiring harness that can’t be connected to anything.


Sockets

Modules themselves are interconnected to form an application using another logical construction of CSDL called a socket. Like the physical sockets found on circuit boards, CSDL sockets expose pins and are used to demarcate partitions in an interconnection network. As previously described, modules are, by definition, specifications of interconnection patterns. In addition to zero or more processors, a module may optionally specify zero or more socket(s) in its CSDL interconnection specification to demarcate possible connections to other modules. Like modules, sockets are also visualized by the Fiat graphical runtime.

Just like physical sockets on a circuit board, CSDL sockets are “keyed” to prevent the insertion of non-pin-compatible modules. This aspect of CSDL is discussed further in the Cell Interfacing Elements section below.


Shapes and Colors

Earlier in this paper the reader was introduced to Fiat – Aurora’s graphical user interface runtime service. It was explained that Fiat is based on a graphical metaphor of interlocking tiles. These tiles actually correspond to CSDL module and socket entities.

CSDL provides facility for specifying a library of polygonal shapes and colors that are bound to modules and sockets in their CSDL specifications. This information is ultimately used by Fiat to draw its graphical visualization of a software application.

Cell Interfacing Elements


Busses

CSDL provides facility for describing a logical entity called a bus. A bus is specified in terms of a socket, a module that can be inserted into that socket, and a mapping of socket pins to module pins.

The set of all busses defined in a CSDL specification establishes a high-level interconnection schema that is queried to determine if a given module can be inserted into a given socket and which pins will be connected upon insertion.

Multiple busses can be defined for a given socket in order to establish mappings to multiple pin-compatible modules. This is a high-level form of polymorphism that allows CSDL specifications to describe entire classes of software applications – a feature that makes CSDL ideally suited to encapsulating and exposing domain-specific integration knowledge.
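As a purely hypothetical sketch (the real CSDL schema is not reproduced in this paper, and these element and attribute names are illustrative only), a bus declaration might look like:

```xml
<!-- Hypothetical sketch only; one bus per socket/module pairing -->
<bus uuid="...">
  <socket ref="..."/>
  <module ref="..."/>
  <!-- Which socket pins connect to which module pins upon insertion -->
  <map socketPin="..." modulePin="..."/>
  <map socketPin="..." modulePin="..."/>
</bus>
```

In this sketch, declaring a second bus that names the same socket but a different module is what establishes the set of pin-compatible alternatives for that socket.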

Circuit Specification

Component software applications specified in CSDL are called circuits. CSDL provides a facility for defining a circuit in terms of an interconnected network of modules.

CSDL / Protein Runtime Interaction

It is the responsibility of Aurora’s Protein runtime service to transform a CSDL circuit specification into a low-level, interconnected network of processors that can be programmatically analyzed to load the appropriate binary plug-in components, and ultimately execute the application.

Realizing that this transformation takes place is important to understanding the overall role of CSDL in the Aurora platform and its relationship to the Protein runtime service. It is not, however, important to understand how Protein actually accomplishes this task.


Extensibility Model

The architecture of the Aurora platform is inherently customizable via three primary extension mechanisms, each targeted at a different audience. This section describes each of these extension mechanisms and explains when it is properly employed.

Horizontal Extension via CSDK

The Aurora platform’s Protein runtime service generically operates on interconnected networks of binary plug-in components called processors, and employs binary plug-in components called type handlers to marshal data along the interconnection paths.

Horizontal extension of Aurora is accomplished by writing custom processor and type handler plug-ins according to rules enumerated in the Aurora Component Software Development Kit (CSDK).

The Aurora CSDK is a C++ template library that defines interfaces that processors must implement in order to interact with the Protein runtime and to expose pins. The CSDK additionally defines interfaces that type handlers must implement in order to interact with Protein and manage the transport and marshaling of data across interconnections between pins.

Type handlers are trivial to implement. As their only responsibility is to shield Protein from any direct knowledge of the data types marshaled between interconnected pins, type handlers are little more than a C++ class with a copy constructor.
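As a concrete, hypothetical illustration of the point above, a type handler reduces to an ordinary C++ class whose copy constructor Protein can invoke through a generic template wrapper. The class name and payload below are invented for the example and are not taken from the CSDK:

```cpp
#include <cstddef>
#include <vector>

// A type handler wraps an opaque payload; its essential obligation is a
// working copy constructor so values can be marshaled between pins.
class AudioBufferTypeHandler {
public:
    explicit AudioBufferTypeHandler(std::vector<float> samples = {})
        : samples_(std::move(samples)) {}

    // The copy constructor is the essential piece: Protein copies values
    // across interconnections without knowing anything about the payload.
    AudioBufferTypeHandler(const AudioBufferTypeHandler& other) = default;

    std::size_t sampleCount() const { return samples_.size(); }

private:
    std::vector<float> samples_;
};

// A stand-in for the generic marshaling wrapper: copy a value across a link
// with no knowledge of the concrete type beyond copy-constructibility.
template <typename T>
T marshal(const T& source) { return T(source); }
```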

Processors are slightly more complicated to implement, as they must provide implementations for several callback entry points actuated by Protein’s program sequencer. Conceptually, the interaction between Protein and its plug-in processors is simple and follows a well-documented state machine model that should be readily understandable even to novice C++ programmers.
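The callback interaction described above might be sketched as follows. The interface and method names are assumptions for illustration; the actual CSDK interfaces are not reproduced here:

```cpp
#include <string>
#include <vector>

// Hypothetical processor interface: Protein's program sequencer drives each
// processor through a simple attach / execute / detach lifecycle.
class IProcessor {
public:
    virtual ~IProcessor() = default;
    virtual void onAttach()  = 0;  // once, when wired into the network
    virtual void onExecute() = 0;  // each sequencing pass
    virtual void onDetach()  = 0;  // once, when torn down
};

// A trivial processor that records the order of callbacks it receives.
class TraceProcessor : public IProcessor {
public:
    void onAttach()  override { trace.push_back("attach"); }
    void onExecute() override { trace.push_back("execute"); }
    void onDetach()  override { trace.push_back("detach"); }
    std::vector<std::string> trace;
};

// A minimal stand-in for the sequencer's state machine loop.
inline void sequence(IProcessor& p, int passes) {
    p.onAttach();
    for (int i = 0; i < passes; ++i) p.onExecute();
    p.onDetach();
}
```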

As an aside, one particularly interesting type of processor wraps an alternate language interpreter or scripting host within the C++ shell mandated by the CSDK, allowing the processor’s logic to be implemented in a language other than C++. Depending on performance requirements, this can be a good choice for organizations attempting to shoe-horn code written in another language into Aurora’s processor plug-in model, or for organizations considering an embedded server deployment of the Protein runtime.

Vertical Extension via CSDL

One of the most powerful extension mechanisms available to developers working with the Aurora platform is the ability to define new module specifications based on a library of existing processors. This process effectively reduces to mapping out some useful interconnection of processors and sockets and exposing it to Protein via a CSDL module specification.

Recalling that a module specification is a logical construction wholly defined in terms of CSDL, extension via this mechanism does not require the development of plug-in processors or type handlers. Rather, it leverages previously written CSDL processor specifications and type specifications that abstractly represent existing processor and type handler plug-ins.
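To make the idea concrete, a module specification can be thought of as pure data that references existing processor and socket specifications and wires their pins together. Real module specifications are expressed in CSDL’s XML syntax; the C++ rendering below, with invented names and pin identifiers, is only a sketch of the information involved:

```cpp
#include <string>
#include <vector>

// One wire between two pins, identified by dotted path (invented syntax).
struct Connection {
    std::string fromPin;  // e.g. "decoder.out"
    std::string toPin;    // e.g. "mixer.in0"
};

// A module specification names existing processor specs, declares sockets
// that demarcate external connections, and lists the internal wiring.
struct ModuleSpec {
    std::string name;
    std::vector<std::string> processors;  // references to processor specs
    std::vector<std::string> sockets;     // possible external connections
    std::vector<Connection> wiring;
};

// "Vertical extension": author a new module from an existing library of
// processor specifications, with no plug-in compilation required.
inline ModuleSpec makePlaybackModule() {
    return ModuleSpec{
        "AudioPlayback",
        {"decoder", "mixer"},
        {"audioOut"},
        {{"decoder.out", "mixer.in0"}, {"mixer.out", "audioOut.p0"}}};
}
```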

Developers of new module specifications do not need software engineering credentials, as there is no programming involved – at least not as we have traditionally come to understand it. Any person capable of drawing a high-level block diagram on a whiteboard can easily conceptualize the process of module specification in CSDL.

Recall that a CSDL software application specification is called a circuit and that circuits are defined wholly in terms of modules. As was discussed previously in the introduction to CSDL in the Theory of Operation section of this white paper, a collection of module specifications defines an entire family, or class, of circuits that can be defined by combining modules in different ways.

So-called vertical extension of Aurora involves the creation of new CSDL module specifications and the addition of new CSDL bus specifications that define possible inter-module connections via sockets. Vertical extension enlarges the set of circuits that the end user can define and is the primary extension mechanism employed by OEMs and ISVs to customize Aurora for deployment to their customers. This subject is discussed in greater detail in the Usage Model section of this white paper.

Depth Extension via Protein Customization

The third extension mechanism, so-called depth extension, involves making changes and/or additions to the Protein runtime service itself. Extending Protein requires a detailed understanding of its runtime and internal data models, in addition to direct access to the Aurora platform source code.

It is atypical for an OEM or ISV to need to extend Protein – the Aurora platform’s other two extension mechanisms, programming new plug-in processors and type handlers (horizontal extension) and authoring new CSDL module and bus specifications (vertical extension), are typically adequate for developing most Aurora-based applications.

Particularly advanced applications of this technology that might merit extending the depth of the Aurora platform include:

  • Adding advanced analytical routines
  • Implementing custom optimization algorithms
  • Enhancing visualization support
  • Extending the CSDL specification

Customers seeking such platform extensions will typically commission Encapsule Systems, Inc. to do the actual implementation, although large organizations may elect to license the platform source code and do the work using their own internal development resources.


Customization

Host Platform

Aurora’s CSDL is inherently portable because it is expressed in terms of industry-standard, platform-independent XML.

Aurora’s Protein runtime service has been carefully crafted using ANSI-compliant C++ and relies only on a small, cleanly segregated abstraction layer through which it gains access to the hosting platform’s local file system, network transport, process creation, thread, and synchronization mechanisms. Encapsule Systems’ reference implementation of Protein was developed on Microsoft Windows 2000 using the Intel C++ compiler. This compiler was selected over alternatives because it works well with Microsoft-supplied development tools for Windows and is also available in a native Linux version. Protein makes extensive use of advanced C++ constructs such as templates that are not well supported by all C++ compiler vendors. This means that Protein is unlikely to be easily portable to platforms for which mature C++ compilers are unavailable.

Encapsule Systems’ reference implementation of the Fiat graphical user interface runtime service is also written in C++ but is highly dependent on Microsoft’s Win32 API. Specifically, Fiat is implemented using Microsoft’s excellent Windows Template Library (WTL) – a C++ template library replacement for the Microsoft Foundation Classes (MFC). Care was taken during the development of Fiat to abstract its interactions with the Protein runtime using a controller/view model in order to minimize the task of porting Fiat to the X Window System on UNIX, Apple’s OS X Aqua, etc.

Target Platform

As discussed earlier in the Theory of Operation section of this white paper, Aurora’s Protein runtime operates on binary processor and type handler plug-ins. Because CSDL places no restriction on the actual processing and data transformation performed by processor plug-ins, care must be taken when developing processors to ensure their portability.

Processor and type handler plug-ins supplied by Encapsule Systems segregate portable ANSI C++ implementation from platform-dependent implementation by employing a thin abstraction layer. It is recommended that third parties also employ this technique when developing processors in order to ensure maximum flexibility. Additional discussion of this topic, and source code for this abstraction layer, is provided in the Aurora Component Software Development Kit (CSDK).
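The recommended segregation might look something like the following sketch. The interface and its methods are invented for illustration and are not the abstraction layer shipped with the CSDK:

```cpp
#include <cstddef>
#include <string>

// Hypothetical platform abstraction: the portable body of a processor talks
// only to this interface, while one small platform-specific class per
// target (Win32, Linux, etc.) supplies the mechanism.
class IPlatformServices {
public:
    virtual ~IPlatformServices() = default;
    virtual std::size_t writeFile(const std::string& path,
                                  const std::string& bytes) = 0;
    virtual void sleepMilliseconds(unsigned ms) = 0;
};

// Portable processor logic depends only on the interface...
inline std::size_t logMessage(IPlatformServices& platform,
                              const std::string& message) {
    return platform.writeFile("aurora.log", message + "\n");
}

// ...while a per-platform shim (here, an in-memory test double standing in
// for a real Win32 or POSIX implementation) provides the plumbing.
class FakePlatform : public IPlatformServices {
public:
    std::size_t writeFile(const std::string&,
                          const std::string& bytes) override {
        written += bytes;
        return bytes.size();
    }
    void sleepMilliseconds(unsigned) override {}
    std::string written;
};
```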

Embedded Deployment Options

An interesting class of Aurora-based application, discussed in greater detail in the Applications section of this white paper, embeds the Protein runtime service in some hosting software application other than a graphical user interface. Such deployments typically leverage Protein to implement customized application servers and/or to provide generic infrastructure support at runtime.

Protein is written entirely in C++ and is extremely fast. Additionally, Protein’s runtime heap memory requirements are typically negligible compared with the heap memory required by its plug-in processors – overhead that would be incurred by a C++ application regardless of its use of Protein.

Protein does, however, implement an extremely complicated algorithm that is not easily compressed into a small footprint. It may be possible to reduce this footprint at the expense of some runtime performance but, generally speaking, Protein is not particularly well suited for deployment on small portable devices such as cell phones or PDAs.


Unlike other software development tools that model software systems in terms of object oriented programming language constructs, the Aurora platform models software systems at a significantly higher level of abstraction. Aurora’s processor abstraction is very similar to what object oriented programmers think of as an object. As discussed in the Theory of Operation section of this paper, processors form the base of Aurora’s CSDL abstract specification model, upon which the higher-order module and circuit specification constructs are built.

At the risk of oversimplifying the comparison, current software modeling methodologies and software development tools concentrate on the semantics of objects and their interactions. By contrast, Aurora concentrates abstractly on interactions among interconnected patterns of objects – so-called modules – without regard to semantics. This distinction may seem subtle at first, but understanding that Aurora is related to, yet fundamentally different from, object oriented programming is essential to grasping how the platform is applied to solving real-world software engineering problems.

Aurora and Object Orientation

Object oriented programming languages are extremely valuable. However useful, object oriented programming metaphors nonetheless represent a point on an evolutionary curve that traces its lineage back to machine code programming. Unfortunately, this evolution has never been motivated by an effort to bridge the expanse between the generalized software model proposed by Alan Turing and the metaphors that humans employ to conceptualize the world. Rather, it has been motivated by efforts to abstract Turing’s model – an effort that the late, great John von Neumann would have characterized as a “… waste of valuable computation resources to perform clerical work.”

With apologies to von Neumann, history has proven the value of using computers to perform the clerical work of compiling programs. However, the fundamental gulf between Turing’s model and the concepts that humans find natural and intuitive persists to this day. The reader is referred to the Usage Model section of this white paper for a brief philosophical digression that further expounds on this point.

The Aurora platform by no means dismisses the usefulness of the object oriented programming metaphor or the software development tools used to make it more manageable. In point of fact, implementing Aurora’s Protein runtime service would not have been feasible without the powerful expressiveness of Bjarne Stroustrup’s C++ programming language – specifically, its ability to express algorithms generically using template meta-programming techniques.

It is, however, asserted that the object oriented programming metaphor is inadequate for describing software systems at a level of abstraction that is readily understandable to humans. Simply put, the complexity of large object oriented software systems overwhelms the human mind. Our brains are optimized for pattern recognition and association, not for conceptualizing Turing models executing on von Neumann-inspired computing machines.

The Aurora platform’s use of a geometrically inspired building block metaphor to expose object interconnection patterns (modules) is intended to abstract software systems in a manner that takes advantage of our innate ability to associate shapes with concepts and to recognize visual patterns quickly. It is the responsibility of Aurora’s Protein runtime to handle the clerical work involved in transforming these representations back into Turing models.

Given this background, the following subsections discuss the applications of the Aurora platform and explain how its conceptual model augments the expressive power of object oriented programming languages to dramatically simplify the task of modeling and generating software applications.

Generic Application Infrastructure

As was previously discussed in the Aurora Platform Summary section of this white paper, Aurora’s Protein runtime service processes CSDL specifications to dynamically assemble executable software applications within a virtual machine. Recall from the Theory of Operation section that the virtual machine application assembled by Protein is built up from low-level, binary processor and type handler plug-ins described by the CSDL specification.

The overall architecture of the Aurora platform was designed for assembling and executing generative programming algorithms within the Protein virtual machine. Protein itself is, however, completely generic – it doesn’t know and doesn’t care about the semantics of the application assembled and running in its virtual machine. Protein concerns itself only with the job of assembling the virtual machine application and sequencing its execution.

The point is that Protein can be used to assemble any type of software application given an appropriate CSDL specification and set of supporting plug-in processors and type handlers. Generative algorithms just happen to be one class of application for which Protein is particularly well suited.

Generative programming is a relatively new and emerging discipline within the field of computer science that is not yet widely understood. Before discussing generative applications in any great detail, we will first examine how Aurora can be used as generic infrastructure glue in several more traditional software application categories. These non-generative examples, significant in and of themselves, will help the reader gain a better understanding of how the pieces of the Aurora platform work together. This will be important later when we again take up the topic of creating generative applications in the Code Modeling and Synthesis section of this paper.

Note: Non-generative applications of Aurora such as the ones described in the following subsections employ the Protein runtime service to assemble and execute a final target application. That is, the application assembled in Protein’s virtual machine is the final output of the platform. This differs conceptually from generative applications that employ an additional stage of indirection in which the execution of the application assembled by Protein generates the final output of the platform. Given that Protein employs binary processor and type handler plug-ins to assemble an application, such non-generative applications must necessarily be deployed on a platform that adheres to the portability constraints described in the Customization section of this white paper.

Mutable Signal Processing Systems

Perhaps the most direct conceptual application of Aurora’s Protein runtime service is providing generic infrastructure glue for multimedia and signal processing applications. Multimedia and signal processing applications depend inherently on time-correlated data streams that pass through a network of interconnected transform algorithms, or filters as they are often conceptualized (a well-known class of filters are the COder/DECoder, or CODEC, plug-ins common to most media player applications).

One of the most challenging aspects of developing commercial software that relies on signal processing algorithms is implementing the code responsible for managing the streams of data that pass between the filters. The demands of deploying these applications in production environments typically require the development of extremely flexible infrastructure code that allows for the dynamic substitution of filters and the dynamic adjustment of interconnection network topologies. Developing this type of infrastructure is extremely challenging and is a frequent source of product malfunction due to the inherent difficulty of simulating all real-world usage scenarios in a quality control lab.

An excellent example of this type of application is an H.323 IP-based video teleconferencing application that must dynamically adjust its selection of audio and video compression algorithms at runtime in response to changing host CPU load and available upstream network bandwidth. Multimedia content creation applications, for example non-linear digital video editing systems, similarly require inherently reconfigurable internal software infrastructure in order to choreograph the sequencing of video filter and transform algorithms.

Use of Protein in signal processing applications eliminates the need to write code to manage interconnections between filters. Recall that Protein is inherently based on interconnection patterns, designed specifically to handle the demands of dynamically changing network topologies, and implements a highly-optimized internal program sequencing mechanism suitable for high-speed real-time data processing.

Filters, as discussed in this context, are effectively signal processing algorithms implemented as Aurora processor plug-ins that have been exposed to the Protein runtime using CSDL. The data types transacted by filters are abstracted using Aurora type handler plug-ins and similarly exposed using CSDL.

In this application, Protein is embedded in the signal processing application, which is responsible for programmatically generating CSDL-format application specifications that model the desired signal processing functionality. By driving the process of programmatic CSDL generation with custom state machine logic, a signal processing application can implement remarkably complex dynamic runtime behaviors.
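A minimal sketch of this pattern follows. The codec module names are invented, and the emitted string is a deliberately simplified stand-in for CSDL output (the actual CSDL syntax is not shown in this paper):

```cpp
#include <string>

// State machine logic: pick a pin-compatible codec module based on the
// currently measured upstream bandwidth (thresholds are illustrative).
inline std::string selectVideoCodec(int upstreamKbps) {
    if (upstreamKbps >= 768) return "H263FullRate";
    if (upstreamKbps >= 128) return "H263HalfRate";
    return "H261LowRate";
}

// Programmatic specification generation: emit a (hypothetical) circuit
// fragment naming the chosen module, which would be handed to Protein for
// assembly into the running filter network.
inline std::string buildCircuitFragment(int upstreamKbps) {
    return "<circuit><socket id='videoCodec'><module ref='" +
           selectVideoCodec(upstreamKbps) + "'/></socket></circuit>";
}
```

Regenerating and resubmitting the fragment as bandwidth changes is what gives the application its dynamic runtime behavior; the filter wiring itself never has to be hand-managed.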

Mutable Control and Monitoring

A difficult software architecture problem made significantly simpler by the application of Aurora’s Protein runtime service is the control and monitoring of peripheral hardware devices installed on a host platform (e.g. custom communications and networking devices installed on a PCI bus).

Developing host driver layer software for such peripherals is notoriously difficult. It is typically the case that software running on the peripheral device’s hosting computer is responsible for dynamically monitoring and controlling what the peripheral is doing moment to moment. This requires developers of such systems to design elaborate software mechanisms that implement state-based feedback loops between peripheral subsystems and host software.

Simply writing software to access such a peripheral is a challenge that requires a detailed understanding of arcane subjects such as interrupt service processing, direct memory access transfers, threads, synchronization, and operating system scheduling semantics. Because of this complexity, host driver layer software typically exposes a fairly fine-grained access interface to host application level code. This forces a great deal of complexity up into the host application, which must use the fine-grained access interface exposed by the host driver layer to implement the state-based feedback and control loops required to monitor and control the peripheral.

It is a maxim of good software engineering that dependencies between software layers be minimized. Unfortunately, this ideal is seldom achieved in this type of application due to the time and economic pressure exerted on software developers to deliver something that works quickly and not waste valuable resources engineering for contingencies that may not arise.

Aurora’s Protein can be used to considerably clean up the architecture of this type of system and provide elegant, high-level access interfaces to application-level software programmers. This strategy involves encapsulating the typically fine-grained access interfaces traditionally exposed by the host driver in Aurora processor plug-ins. These processors effectively function as logical proxies for the various hardware subsystems resident on the peripheral device.

A complete discussion of this type of application is beyond the scope of this paper but, briefly, the general idea is that the host driver layer exposes high-level module constructs to the application, defined in terms of interconnections of proxy processors and control processors. The host driver layer provides implementation for the proxy processors, and the application provides implementation for the control processors specified by the driver layer’s exposed module definitions. Parenthetically, this is an extremely succinct mechanism for exposing requirements to an application; the module definitions clearly specify which control processors the application must implement, in addition to providing a great deal of high-level semantic information about the context in which they will be used.

Using the module definitions exposed by the host driver layer, the application then builds a CSDL circuit specification that is passed to Protein. Protein in turn uses this circuit definition to build an internal network of proxy and control processors that implements the required state-based feedback and control loops between the host application and the peripheral device. This provides an extremely clean partitioning that largely eliminates direct coupling between the hardware, the driver control layer, and the application.

Using this strategy, one could, for example, replace the peripheral hardware entirely at the cost of rewriting the proxy processors; make changes to the implementation semantics of the host driver layer by updating the module definitions (one might, for instance, replace a proxy processor with a purely software-based processor to move functionality from the hardware to the host as a cost reduction measure); or make sweeping changes to the functionality of the application-level software by changing the CSDL circuit definition passed to the Protein runtime service.
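The proxy-processor idea can be sketched as follows. The transport interface and class names are assumptions for illustration, and a production proxy would of course carry much richer transactions:

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical transport behind a proxy processor: the same processor body
// works whether the far side is a PCI peripheral, a SOAP endpoint, or a
// pure-software replacement.
class ITransport {
public:
    virtual ~ITransport() = default;
    virtual void send(std::uint32_t reg, std::uint32_t value) = 0;
};

// The proxy processor forwards pin traffic from the circuit onto the link.
class ProxyProcessor {
public:
    explicit ProxyProcessor(ITransport& link) : link_(link) {}
    void onPinWrite(std::uint32_t reg, std::uint32_t value) {
        link_.send(reg, value);  // pin write becomes a link transaction
    }
private:
    ITransport& link_;
};

// A software stand-in for the peripheral, as in the cost reduction example.
class LoopbackTransport : public ITransport {
public:
    void send(std::uint32_t reg, std::uint32_t value) override {
        log.push_back({reg, value});
    }
    std::vector<std::pair<std::uint32_t, std::uint32_t>> log;
};
```

Swapping `LoopbackTransport` for a PCI-backed or SOAP-backed implementation changes nothing above the transport interface, which is the decoupling the text describes.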

As a final comment on this particular type of application, it should be stressed that the choice of a PCI bus in our hypothetical example is arbitrary. At a high level of abstraction, the technique of employing Aurora proxy processors can be applied to abstract inter-subsystem communications links in any distributed system. For example, we might use a proxy processor to encapsulate Simple Object Access Protocol (SOAP) messaging across the Internet instead of PCI bus transactions.

Application Servers

Standardized XML-based protocols for discovering remote software services (Universal Description, Discovery and Integration (UDDI)), querying their interface semantics (Web Services Description Language (WSDL)), and making function calls against their interfaces (Simple Object Access Protocol (SOAP)) are quickly displacing older Object Request Broker (ORB) architectures such as CORBA and DCOM for developing distributed Internet applications. The software industry as a whole has embraced XML and so-called web services protocols based on the XML standard because they are inherently portable, inherently extensible, work well over existing Internet Protocol (IP) networks, and rely on time-tested lower-level protocols for data transport (e.g. the Hyper Text Transfer Protocol (HTTP)).

For all the hype surrounding web services, essentially they are just another scheme (albeit an unusually well conceived and widely adopted scheme) for implementing communications links between software applications. While the existence of standards-based inter-application communications protocols is a boon for software application developers, the emergence of web service protocols has served to underscore the inherent complexity of building, deploying, and maintaining distributed software applications by elevating the importance of this class of software application.

Web services are widely heralded as a technology that helps developers integrate distributed web applications. This is true; web service protocols substantially mitigate portability and interoperability problems and greatly simplify the task of setting up inter-application communication links. However, web service protocols are communication protocols, not integration protocols. It is a common misconception that web service protocols reduce the effort required to develop the software applications that produce and consume the data that is transported via these protocols. They do not; developing these applications is a separate problem.

The following subsections discuss the use of Aurora’s CSDL (effectively an XML-based integration protocol) in conjunction with web service protocols to holistically address the design, implementation, integration, deployment, and maintenance of distributed web applications.

An in-depth treatment of these topics is beyond the scope of this white paper. However, the discussions that follow are a reasonable introduction and are an excellent vehicle for exploring the application of the Aurora platform in more detail.

Embedding Protein in a Web Server

The basic strategy for combining web service technologies with Aurora involves embedding Protein within a web server that leverages the runtime to service requests issued from a remote host on the Internet. Recalling that Protein implements a generic application server architecture, embedding it within a web server process transforms the server into a dynamic web services hub.

As described previously in the Aurora Platform Summary and Theory of Operation sections of this white paper, Protein relies on collections of binary processor and type handler plug-ins and on CSDL-format module and circuit specifications to model, assemble, and execute component software applications. Embedded in a web server, Protein is used to assemble the services that the server exposes via web service protocols.

Aurora’s application to the overall problem of distributed web application production does not eliminate the need to develop libraries of binary plug-ins using object oriented programming languages and development tools. It does, however, provide a high-level framework for abstracting services built from re-usable components, one that offers several compelling benefits compared with conventional approaches to distributed systems engineering.

Static Web Service Integration

The simplest configuration of Protein embedded in a web server initializes the runtime service with one or more complete CSDL-format application descriptions that wholly describe the component software applications to be exposed via web service protocols. In this configuration, Protein is completely encapsulated in the web server, and its runtime behavior is effectively a server implementation detail. In other words, remote clients interacting with the web server see some number of services exposed via standard web service protocols, and Protein is completely hidden.

The primary benefit of this strategy is that the services modeled using CSDL and implemented with Protein benefit substantially from Aurora’s hierarchical software abstraction model, implicit logic re-use metaphor, powerful runtime integration engine, and high-level graphical visualization and modeling tools. In enterprise software parlance, Aurora supports an external Service Oriented Architecture (SOA) model through its implementation of a Service Oriented Integration (SOI) runtime engine. This engine exposes Business Processes (effectively modules in Aurora’s nomenclature) graphically via the Fiat runtime service, providing direct and natural support for Business Process Modeling (BPM) and Business Process Integration (BPI).

Dynamic Web Service Integration

Wholly encapsulating Protein within a web server as in the previous example is extremely useful. However, it only begins to expose the full range of options made practical by exposing the Protein runtime service itself via web service protocols. A more general strategy takes maximum advantage of the Component Software Description Language’s (CSDL) ability to expose entire classes of software application via recombinant modules, and leverages Protein’s ability to transform circuits (interconnected networks of modules) into executable services. This strategy involves exposing several facets of Protein’s controller interface via web service protocols.

Unlike the first example, where we initialized the embedded Protein runtime service with a complete CSDL specification in order to implement a specific set of services, the generalized approach is to initialize Protein with a partial CSDL specification containing only the information necessary to initialize Protein’s Component Software Librarian subsystem. This registers the set of binary processor and type handler plug-ins available to the runtime and additionally defines the set of modules and busses that together determine the potentially large set of application services that can be specified via a CSDL circuit description.

Exposure of the embedded Protein runtime via web service protocols requires two separate services: a module directory service, and a generic service hub. The module directory service accepts a remote request with no input parameters and returns the list of CSDL modules and busses registered with the Protein Component Software Librarian. The generic service hub accepts requests that contain a CSDL circuit specification and a set of input parameters and values that are passed to the actual service assembled and executed by the embedded Protein runtime to produce a set of output results that are returned to the requestor.
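The two services might be sketched as follows. Everything here (class names, the string circuit-spec payload, the placeholder execution) is invented for illustration; a real hub would hand the circuit specification to an embedded Protein instance for assembly and execution:

```cpp
#include <map>
#include <string>
#include <vector>

class GenericServiceHub {
public:
    void registerModule(const std::string& name) { modules_.push_back(name); }

    // Module directory service: no inputs; returns the registered catalog
    // of modules (busses omitted from this sketch for brevity).
    std::vector<std::string> listModules() const { return modules_; }

    // Generic service hub: accepts a circuit specification plus named input
    // parameters and returns named output results. The body below is a
    // placeholder; a real implementation would assemble and execute the
    // circuit via Protein and return the service's actual outputs.
    std::map<std::string, std::string>
    invoke(const std::string& circuitSpec,
           const std::map<std::string, std::string>& params) const {
        std::map<std::string, std::string> out;
        out["echoedSpec"] = circuitSpec;
        out["paramCount"] = std::to_string(params.size());
        return out;
    }

private:
    std::vector<std::string> modules_;
};
```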

This deployment scenario allows a remote host to query our web server for its capabilities (wholly specified as a list of available CSDL modules and busses) and subsequently issue specific requests against our server that are satisfied dynamically. Whereas in the first example we completely specified the set of supported services in advance, in this case it is left to the requesting client application to decide exactly what services it requires; the requestor can make requests against any service that can be described in terms of a CSDL circuit specification.

There are several compelling advantages to assembling services dynamically in response to incoming requests using this so-called generic service hub deployment strategy.

  • You don’t need to know or predict a priori exactly what services will be required.
  • The evolution of client software systems is decoupled from your service infrastructure; clients can evolve independently within the high-level constraints imposed by the module directory service.
  • Capabilities are easily added and removed from the server through CSDL updates to the Protein component software librarian. Updated capabilities are reflected through the module directory service and instantly available for use by client applications.
  • The implementation details of your server remain hidden; the module directory service only reports only enough information about the registered modules and busses to allow a client application to assemble a valid CSDL circuit specification that it embeds in requests. The underlying CSDL specification of modules, busses, processors, and types is inherently mutable; changes to the CSDL used to initialize Protein’s component software librarian can be made without changing the semantics of the system.
  • At the lowest level, binary processor and type handler plug-ins can be revised or replaced to affect implementation changes independent of the high-level CSDL representations. For example a processor might be replaced to a proxy processor (as described in the previous Mutable Control and Monitoring application example) in order to distribute implementation of a service over a server cluster as discussed in the following subsection.

Scalable Server Clusters

One of the largest challenges facing the designers of distributed software applications, such as those deployed using web service protocols, is scalability: the ability to respond quickly to changing demands placed on the software and hardware subsystems used to deploy distributed applications. Market pressure to deploy quickly often overrules the inclination of software development staff to engineer for these contingencies in advance.

At this point it should be clear that services implemented by Protein are inherently mutable and extensible through the CSDL specification mechanism, regardless of whether Protein is embedded using a static (services specified in advance) or dynamic (services integrated on demand) deployment strategy.

At a high level of abstraction, scaling a distributed software application requires moving functionality from a software application executing on one platform into another application executing on another platform. The ease with which this can be accomplished without disturbing the internal plumbing of the original software application determines how easy or hard it is to scale a distributed software application.

In the Mutable Control and Monitoring application example presented earlier in this paper, the concept of a proxy processor was introduced and discussed in some detail. Briefly, a proxy processor encapsulates communications between two disparate subsystem contexts that function as parts of a larger distributed system. Stated another way, proxy processors are logical placeholders for remote subsystems in a distributed system that follow all the same semantic rules as normal, non-proxy processors.

Given this background, scaling services implemented by the Protein runtime involves identifying resource-intensive operations implemented by standard binary processor plug-ins and moving their implementation to another server using the proxy processor technique. There are several methods of effecting the actual scaling: replacing the affected binary processor plug-in with a proxy processor, modifying CSDL module specifications to reference the new proxy processor instead of the locally executing standard processor, or defining entirely new modules in terms of the proxy processor(s). There are situations where one of these deployment methods might be preferable over another, but that is beyond the scope of this discussion.
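The processor-replacement technique can be sketched abstractly as follows. All names here are hypothetical; CSDL module specifications are modeled as plain dictionaries, and the wire protocol is faked in-process rather than using any real remoting mechanism.

```python
# Hypothetical sketch of the proxy-processor scaling technique.
# Names and structures are illustrative, not the actual CSDL format.

def local_hash_processor(data):
    """A resource-intensive processor running on the local server
    (a toy stand-in workload)."""
    return sum(ord(c) for c in data) % 997

def make_proxy_processor(remote_call):
    """Wrap a remote invocation so it obeys the same calling
    convention as a normal, non-proxy processor."""
    def proxy(data):
        return remote_call("hash", data)   # forwarded to a remote host
    return proxy

def fake_remote_call(op, data):
    # Stand-in for the wire protocol; the remote server runs the
    # same processor implementation on different hardware.
    assert op == "hash"
    return local_hash_processor(data)

# A CSDL-style module specification referencing a processor.
module_spec = {"name": "digest", "processor": local_hash_processor}

# Scaling step: edit only the CSDL-level reference; every consumer
# of the module is untouched because the calling convention holds.
module_spec["processor"] = make_proxy_processor(fake_remote_call)

# The module behaves identically from the caller's point of view.
print(module_spec["processor"]("aurora"))
```

The design point being illustrated is that the substitution happens entirely at the specification level; nothing upstream of the module needs to change.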

The important point to understand here is that proxy processors allow functionality to be moved around a distributed system with little or no impact on existing software defined in terms of a CSDL specification. Further, this ability comes as part of the intrinsic architecture of the Aurora platform; that’s just the way it works.

Compared with middleware architectures that rely on centralized brokers to dispatch objects and handle inter-object messaging generically, proxy processors are a simpler and more efficient means to the same end; they significantly reduce overall system complexity by decentralizing inter-subsystem communication and providing an intuitive mechanism for applying it only when, and where, it is needed.

Given this background, a scalable server cluster is implemented by broadly applying proxy processors to abstract inter-server communications links between some set of servers. Potentially every server in the cluster embeds Protein and employs various combinations of the techniques described in this paper to implement the services that it supports. Advanced deployments of multiple servers, each embedding the Protein runtime, might for example leverage a combination of these techniques to implement load-balanced transaction servers.

Service Testing and Security Considerations

Testing of any software application, distributed or not, is an undertaking made difficult by the staggering number of possible usage scenarios supported by modern software. Distributed software systems are even harder to test because they are inherently more complex architecturally, the number of free variables (e.g. network congestion, server failure, mismatched software versions) is increased, and empirical test data must be gathered from potentially many servers and then correlated in order to validate correct operation.

In an industry driven by budget and time-to-market constraints, a strategy of rigorously testing key subsystems in isolation from the wholly integrated system, so-called unit testing, is adopted. Testing of the wholly integrated system is typically accomplished by driving the system with some known set of inputs and comparing the outputs against expected results.

Although unit testing works fairly well, thorough testing of a fully integrated distributed software application is so complicated, human-resource intensive, and time consuming that typically it is only practical to test a small subset of the possible usage scenarios. Tools that perform statistical sampling of program execution at runtime, and tools that perform high-level source code analysis, can be employed to gain an understanding of how much of the code is actually exercised during testing. However, these so-called test coverage tools produce highly complex results that can be accurately interpreted only with assistance from development staff, and many organizations simply do not have enough human bandwidth to do the process justice.

Directly related to testing is the issue of security. Speaking very generally, the definition of software security depends to a large extent on the purpose of the software application. For example, a specific set of input conditions that crashes a software application used to draw pictures would likely be considered a routine implementation bug. However, a specific set of input conditions that crashes a software application used to guide a missile to its target is a security threat – not merely an implementation bug.

Security auditing of mission critical software systems requires extremely thorough testing that takes into account all the possible combinations and permutations of system input conditions including the free variables introduced when a system is partitioned and distributed across multiple hosts on a network. Security vulnerabilities in a software system are virtually always discovered by subjecting the system to atypical or specifically incorrect sets of input conditions that expose holes in testing coverage. As evidenced by the steady stream of security-related patches issued by major software vendors for their flagship products, obtaining 100% testing coverage remains an illusory goal.

In order to understand how the application of the Aurora platform technologies simplifies the complex task of testing software, we need to examine the common failure points in software applications in some detail.

As previously discussed, the practice of unit testing isolated software subsystems is fairly effective; subsystems are well defined in terms of their inputs and outputs and are typically affected by far fewer free variables than the larger systems into which they are integrated. This significantly simplifies the task of creating test vectors, and of ensuring that these vectors exercise all the execution paths within the subsystem.

Why then do systems integrated from these well-tested subsystems fail? The answer is that one of the primary failure points in a large software application is the hand-crafted glue code that holds all the well-tested subsystems together. This is so because it is inherently difficult to exercise the glue code in isolation from the subsystems it integrates. As the overall system becomes larger and more architecturally complex, the task of testing the integrating glue code looks more and more like the vexing task of testing the overall application.

Software applications described in CSDL and implemented by the Protein runtime service have a distinct advantage, however; Protein implements the integrating glue code generically and is itself a subsystem amenable to rigorous unit testing. Recalling our Jacquard Loom analogy from the Theory of Operation section of this paper, Protein is the metaphorical loom. Once the loom is thoroughly tested and works correctly, cloth can be woven with a high degree of confidence that errors in the cloth are the result of errors in the loom’s rotating belt and/or the inappropriateness of the set of thread shuttles installed – two factors that can be independently unit tested.
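The loom argument can be illustrated with a toy model. The weave function below is a hypothetical stand-in for the generic integration engine (the "loom"); it is not the Protein API. The point is that the glue is a single generic function that can be exhaustively unit tested with trivial stand-in plug-ins, entirely separately from the production plug-ins it will eventually connect.

```python
# Hypothetical sketch: a generic "loom" (integration engine) that can
# be unit tested in isolation from the plug-ins it connects.

def weave(spec, plugins, value):
    """Apply the plug-ins named in `spec` to `value`, in order.
    This one generic function replaces hand-crafted glue code."""
    for name in spec:
        value = plugins[name](value)
    return value

# Unit test the loom with trivial, fully controlled stand-in plug-ins;
# no production plug-in is involved.
trace = []
probes = {"a": lambda v: trace.append("a") or v + 1,
          "b": lambda v: trace.append("b") or v * 2}
assert weave(["a", "b"], probes, 3) == 8       # (3 + 1) * 2
assert trace == ["a", "b"]                      # ordering preserved

# Production plug-ins are then unit tested individually; the
# integrated system inherits confidence from both test suites.
```

Because the glue is generic, test vectors for it never need to grow with the application, which is the crux of the coverage argument made above.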

Using Aurora, specifically Protein, to provide generic runtime infrastructure for a software application therefore significantly mitigates the problems inherent to testing hand-integrated solutions, by virtue of the fact that unit testing methodologies alone suffice to ensure an extremely high degree of test coverage.

Service Maintenance and Technology Migration

As explained in the previous subsections, use of the Aurora platform’s CSDL and Protein runtime to generically integrate and execute software-based services greatly reduces the effort required to specify, test, and deploy production grade solutions. Architecturally, Aurora’s electrical circuit-inspired metaphor for partitioning software systems has the additional benefit of providing a great deal of logical separation between the various levels of abstraction defined by the CSDL specification.

This logical separation between CSDL abstraction layers protects investment in software development by providing for low-level re-use of component software, mid-level re-use of component software interconnection patterns, and re-use of high-level application specifications. CSDL’s high-level circuit specification construct is particularly valuable, as it serves as a reference to engineers assigned to maintain the deployed software application in order to fix bugs, add and remove features, and evolve the implementation to take advantage of advances in hardware and software technologies over time.

Traditionally integrated software applications are difficult for maintenance staff to understand. If they are lucky, they may have a reasonably up-to-date specification document, or maybe even some UML models to guide them. More often than not, however, they must either directly involve the design engineers or deduce the high-level partitioning details directly from the source code. Directly involving design engineers in the maintenance process is not desirable; design staff is usually immediately re-deployed to start work on other projects, may not accurately recount the specific details without some time-consuming investigation of the source code, or may even have left the company after completing the project. Maintenance engineers are typically junior design engineers in training who are not yet proficient enough with high-level architectural design considerations to gain a deep understanding of the overall system partitioning directly from the source code.

In contrast, CSDL circuit specifications cleanly expose the high-level architecture of a deployed software system and can be used directly by junior maintenance staffers to understand the overall system quickly without involving the design engineering team. Maintenance tasks are typically carried out by making changes from the top of the CSDL abstraction hierarchy (the circuit specification) down. Major changes can be easily implemented entirely in CSDL.

In some cases, changes to the actual binary processor and type handler plug-ins are required, a job that requires the involvement of the design engineering staff. When the need for new plug-ins is discovered, however, a remarkable thing has happened: junior staff has motivated a complete set of requirements for the plug-in in terms of CSDL that a senior design engineer can simply go ahead and implement without becoming deeply involved in the maintenance activity. Altogether, this process helps keep all engineering staff productive and focused on their primary job responsibilities.

Generative Programming Applications

One of the most important applications of the Aurora platform is using it to synthesize custom application code from high-level models. The Aurora platform uses generative programming techniques to model and synthesize application code using a new model developed by Encapsule Systems that is introduced in this section.

Getting Started

Earlier in the Theory of Operation section of this paper it was explained that the Aurora platform can be used to model, integrate, and execute two broad classes of software application: generative software applications, and everything else. With respect to the Aurora platform, this distinction is meaningless – CSDL and the Protein runtime service that processes CSDL specifications are inherently independent of software application semantics. That’s the whole point; Aurora does not know the function of the low-level processor and type handler plug-ins that provide application functionality and by extension therefore has absolutely no intrinsic coupling with the semantics of the overall application. Generative application, non-generative application, or a combination of the two – it doesn’t matter; from Aurora’s perspective all are just interconnection networks with opaque semantics.

This point is emphasized to stress that generative applications do not rely on any additional Aurora platform facilities beyond those already introduced in previous sections of this white paper. Rather, in order to produce generative programming applications using Aurora, we need to understand the semantic differences between generative and non-generative software applications, and then explain how Aurora makes it easy to express these semantics in terms of CSDL.

With this background, we first introduce some of the basic tenets of generative programming, and then move on to discuss how Aurora is used to model generative programming application semantics.

A Brief Introduction to Generative Programming

Generative programming is a relatively new and emerging discipline within the field of computer science that endeavors to abstract generalized systems knowledge about a problem space, or domain, into a form that can be combined with a list of limiting constraints in order to programmatically create a customized software application that leverages domain knowledge to perform some specific, useful task. There are two main goals of this science:

  • Provide a natural and easy way to manage large amounts of generalized systems knowledge obtained from experts and provide a mechanism that allows it to be transferred to non-experts in a useful and succinct manner.
  • Automate the arduous process of hand-integrating custom software solutions – a process that, using non-generative methodologies, requires expertise in both the application’s targeted problem space and in software engineering.

To date, study of generative programming systems has largely been an academic research exercise, but that is beginning to change. As software system complexity has increased, generative programming has begun to attract attention from industry and professional trade organizations such as the Association for Computing Machinery (the ACM held its first conference on Generative Programming and Component Engineering late in 2002).

Compilers vs. Generators

High-level language compilers leverage expert domain knowledge about specific microprocessor architectures to produce machine code given a succinct statement of limiting constraints expressed in terms of the high-level programming language syntax and semantics. Compilers are therefore conceptually similar to generative programming systems but do not satisfy the definition because one of their inputs, expert domain knowledge about specific microprocessor architectures, is held constant and cannot be modified by the user of the system. Generative programming systems by definition address the more general situation where both the domain knowledge and the limiting constraints specifications are inherently mutable.

There are competing viewpoints about the best architecture and implementation strategy for building generative programming systems. Researchers working in this field typically fall into one of two philosophical camps: those working on imperative programming language techniques, and those working on functional programming language techniques.

Briefly, imperative programming languages rely on ordered sequences of instructions to express application semantics, whereas functional programming languages are concerned with expressing the overall semantics of an application as a single, large expression whose evaluation produces the final system output. These approaches differ conceptually in that the former motivates semantics from the bottom up, whereas the latter motivates semantics from the top down.
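The contrast can be illustrated with a toy example (not drawn from Aurora itself): computing the sum of squares of a list, first as an ordered sequence of state-mutating instructions, then as a single expression whose evaluation is the result.

```python
# Toy illustration of the imperative vs. functional styles.

def sum_squares_imperative(numbers):
    """Bottom-up: an ordered sequence of instructions mutating state."""
    total = 0
    for n in numbers:
        total += n * n
    return total

def sum_squares_functional(numbers):
    """Top-down: one expression whose evaluation produces the output."""
    return sum(n * n for n in numbers)

# Both styles compute the same result by different routes.
assert sum_squares_imperative([1, 2, 3]) == 14
assert sum_squares_functional([1, 2, 3]) == 14
```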

Fully explaining and contrasting these approaches in any detail is beyond the scope of this white paper. We note, however, that neither purely imperative nor purely functional approaches have succeeded in motivating a generalized approach to generative programming that works for all software application problem domains and addresses high-level usability issues.

The Aurora Platform proposes a new hybrid model that employs both functional and imperative programming metaphors. At a high-level of abstraction a CSDL circuit specification is a functional program specification that is ultimately transformed by Protein into an executable built up from imperatively programmed processor plug-ins.

To introduce Aurora’s hybrid generative programming model, the next section explores the hardware-based signal generation technologies that are semantically emulated by Aurora when it is applied to solve generative programming problems.

Hardware-Based Signal Generator Analogy

Hardware-based signal generators are combinations of analog and digital circuitry designed to transform some set of inputs into some set of outputs that constitutes the generated signal. Hardware signal generators are used extensively in electrical test equipment and other signal processing applications such as communications systems, control systems, remote sensing…

At a high level of abstraction, such electrical circuits are conceptually similar to generative programming systems; their inputs are effectively a specification of limiting constraints, their internal functionality encapsulates specific expert domain knowledge, and their output signals are generated according to algorithms determined by the combination of expert domain knowledge and a specification of limiting constraints. To the extent that the hardware implementation of these systems is immutable, hardware signal generators are more akin to software compilers than generative programming systems. See the Compilers vs. Generators discussion above for an explanation of this distinction.

Now suppose the existence of a hardware signal generation system that is implemented not in terms of rigid interconnections between fixed-function analog and digital circuit components, but rather in terms of mutable interconnection topologies and a combination of socketed and programmable circuit components. Such a system could easily be designed using digital cross-bar switches, physical socket fixtures, and devices such as Field Programmable Gate Arrays (FPGAs) and microprocessors.

Conceptually, a mutable hardware-based signal generation system as described above starts to look and feel a lot like a generative programming system. This system is no longer a metaphorical compiler because the domain knowledge it encapsulates is exposed and can be changed by reconfiguring cross-bar switches, selecting specific components for insertion into sockets, updating FPGA equations, and modifying microprocessor programs.

The analogous relationship between the mutable hardware-based signal generator architecture described above and the Aurora platform should be immediately obvious given previous explanations.

Aurora’s Generative Programming Model

Generative programming applications are constructed using Aurora by semantically configuring the platform to integrate and execute software-based emulations of mutable, hardware-based signal generator architectures.

Domain knowledge is expressed in terms of a set of CSDL module and bus specifications, limiting constraints are expressed in terms of CSDL circuit specifications that are visualized and edited using the Fiat graphical user interface runtime service, and integration and runtime execution of the emulation is provided by the Protein runtime service.

This is made clearer by understanding that the output “signal” of our software-emulated signal generator is typically a collection of files that together comprise a wholly integrated, customized software application; these files are passed to an external compiler, cross-compiler, or interpreter depending on the target platform and implementation language.

Earlier we learned that Protein transforms a CSDL circuit specification into a software application by building an interconnected network of low-level binary processor plug-ins that it instantiates and subsequently executes within a virtual machine. In the previous section on Generic Application Infrastructure, we employed Protein to actually orchestrate the integration and execution of a final target software application.

When the Aurora platform is used for generative programming applications, the software application resident in Protein’s virtual machine is a generator, or code synthesizer, that we metaphorically think of as a software-based emulation of a signal generator system. Executing this emulation generates the source code for our final target software application. Note that this extra step in the overall process is not a function of the Aurora platform but rather is a function of the semantics of the low-level binary processor plug-ins, and of the CSDL processor and module specifications that reference them.

To make this somewhat less abstract, consider that a generator is effectively a set of expressions that evaluates to a set of program source code files (the synthesized output of the generator). Thus:

  • CSDL circuit specifications determine the set of expressions that a generator implements.
  • CSDL module specifications (from which CSDL circuit specifications are composed) are effectively re-usable subexpressions.
  • CSDL processor specifications (from which CSDL module specifications are composed) play the role of expression operands and operators in a generator.

To bring this point home, consider the user dragging and dropping tiles in the Fiat user interface runtime. Recalling that Fiat’s tiles are visualizations of CSDL modules and that interconnected tiles are visualizations of CSDL circuits, it should now be clear that the user is actually composing expressions from re-usable subexpressions.
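The expression-tree mapping sketched above can be illustrated with a toy generator. Everything here is hypothetical (the emit functions, the file naming, the dictionary representation); it shows only how processors, modules, and circuits correspond to operators, re-usable subexpressions, and full expressions whose evaluation yields source files.

```python
# Hypothetical sketch mapping CSDL constructs onto an expression
# tree whose evaluation synthesizes program source files.

# "Processors" play the role of expression operands and operators.
def emit_header(name):
    return f"// generated header for {name}\n"

def emit_body(name):
    return f"int {name}() {{ return 0; }}\n"

# A "module" is a re-usable subexpression composed of processors.
def module_source_file(name):
    return emit_header(name) + emit_body(name)

# A "circuit" composes modules into the full generator expression;
# evaluating it synthesizes the generator's output files.
def circuit(names):
    return {f"{n}.c": module_source_file(n) for n in names}

files = circuit(["init", "shutdown"])
print(sorted(files))   # the synthesized file names
```

In this framing, a user rearranging tiles in Fiat is rearranging the subexpressions of `circuit` without ever seeing the emitted text.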

Example Problem Domains

Embedded Systems Software Generation

Nearly every electronic device sold today is at least partially dependent on programmable digital circuit components. Vendors of semiconductor processors, processor cores, chipsets, and subsystems must compete vigorously for design wins on the basis of price, performance, and tools support. Any differentiator that tips the balance for these vendors has an immense impact on the number and size of design wins. There is a lot of money at stake in this market, and differentiators are hard to come by.

A substantial cost involved in developing any new electronic device that relies on programmable digital components is the development of embedded software. A maxim of embedded systems engineering is that, depending on the project, anywhere from 60-80% of development resources are devoted to software development. Hardware and software design are equally complex and demanding tasks. Why then this disparity?

The answer is that the hardware engineers utilize development tools and methodologies far superior to the ones employed by software engineers. Additionally, hardware engineers are typically supported by their component vendors much more efficiently; component vendors supply CAD models, cell libraries, and detailed reference design schematics…

Major component vendors support software engineers with traditional embedded systems tools such as assemblers, linkers, cross-compilers, simulators, and emulators. These are essential, but not sufficient. To an embedded systems developer, getting the job done still requires a lot of reading and poking around through data sheets, and mountains of vendor-supplied application source code samples that are typically so trivialized as to be little more than a starting point in the overall design process. Smaller component vendors typically rely on third-party tool vendors to support their components, and the smallest vendors have no software tools support at all.

In order to make life easier for the embedded software engineers, all of the expert domain knowledge related to software development needs to be packaged in a way that is easily transferred and readily useable. This is where the Aurora platform can help.

Using Aurora’s generative programming model, the vendor can create an integration kit that encapsulates all the low-level details of device programming taken from data sheets, all the disparate application source code examples, and all the tips and tricks of the vendor’s application and support staff. Bundled with Aurora, this integration kit transfers this knowledge succinctly to the embedded software engineer in much the same way that vendors currently support hardware designers with models and schematics.

To the OEM, this represents shortened design cycles and reduced development costs – factors that add significantly to the overall value proposition of a hardware/subsystem vendor who, in order to maintain a reasonable profit margin, must constantly fight to keep their product(s) elevated above commodity status.

Distributed Web Application Generation

Hybrid Applications

[ under construction ]

Usage Models

The Aurora platform has two main constituencies: those who use it to model and build software, and those who customize its semantics via its extension mechanisms. The architecture of Aurora suggests a usage metaphor for each of these constituencies that is decidedly different from what people are used to. To put this in some perspective, we briefly digress into a philosophical discussion of the factors motivating Aurora’s architecture.

Historically, new programming languages have been developed to address the immediately obvious deficiencies of current programming languages; object oriented languages in response to procedural languages, procedural languages in response to assembly language, assembly language in response to machine code, machine code in response to building new hardware.

Attacking the problem of programmer productivity incrementally in this fashion has unarguably led to great advances in programming languages. However, the software engineering methodologies and development tools associated with each new generation of programming language have always been a remnant of this incremental evolution rooted in machine code programming. It’s no wonder that we find the methodologies and tools in use today deficient – they were never conceived to leverage what human beings find natural and intuitive.

By contrast, the Aurora platform was conceived by first considering how best to leverage what human beings find natural and intuitive, and then working backwards to establish a bridge to what is currently possible to implement using the current generation of programming languages. Traditionally, this expanse has been considered too wide to span. However, marked advances over the past decade in object-oriented programming languages and high-order techniques of abstraction such as template-based generic programming have sufficiently narrowed the gap to the point where building this bridge is now possible.

The result is not a programming language. Rather, it is a specification methodology and a runtime service architecture for transforming specifications into source code. Applying this methodology dictates specific usage metaphors that do not share their roots with other software development tools that were specifically designed to model the constructs of current programming languages. Stated more plainly: the Aurora platform is designed to solve the problems inherent to software modeling and code generation at a much higher level of abstraction than current development tools, and it is therefore used differently.

ISV/OEM Usage Model

[ under construction ]

End-User Usage Model

[ under construction ]

Product Differentiation

[ under construction ]

Status and Availability

[ under construction ]

Technology Transfer

[ under construction ]

Core Technology Licensing

[ under construction ]

Consulting Services and Support

[ under construction ]

Further Information

White Paper Updates

The most recent revision of this white paper is posted at: http://www.encapsule.com/go.htm?go=aurora-whitepapers

Annotated CSDL Examples

Additional information on the CSDL specification, including several annotated examples, is posted at:


Comments and Feedback

Thank you for investing the time to read this paper. It would be very much appreciated if you additionally take the time to comment on what you have read so that this paper can be adjusted to serve as broad a readership as is possible given the subject matter.

Your insight is extremely valuable. Please e-mail feedback, questions, comments, and criticisms to comments@encapsule.com. Messages received will be read carefully and responded to directly if possible.

Additional Technical Data

Complete technical specifications, demo applications, and example source code are available from Encapsule Systems, Inc. under non-disclosure agreement. Please contact Encapsule Systems for further information.

Inquiries and Demonstrations

To learn more about the Encapsule Aurora platform, or to arrange a live demonstration of the Aurora technology, please contact:

Current (2012)

http://www.encapsule.org <– under construction as of May, 2012

http://blog.encapsule.org <– this website

http://twitter.com/#!/encapsule (@encapsule)

About Encapsule Systems

Encapsule Systems, Inc. is a software research and development company that designs and markets advanced component software modeling and code generation tools based on the Aurora platform technologies.

Work to define the CSDL specification and write a reference implementation of the Aurora platform runtime services is an effort that dates back to a series of research experiments begun in 1997. Based on promising results from these early research experiments, efforts to bootstrap Encapsule Systems began in earnest in early 2001. In 2002, Encapsule Systems incorporated in the state of Maine, USA as a C-type corporation. Currently the company is privately held and financed.