Intermetrics, Inc. - RED Rationale - March 1979 - Capsules and Separate Translation


6.   CAPSULES AND SEPARATE TRANSLATION

6.1  USE OF CAPSULES

6.1.1  INTRODUCTION

The capsule feature of RED, in accordance with Steelman (SM 3.5), is not simply a facility for defining abstract data types, but a general mechanism for controlling scopes of names. Among the encapsulation mechanisms provided by recent languages, capsules are most like modules in Modula [Wi77]. In the languages CLU [LSA77], Alphard [WLS76], Simula [DNM69], and Euclid [LHLMP76], the encapsulation mechanism defines a new type. In both RED capsules and Modula modules there is no particular association between the mechanism and abstract types; rather, a set of related definitions can be defined, some of which are exported for general use. RED differs from Modula primarily in the way that hiding of underlying properties for abstract data types is accomplished, and in the way that the lifetime of definitions in a capsule is defined.
RED capsules can be used to define a single abstract data type, a group of related abstract data types, a group of related procedures (e.g., an arithmetic package) and a group of related data (a compool). In addition, the capsule serves as the unit of separate translation. In this section we discuss uses of the capsule mechanism and some related technical issues. Separate translation is treated below, in Section 6.2, including a discussion of the EXTERNAL feature.

6.1.2  EXPORTING FROM A CAPSULE

The capsule is a mechanism for controlling scope of names. Only those definitions that are explicitly exported are available for use outside the capsule. All definitions in the capsule are potential candidates for exporting except for labels. Prohibiting the exporting of labels is consistent with RED's prohibition on non-local goto's. Ordinarily definitions will be exported by listing their names explicitly in the export list; variables can be exported readonly, if desired. For some uses, e.g., libraries and compools, all definitions are intended to be exported, which can be accomplished by saying "EXPORTS ALL".
In Modula, the lifetime of the definitions in a module is the lifetime of the block that contains the module declaration. In RED, lifetime may be determined not by where a capsule is defined, but by where the definitions it provides are useful. These definitions are made available by exposing the capsule; e.g., "EXPOSE ALL FROM NEW M". The lifetime of the definitions is the lifetime of the scope to which the exposure is local, and the exported definitions are local to that scope. Since it is explicitly exposed, a capsule can have parameters that permit it to be specialized to the using environment. These parameters, for example, can be used to determine the size of an internal table, or to initialize some data of the capsule. The capsule may not export its parameters; this rule avoids a source of aliasing.

6.1.3  ABSTRACT DATA TYPES

An abstract data type is defined by writing a capsule that exports the type and the operations that determine the abstract behavior. For example, to define the abstract type "intstack", one can define the following capsule:

Inside the capsule, the abstract type is defined in terms of an underlying type, and the abstract operations are implemented in terms of that underlying type. In addition, internal, helping routines can be defined; these are not visible externally because only the exported operations are available for outside use.
The abstract type is defined using a TYPE declaration (not an ABBREV declaration); for example:

Use of a TYPE declaration makes the abstract type a new type distinct from all other types. In addition, the abstract type can have attributes (e.g., "size" above), and attribute inquiry is automatically available for external use if the type is exported.
A type declaration provides a few operations for the newly defined type; in particular, .ALL, :=, =, and selectors if the underlying type is a composite type. For example, intstack has operations .ALL, :=, =, .top, and .els. These operations can be used in implementing the operations of the abstract type. However, with the exception of := and possibly =, they should not be exported. The .ALL, .top, and .els operators are not abstract operations; i.e., they make no sense as far as the abstract type is concerned. Furthermore, they reveal information about the underlying type. Sometimes even := does not make sense as an abstract operation, in which case, it is not exported.
Occasionally, the definitions of := and = that are automatically provided for a new type are not the correct ones for the abstract type. This is especially likely when the underlying type is indirect, since := for indirect types introduces sharing, and = tests pointer equality. The definer of the new type can override the automatic definitions by providing explicit definitions of := and = for the new type. The issue of user defined assignment is discussed further in Section 9.1.
In contrast with the approach taken in Modula, the semantics of an exported type do not change when it crosses a capsule boundary. In Modula a type becomes "opaque" as it crosses the module boundaries; i.e., knowledge of its underlying type is lost. In RED, the type is the same both inside and outside the capsule, but some operations available inside may not be available outside because they were not exported.
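The export discipline described above can be approximated in other languages. The following is a rough analogy in Python, not RED syntax: a function stands in for the capsule, and the dictionary it returns stands in for the export list. The names make_intstack_capsule, max_size, push, pop, and empty are hypothetical, chosen only for illustration; the fields _top and _els loosely mirror the .top and .els selectors mentioned earlier.

```python
# Illustrative analogy only: this is Python, not RED.
def make_intstack_capsule(max_size):
    """Stand-in for exposing a capsule; returns only the exported definitions."""

    class IntStack:
        # Underlying representation: a top index and an element array.
        __slots__ = ("_top", "_els")

        def __init__(self):
            self._top = 0
            self._els = [0] * max_size

    def _is_full(s):
        # Internal helping routine: defined in the capsule but not exported.
        return s._top == max_size

    def push(s, x):
        if _is_full(s):
            raise OverflowError("intstack overflow")
        s._els[s._top] = x
        s._top += 1

    def pop(s):
        if s._top == 0:
            raise IndexError("intstack underflow")
        s._top -= 1
        return s._els[s._top]

    def empty(s):
        return s._top == 0

    # The "export list": the type and its abstract operations, nothing else.
    return {"intstack": IntStack, "push": push, "pop": pop, "empty": empty}


ops = make_intstack_capsule(2)
stack = ops["intstack"]()
ops["push"](stack, 10)
ops["push"](stack, 20)
```

The type object is the same inside and outside the function; what differs is which operations the user has been handed. Python cannot enforce the hiding of _top and _els, so the analogy holds only by convention.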
A capsule can be used to define several abstract types. This is useful when operations (such as conversion functions) need access to the underlying properties of two or more abstract types. Capsules can be nested in other capsules. This provides the ability to define two or more types with underlying properties hidden from one another, but with special access rights with respect to one another. For example:

Operation q3 for t2 is only available for use inside capsule def; it is known in capsule deft1 and may be used in defining t1. Such an ability would be useful, for example, in a capsule controlling the use of a group of abstract printers, where the individual abstract printers can be used outside (i.e., written), but where creation of new abstract printers can only be done by an operation on the group as a whole.
Another use of nested capsules is to define layers of abstract machines, where some definitions provided by a lower level machine are available to a user of a higher level machine, and others are not.
As shown in the following example, a capsule may be used to define a single instance of an abstract object rather than to define a new abstract type:

This capsule defines a single object, which is represented by the data internal to the capsule (the variable named hashtable). The size of hashtable is determined at the time the capsule is EXPOSEd, at which time the body of the capsule (the REPEAT statement) initializes hashtable. The advantage of such a capsule is that there is no need to pass hashtable as an actual parameter to the operations.

6.1.4  OWN DATA

Capsules can have "own" data; i.e., local variables which are not exported. The own data can be initialized by the capsule body when the capsule is exposed; the initialization can be based on the values of the capsule parameters. One illustration of this is the symbol_table capsule in the previous section. As another example:

Note that random is declared to be an abnormal function, because different invocations with the same actual parameter values (in this case, none) are intended to provide different results.
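As a loose illustration of own data (in Python, not RED), the sketch below keeps a generator state that is initialized from a capsule parameter and is reachable only through the one exported function. The name expose_random_capsule and the linear congruential constants are assumptions made for this sketch; they are not taken from the RED documents.

```python
# Illustrative analogy only: this is Python, not RED.
def expose_random_capsule(seed):
    # "Own" data: local to the capsule, initialized when it is exposed,
    # and never exported.
    state = seed

    def random():
        # Each call updates the own data, so repeated calls with the same
        # (empty) argument list return different results.
        nonlocal state
        # Arbitrary illustrative linear congruential step.
        state = (1103515245 * state + 12345) % 2**31
        return state / 2**31

    # Only the function is exported.
    return random


random = expose_random_capsule(seed=1)
first = random()
second = random()
```

Because the result depends on hidden state that changes between calls, a function of this kind corresponds to what the text calls an abnormal function.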
Another use of own data is to collect statistics about the use of a capsule. Consider the case of two capsules that both define the same abstract data type, but one collects statistics about usage. It is desirable that both capsules appear the same to the using programs: it should be possible to collect information about the use of a module, provided the behavior of the module doesn't change in any other way, without affecting using modules. To satisfy this goal, functions must be permitted to modify data local to their capsule. Such an ability is required by Steelman (SM 4C) and is provided by RED. Note that such modifications are "invisible" side effects, and will not cause a function to become abnormal. See also Section 5.2.

6.1.5  LIBRARIES AND COMPOOLS

Capsules can be used to provide libraries of useful procedures and data types. For example, a capsule might provide a library of mathematical functions (e.g., various integration methods) and related data types (e.g., complex numbers). Capsules can also be used to provide compools of related data items. The body of the capsule can initialize the data, based on the capsule parameters if desired. Capsule parameters can also be used to control the sizes of the data.

6.2  SEPARATE TRANSLATION

Block structure, or nested name scopes, is one of the major contributions of ALGOL-60. Almost all of its successors have adopted this approach to name scoping. In recent years, the need to hide declarations has become clear and has generated interest in encapsulating some declarations so that they are not as generally visible as the scope rules would otherwise have made them (capsules perform this function in RED). Nonetheless, the basic concept of names being defined in outer scopes and used in statically enclosed inner scopes remains the best way of sharing declarations of both data and program objects.
The ALGOL-60 rules require that a name's declaration be visible (according to the scope rules) at any point where the name is referenced. A direct consequence of this rule is that everything must be translated together. In the world of embedded applications, such an all-or-nothing philosophy is unacceptable:

Embedded applications often involve tens of thousands or hundreds of thousands of lines of code. Retranslating the entire program for each change is not economically feasible.

Large applications are typically built by large groups of people. It is an organizational impossibility to have them all working on parts of the same source text and yet the partial and total integration of their components must be easily achieved. This integration must be performed without invalidating any component testing performed before integration.

Since translators, linkage editors, and other support software can have bugs, final verification must be performed on the embedded object code, not on the source. If the whole program is retranslated, the great bulk of the code must be retested even though no source level changes were made. Such retesting is unacceptably costly in both money and time.

Programs may contain classified information (e.g., encryption/decryption routines). The invoker of a routine has no need to know about the internals of the routine.

A language and associated translators for supporting embedded applications must provide a separate translation capability. This capability must sacrifice none of the convenience, reliability, efficiency, or access control obtainable by a single unified translation scheme. Furthermore, the separate translation facility must support the integration of components produced by diverse sources, including libraries with conflicting name usages and programs written in other languages. RED satisfies all these requirements.

6.2.1  THE TRANSLATION UNIT

The unit of translation is the capsule. The capsule is ideally suited for this role as it can, with equal ease, be used to define:

a shared pool of data;

a shared routine or library of shared routines;

an abstract data type or library of such types;

an integrated module.

Capsules have the essential properties which facilitate program modularization:

they can be combined with other capsules to form a larger, more powerful capsule;

they can completely control what subset of their capabilities is exported;

they can be written to provide some collection of virtual machine facilities (e.g., abstract data types) independent of their environment.

Elaboration of the main capsules of a system is initiated by some extra-lingual facility (e.g., the operating system, the computer operator, or the power up sequencing hardware). A capsule which is not a main capsule must be referenced from some other capsule in order to be useful. In order for a capsule (or its exported objects) to be referenced, its name must be visible. When two capsules are translated together, name visibility is provided by the standard scope rules. When the capsules are translated separately, the scope rules must be extended to provide the required visibility.

6.2.2  EXTENSIONS OF THE SCOPE CONCEPT

Extending the scope rules can be done in several different ways. The FORTRAN approach has the virtue of simplicity. External routines are declared implicitly by calling them. External data are declared explicitly via COMMON declarations. All bindings are performed by a linker, independently of the language. There is no attempt to perform any type checks. In fact, type mismatches in COMMON declarations are legal and often intentional. Since this approach is clearly incompatible with the Steelman and RED philosophy, it is perhaps interesting to consider the logically opposite extreme.
The system implementation language LIS [CII76] is a strongly typed block-structured language supporting separate translation. The LIS philosophy is that an application is a single (perhaps large) block structured unit. It is possible to translate this whole object or to separately translate any properly nested subpart. This philosophy has a great deal of intuitive appeal. Unfortunately, when all the details are worked out, there are problems. Since the underlying idea is simply to use block structure to provide name access, one would expect there to be few or no rules pertaining specifically to separate translation. In fact, there are a large number of complicated rules. These rules are so complicated that an otherwise very favorable analysis of LIS [KL78] concluded that its separate translation facility would have to be simplified before the language was usable in a production environment. In spite of the language's intentions, an unacceptably large part of the total application must be retranslated whenever a structural change is made (e.g., adding a new routine).

6.2.3  ENVIRONMENT SPECIFICATIONS

The RED language provides a simple mechanism satisfying the requirements for separate translation. Associated with each separately translated capsule is a descriptor containing, among other things, a "template" of the capsule. The template (described in more detail in Section 6.2.5) is automatically generated by the translator when the capsule is translated. The environment of the compilation is determined by which EXTERNAL capsules are exposed. The environment determines which templates are to be visible when the unit is translated. By controlling the use of EXTERNAL environment capsules, project management can completely control access rights between separately translated capsules.

6.2.4  USE OF ENVIRONMENTS

Example 1

Many capsules import nothing. This is particularly true of library capsules. Such an empty environment is created by not exposing any EXTERNAL capsules.

Example 2

Most large projects will define a library of utility routines available to any capsule. Such a library might include the trigonometric functions, some abstract mathematical types (e.g., complex, matrix, vector), and some simple arithmetic functions (e.g., ceiling, floor, absolute value). If each of these groups is provided through a separate capsule, an appropriate environment might be provided by:
      EXPOSE ALL FROM EXTERNAL trig;
      EXPOSE ALL FROM EXTERNAL mathtype;
      EXPOSE ALL FROM EXTERNAL arithfunc;
The separate translation facility provides the effect of a global scope containing all of the separately translated capsules. Whether a capsule is defined locally or separately, it still must be exposed before it is used. Occasionally, several capsules each need their own instance of a separately translated capsule. To implement this, each capsule performs an EXPOSE selecting the NEW option. However, in our example it is more likely that all the capsules should share one instance of the library capsules.
Not selecting the NEW option has the effect of performing an EXPOSE in the scope containing the translation unit. Data are allocated and initialized in that scope and routine names are made available. The capsule being translated may use the routine names and import the data using exactly the same mechanisms as it would if it were embedded in a larger capsule actually containing appropriate expose declarations.
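The difference between sharing one instance and requesting a NEW one can be sketched as follows. This is a Python analogy under stated assumptions, not a description of any RED implementation: expose_instance, counter_capsule, and the dictionary of shared instances are hypothetical names, and the shared dictionary stands in for the scope that contains all translation units.

```python
# Illustrative analogy only: this is Python, not RED.
_shared_instances = {}  # stands in for the scope containing the translation units


def expose_instance(capsule, new=False):
    """Return the exports of a capsule instance.

    new=True  : the caller gets a private instance with its own data.
    new=False : every caller shares the single instance in the outer scope.
    """
    if new:
        return capsule()
    if capsule not in _shared_instances:
        _shared_instances[capsule] = capsule()  # allocate and initialize once
    return _shared_instances[capsule]


def counter_capsule():
    count = 0  # data allocated when the capsule is instantiated

    def bump():
        nonlocal count
        count += 1
        return count

    return {"bump": bump}


user_a = expose_instance(counter_capsule)
user_b = expose_instance(counter_capsule)
private = expose_instance(counter_capsule, new=True)
```

Here user_a and user_b observe the same counter, while private has its own.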

Example 3

The library capsules of Example 2 probably only export routine names. Embedded applications frequently require pools of data to be shared among several separately translated units. For instance, the guidance, navigation, and control software of an airplane might use a shared pool of data defined in capsule gncpool, as well as the library environment. The appropriate environment would be provided by:
      EXPOSE ALL FROM EXTERNAL gncpool;
      EXPOSE ALL FROM EXTERNAL trig;
      EXPOSE ALL FROM EXTERNAL mathtype;
      EXPOSE ALL FROM EXTERNAL arithfunc;

Example 4

When independently produced libraries are combined, name conflicts may arise. For instance, both trig and arithfunc might provide square root routines. If both capsules are exposed in the same scope, a name conflict will arise. RED offers two solutions to this problem.
It is possible that each square root routine is needed. For instance, one might be faster and the other more accurate. In that case, one or both routines must be renamed. For instance:
      EXPOSE ALL FROM EXTERNAL trig RENAMING sqrt TO fastsqrt;
      EXPOSE ALL FROM EXTERNAL arithfunc RENAMING sqrt TO accsqrt;
If the two routines are redundant, the desired alternative is to eliminate one of the square root routines. This is done by limiting the access to one of the capsules:
      EXPOSE sin, cos, tan, arctan FROM EXTERNAL trig;
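The two remedies, renaming and restricting the visible list, can be modeled with plain dictionaries. The sketch below is Python, not RED; expose_names and enter_scope are hypothetical helpers, and the contents of the two sample capsules are invented for illustration (both sqrt entries simply point at Python's math.sqrt).

```python
# Illustrative analogy only: this is Python, not RED.
import math


def expose_names(exports, names=None, renaming=None):
    """Bindings introduced by one exposure.

    names=None selects everything exported (ALL); otherwise only the listed names.
    renaming maps an exported name to the name it will have in the using scope.
    """
    renaming = renaming or {}
    selected = exports if names is None else {n: exports[n] for n in names}
    return {renaming.get(n, n): obj for n, obj in selected.items()}


def enter_scope(*exposures):
    """Combine several exposures into one scope, rejecting duplicate names."""
    scope = {}
    for bindings in exposures:
        clashes = scope.keys() & bindings.keys()
        if clashes:
            raise NameError(f"name conflict: {sorted(clashes)}")
        scope.update(bindings)
    return scope


# Invented sample capsules; both happen to export sqrt.
trig = {"sin": math.sin, "cos": math.cos, "tan": math.tan,
        "arctan": math.atan, "sqrt": math.sqrt}
arithfunc = {"sqrt": math.sqrt, "floor": math.floor, "ceiling": math.ceil}

# Remedy 1: keep both routines under different names.
both = enter_scope(
    expose_names(trig, renaming={"sqrt": "fastsqrt"}),
    expose_names(arithfunc, renaming={"sqrt": "accsqrt"}),
)

# Remedy 2: restrict what is taken from trig so only one sqrt is visible.
one = enter_scope(
    expose_names(trig, names=["sin", "cos", "tan", "arctan"]),
    expose_names(arithfunc),
)
```

Exposing both capsules in full, with no renaming, makes enter_scope raise NameError, which corresponds to the conflict described in this example.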

Example 5

The main purpose of the visible list in a capsule invocation is to restrict a particular translation unit's power rather than to resolve name conflicts. Suppose a mission contains a capsule for logging events. That capsule might export routines for reading the date and time of day, for rewinding the log, for positioning the log, for modifying a log entry, and for appending an entry to the end of the log.
Most applications will require only a subset of these capabilities and project management may well wish to guarantee that the log is not accessed improperly. This may be accomplished through the definition of the following capsule:
      CAPSULE log_subset EXPORTS ALL;
      EXPOSE date, time, append FROM EXTERNAL log;
      END CAPSULE log_subset;
If the users are informed only of the existence of log_subset, then improper access to the log capsule is prevented.

6.2.5  IMPLEMENTATION ISSUES OF SEPARATE TRANSLATION

The RED separate translation facility provides a convenient and flexible facility for developing a large program as a collection of separately translated modules. Of course separate translation raises several problems other than name visibility. These issues are addressed in the following paragraphs.

Version Skew

If two capsules are translated together, with capsule1 using resources exported by capsule2, the translator can guarantee that all interfacing between the capsules is performed correctly. If the two capsules are translated separately, then translating capsule2 automatically generates a template. Capsule2 must be EXPOSEd EXTERNAL in capsule1. The translator is led, by capsule1's EXPOSE, to read the template for capsule2 and can thus once again guarantee that all templates are correct. The potential problem arises when capsule2 is modified. In the unified translation case, translating capsule2 implies also translating capsule1 -- the consistency checks are made every time. In the separate translation case, the templates used in capsule1 are not rechecked. A mechanism must be provided which guarantees that all the template information in a given program is consistent. The various component modules of a complete program are brought together by the linker. The linker is thus the appropriate place to check for consistency and these consistency checks can be reduced to simply checking version numbers supplied by the translator.
Whenever a capsule is translated, the translator generates a new template. This template is compared to the old one. If the two are identical, no changes are made. If there is any difference, the new template is installed in the descriptor and the version number is incremented. Whenever a descriptor is used as part of an environment, the version number of the imported template is recorded as part of the descriptor of the capsule being translated. At link time, the version numbers of the capsules being linked must match the version numbers specified in the importer's descriptors. If a mismatch occurs, the linker reports a version skew error. Some implementations may ease this problem by generating lists of capsules with template mismatches or automatically retranslating importers whenever a template is changed. Such facilities are convenience factors -- the linker check is all that is necessary.
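The version-number scheme just described is simple enough to sketch. The following Python model is an assumption-laden illustration, not the RED implementation: the Descriptor fields, the use of a string as the template, and the functions translate and link are all hypothetical stand-ins.

```python
# Illustrative model only; names and data layout are assumptions.
from dataclasses import dataclass, field


@dataclass
class Descriptor:
    name: str
    template: str = ""   # stand-in for the interface description
    version: int = 0
    # Imported capsule name -> version number seen when this capsule was translated.
    imports: dict = field(default_factory=dict)


def translate(desc, new_template, environment=()):
    """Translate a capsule: bump its version only if its template changed,
    and record the versions of the templates it imported."""
    if new_template != desc.template:
        desc.template = new_template
        desc.version += 1
    desc.imports = {d.name: d.version for d in environment}


def link(descriptors):
    """Return a list of version skew errors as
    (importer, imported, version recorded, version found)."""
    by_name = {d.name: d for d in descriptors}
    errors = []
    for d in descriptors:
        for name, seen in d.imports.items():
            found = by_name[name].version
            if found != seen:
                errors.append((d.name, name, seen, found))
    return errors


capsule2 = Descriptor("capsule2")
capsule1 = Descriptor("capsule1")
translate(capsule2, "PROC p(INT)")
translate(capsule1, "main", environment=[capsule2])
clean = link([capsule1, capsule2])        # consistent

translate(capsule2, "PROC p(INT)")        # internal change only: template identical
still_clean = link([capsule1, capsule2])

translate(capsule2, "PROC p(FLOAT)")      # interface change: new template, new version
skewed = link([capsule1, capsule2])       # capsule1 was translated against the old one
```

Retranslating capsule1 against the new template clears the error, which is the manual remedy the linker check is meant to prompt.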

Interfacing to Foreign Code

The template part of a capsule's descriptor contains all the information necessary to use the capsule. The information required is implementation dependent. A routine's template might simply specify the types and binding classes of its parameters. On the other hand, highly optimizing translators may use different parameter passing conventions depending on user specified optimization criteria and the use of a particular parameter. If there are options, the particular conventions to be used must be specified. When linking to programs written in another language the RED translator must know the parameter passing conventions. An implementation is free to choose whatever mechanism seems appropriate. If there are only a few other high level languages involved, the natural scheme would be to incorporate knowledge about the other language processors' conventions into the RED translator. If many hand crafted templates must be met, a template will have to specify detailed calling sequence information. In any case, this is information provided by project management in the template. The fact that a capsule is built out of foreign code is nowhere reflected in the RED source code. If the foreign program is subsequently rewritten in RED, all that is required is a retranslation of the callers.
The RED translator must be able to write a template description. By restricting programmers' access to other template writing programs, project management can effectively enforce control over the use of foreign code.

Inline and Generic Routines

The template part of a capsule's descriptor must contain all the information necessary to translate uses of any exported object. For routines, this information is typically limited to the types and binding classes of the parameters. Details about the inner workings of the routine are unnecessary because they are contained in the routine's object code and the linker will make the necessary connection between the invocation and the routine.
Routines which are to be expanded inline or which have generic parameters cannot be completely translated at their point of declaration. Only at the point of invocation does enough information become available to complete the translation of the routine. This implies that the template must contain a translatable representation of the routine. The form of representation is totally implementation dependent. Certainly, the complete source text of the capsule's declarations and the body of the routine are sufficient. The appropriate division of labor between declaration time processing and invocation time processing depends upon the expected number and complexity of invocations.

Mutual Recursion

Although the capability is not required by Steelman, there are many cases where it is appropriate to separately translate mutually recursive routines. For instance, in a translator using a recursive descent parser, almost all routines are mutually recursive. Nonetheless, the translator can be logically broken up into modules.
The separate translation of capsules containing mutually recursive routines poses no problems to the RED programmer. The key is to first provide appropriate templates, either by translating capsules containing stubs for the mutually recursive routines or by using the global translate facility (described below). Development of the routines can then continue independently with changes in one of the routines occasionally requiring translation of the other capsule, just as in the non-recursive situation. It is possible that making massive internal changes, such as replacing a stub by the actual routine, may modify a template. This is particularly likely in optimizing translators. When such a template change occurs, the other capsules must be retranslated to reflect the changes. This process is inherently stable and converges in a small number (usually 1) of iterations.

Efficiency

Efficiency is a major concern in embedded applications. Although separate translations and convenient library facilities are attractive, it is essential that no run-time price be paid for them. The two possible sources of inefficiency are 1) degradation of the quality of the object code and 2) the loading of unused routines. RED suffers from neither of these problems.
It should be kept in mind that some of the best object code now produced is produced by FORTRAN translators which translate every routine independently of all others. The present state of the art in optimization does not avail itself of information about a routine's calls. In fact, highly optimizing translators usually go to great lengths to translate routines before translating the callers so that information about the routine's operation can be used to optimize the caller. A RED translator can put any useful information about a routine into the template, thus enabling it to perform exactly the same optimizations as in the unified translation. For generic and inline routines the required amount of information is certainly large. This information is required, however, simply to support the separate translation of these routines. No additional information is required to support optimization. It is likely that in the near future improved optimization techniques will be developed that use information about the caller. For instance, practical algorithms for automatically deciding upon inline vs. out-of-line expansion may be developed. The GLOBAL TRANSLATE facility described below supports even this extreme level of optimization at some additional translation-time expense.
Not infrequently, changes are made in several separately translated but interrelated capsules. Retranslating the capsules separately has several disadvantages:

if templates change, care must be taken to translate the capsules in the right order;

substantial calendar time and staff time may be wasted because some translation requests cannot be submitted until others have been completed;

substantial computer time is wasted because the translator must redo substantial analysis for each separate module.
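The ordering problem in the first item above amounts to a topological sort over the "reads the template of" relation. A minimal Python sketch follows, using the standard library's graphlib module (Python 3.9 or later); translation_order and the sample capsule name gnc are hypothetical, and the import relation shown is invented for illustration.

```python
# Illustrative sketch only; the capsule graph below is invented.
from graphlib import TopologicalSorter, CycleError


def translation_order(imports):
    """imports maps each capsule to the set of capsules whose templates it reads.
    Returns an order in which every capsule follows the capsules it imports."""
    return list(TopologicalSorter(imports).static_order())


imports = {
    "gnc": {"gncpool", "trig"},
    "trig": {"arithfunc"},
    "gncpool": set(),
    "arithfunc": set(),
}
order = translation_order(imports)

# Mutually recursive capsules have no valid separate order.
try:
    translation_order({"parser": {"lexer"}, "lexer": {"parser"}})
    cyclic = False
except CycleError:
    cyclic = True
```

The cyclic case is where separate translation needs stubs to bootstrap the templates, or a facility that takes template information from the current translation.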

The GLOBAL TRANSLATE facility of a RED translator enables the user to translate any subset (or all) of the capsules in a library. The translations are performed as if all the translated capsules were enclosed in one large capsule exporting all of the individual capsules, but several advantages accrue:

template information for capsules being translated is taken from the current translation, thereby eliminating the ordering problems;

templates need to be interpreted only once regardless of the number of capsules requiring them;

the performance of the translator can be optimized by eliminating multiple loadings of the same phases;

the object code of routines can be optimized using knowledge about the caller.

In present-day systems, routines are not loaded unless they are required. This is equally true in RED. Although a capsule will often contain several routines, the translator can generate a separate object module for each routine. When the linker binds object modules into a load module, it need only link in object modules which are called by some already required routine. If a capsule contains a library of routines, only those routines actually required by a particular application are linked in. Once a routine has been linked in, subsequent requests for the same routine are always linked to the already linked copy. It is likely that a library will contain several instantiations of a generic routine; nonetheless, the linker will only link in the instantiations actually used.
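The selective linking described here is a reachability computation over the call relation between object modules. The sketch below is a Python illustration under assumed data: link_closure is a hypothetical name, and the call table is invented.

```python
# Illustrative sketch only; the call table below is invented.
def link_closure(calls, roots):
    """Return the set of object modules that must be linked.

    calls maps a routine to the routines it calls; roots are the routines
    required from the start (e.g. the main capsule's entry points)."""
    needed = set()
    pending = list(roots)
    while pending:
        routine = pending.pop()
        if routine in needed:
            continue  # already linked: later requests reuse the same copy
        needed.add(routine)
        pending.extend(calls.get(routine, ()))
    return needed


calls = {
    "main": ["sin", "append"],
    "sin": ["sqrt"],
    "cos": ["sqrt"],     # present in the library but never called from main
    "sqrt": [],
    "append": [],
}
loaded = link_closure(calls, ["main"])
```

In this example cos stays out of the load module even though it lives in the same library capsule as sin, and sqrt is linked once no matter how many routines request it.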
