
3.1.1 Requirements for schema development


Academic year: 2023



Chapter 3

Schema Modelling and Development

The definition of the schema for a particular domain is vital to the success of computerised projects in the A/E/C arena, whether it be the smallest application or the largest ISO standard. In all except the smallest of projects, schema development is carried out by a group of domain experts, each contributing to the part of the schema representing their areas of expertise. Development of the schema needs to be coordinated by one or more individuals who must ensure that the current schema version is passed through to all developers and that all input to the final schema is considered and acted upon in some systematic manner. This introduces a requirement for coordinated management and documentation of the schema development.

3.1 Introduction

The whole schema development process, unless closely managed, is open to sources of error and conflict. The current version of a schema must be propagated to all developers so that they are all considering the same schema. Modifications that are made from one version of a schema to the next need to be documented so that differences between versions are easy to identify (especially for very large schemas). Change requests and additions to schemas sent by developers all need to be documented and the action taken on them recorded so that developers know that their suggestions have been heard, and, if rejected, they know the reasoning behind the rejection.

Multiple conflicting requirements between developers need to be negotiated to a final settlement which satisfies the majority of developers. Electronic management and design tools have the potential to solve all of these problems.


However, existing software tools to help users through this process are limited in the scope of the problem that they can tackle (for example, see Vogel 1991; Luijten 1992; Poyet et al. 1990; Boyle and Watson 1993). Some tools assist in the development of a schema, while others check it for consistency, translate it to an implementation language, or allow an instance of a model to be perused. These tools do not tackle the problems introduced by multiple developers, and, where there are multiple tools to handle the process, they do not allow all of the tools to be used together in an interactive and non-deterministic manner. To solve these problems, the author has designed an integrated modelling and development environment which provides many functions to users in a homogeneous environment.

In this chapter the requirements for a modelling environment in a large modelling project are introduced. The EXPRESS Programming Environment (EPE), which tackles these requirements, is detailed, and its ability to meet the design requirements of a modelling environment is demonstrated.

3.1.1 Requirements for schema development

While the introduction above concentrates on requirements for the development of an IDM, many other schemas also need to be modelled in the development of an integrated design system, as discussed in Section 2.2 of this thesis. Developing DT and user schemas can be a task of similar size and complexity to developing the IDM. It can require a similar number of domain experts, and it produces the same set of problems as discussed for the IDM schema development. Developing schemas for DTs and users also adds a dimension to the management problems specified for the IDM, as there is an association between these schemas and the IDM. Therefore, changes in these schemas (particularly in the user schemas) must be matched to the portions of the IDM that were influenced by the original schema specification.

In any large scale multi-partner schema development project there are a number of modelling support issues which need to be dealt with. For example, COMBINE included the following:

• Communication and integration of DT schemas: the development of the building schema in COMBINE is based on a cyclic development process where DT teams supply the data view of their DT in the form of an EXPRESS schema (referred to as an aspect model). These DT schemas are taken as input by the central IDM development team for constructing a common integrated schema. This process employs a combination of top-down structuring and bottom-up expansion of the growing IDM. Resulting drafts of the IDM are subsequently distributed to the DT teams to inspect and refine. This process of iterative refinement may continue for many months and during this time all partners concerned in this process must be kept up to date with the changing schemas.

• Documentation: during the schema generation process, the communication of schemas and schema constraints involves an intensive interaction between various teams in the project. All decisions and justifications for decisions must be permanently documented as they are made.

• Mapping definitions: it is through the definition of the IDM to DT schema mappings that the consistency and adequacy of the IDM is totally checked by its clients (those who will use the data in the IDM). The mappings determined by the DT schema teams need to be defined formally, as these mappings describe the required interface between the DT and the IDM.

• Support for multiple modelling paradigms: the modelling paradigms EXPRESS, EXPRESS-G, NIAM, IDEF1X and IDEF0 are well suited to different types of modelling. An ideal environment for COMBINE would support the specification of a schema using multiple paradigms.

• Support for multiple views: the integration process is very hard to accomplish systematically, let alone automate (the required "meta modelling knowledge" is lacking), so it remains a tedious and non-deterministic process to combine multiple views into one coherent whole. To support this integration process the modelling environment should provide methods to enable views to be integrated into a central schema and to document the source of entities and structures in the integrated schema.

To support these requirements an ideal modelling support environment (MSE) needs mechanisms for: easy communication of schemas; generation and manipulation of multiple views of a schema in a variety of formalisms; annotation of the schema with documentation; and flexible update management to support iterative updates of the schema. Other important features would be support for the definition of inter-schema mappings and for integration of schemas, although it is recognised that, owing to the strong creative element in these tasks, automated tools are not, as yet, achievable. Another important issue is the ability to rapidly prototype a resulting operational system. MSEs should provide facilities to allow instantiation of schemas to be tested in a run time environment, e.g., to test mapping specifications and enhance the understanding of the underlying schema.

3.1.2 Schema specification languages

There is a plethora of languages available for modelling schemas. While there are overlaps between the expressive ability of these languages, there are also many differences in what can be expressed, in some cases due to the domain the language was developed for and in others due to the structure of the language. In the development of schemas the most commonly used languages fall into one of the following three categories: relational database schema definition, such as ER (Chen 1976), NIAM (Nijssen and Halpin 1989), IDEF1X (General Electric 1985); object-oriented schema definition, such as EER (Gogolla 1994), EXPRESS (ISO/TC184 1992); and data flow definition, such as DFD (Stevens et al. 1974), IDEF0 (Mayer 1990). All common modelling languages have graphical notations for defining schemas, and some languages also have a textual



In the ISO-STEP development there has been great pressure to use a single language in the development of schemas for the standard. Existing languages were felt not to be powerful or complete enough for the requirements of the STEP standard, leading to a decision to develop a custom modelling language. The resultant language, EXPRESS, and its graphical notation, EXPRESS-G, provide a similar coverage to that offered by other object-oriented specification languages, although without all of the standard object-oriented notions, such as class method definition. This lack is being rectified in the latest version, EXPRESS-V2.

Due to the adoption of EXPRESS as the standard language in the development of STEP, many developers and integrated design system researchers have taken on EXPRESS as their modelling language to maintain easy compatibility with the results of the STEP standard. The problems they feel that they avoid through the use of EXPRESS are mainly the difficulties of moving schema information from one modelling language to another without loss of information in the transfer. In some cases these problems arise due to a mismatch between the modelling capabilities of the different languages, but in others the problem is due solely to poorly written translators. Due to this, EXPRESS is being used in a large number of projects and standardisation efforts faced with large modelling tasks in domains covered by the STEP standard (for example, ISO/TC184 1993; Gielingh and Suhm 1992; ATLAS 1993). Environments for modelling schemas, both commercial and research developments, are discussed in Section

3.1.3 The EXPRESS and EXPRESS-G languages

A brief introduction to the EXPRESS and EXPRESS-G languages is provided here, to allow readers unfamiliar with the notation to understand the figures presented in this chapter. Many aspects of the EXPRESS-G notation are not covered in the following example as they do not appear in any of the diagrams in this chapter. A full specification of EXPRESS and EXPRESS-G can be found in the ISO standard (ISO/TC184 1992). For those interested in comparisons between graphical notations, the diagram in Figure 3.1 provides the EXPRESS-G view of the same example in Snart graphical notation presented in Figure 1.3.

Figure 3.1 shows four classes (abstract_parent, child1, child2, and related_class). A class (called an ENTITY in EXPRESS) is represented by a solid-lined rectangle with no other graphical embellishments. The name of the class appears inside the rectangle, and can be preceded by the keyword (ABS) to denote an abstract class (see the class abstract_parent). Attributes and relationships of a class are shown by single thickness lines drawn between a class and the type of the attribute or relationship. These lines are labelled with the attribute or relationship name and a small circle attaches the line to the type definition. For example, the class child2 has attributes att1, att2, and att3 of type string, real, and integer respectively. The class child2 also has a relationship to the class related_class through the named relationship rel_name. EXPRESS-G also allows aggregating types to be denoted in the diagram, as can be seen for the class child1. The attributes att1, att2, and att3 are of aggregating types set, bag, and list respectively (denoted by an S, B, or L after the attribute name). As can be seen in Figure 3.1, these aggregated types can also show their bounds, if they are specified. The attribute att1 has no specified bounds on the set, att2 has a lower bound of 1 for the bag but no upper bound, and the list for att3 is constrained to hold between 1 and 10 values. Inheritance between classes is shown with a thick line connecting two classes. This line terminates with a circle at the connection with the child class. In Figure 3.1 child1 and child2 both inherit from abstract_parent. Complex inheritance relationships can be defined in EXPRESS, though these are not shown in any of the diagrams in this chapter. However, due to this potential complexity, the EXPRESS class descriptions always show both supertypes and subtypes for each class.

Figure 3.1 An example of the EXPRESS-G notation
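The structure described for Figure 3.1 can also be written down as data and rendered as EXPRESS text. The following is only a minimal sketch in Python, not part of EPE: the class names `Attribute` and `Entity` are hypothetical, the element types of the aggregates in child1 (STRING, REAL, INTEGER) are assumed because the text does not give them, and the `ONEOF` supertype constraint is likewise an assumption about how the two subtypes relate.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Attribute:
    name: str
    type_name: str                   # simple type or entity name
    aggregate: Optional[str] = None  # "SET", "BAG", "LIST", or None
    lower: Optional[int] = None      # None: no bounds were specified
    upper: Optional[int] = None      # None with a lower bound: unbounded ("?")

    def to_express(self) -> str:
        if self.aggregate is None:
            return f"{self.name} : {self.type_name};"
        bounds = ""
        if self.lower is not None:   # bounds are emitted only if specified
            top = "?" if self.upper is None else str(self.upper)
            bounds = f" [{self.lower}:{top}]"
        return f"{self.name} : {self.aggregate}{bounds} OF {self.type_name};"


@dataclass
class Entity:
    name: str
    abstract: bool = False
    supertype: Optional[str] = None
    subtypes: List[str] = field(default_factory=list)
    attributes: List[Attribute] = field(default_factory=list)

    def to_express(self) -> str:
        header = f"ENTITY {self.name}"
        if self.abstract or self.subtypes:
            clause = "ABSTRACT SUPERTYPE" if self.abstract else "SUPERTYPE"
            if self.subtypes:
                # ONEOF is an assumption made for this illustration.
                clause += f" OF (ONEOF ({', '.join(self.subtypes)}))"
            header += f"\n  {clause}"
        if self.supertype:
            header += f"\n  SUBTYPE OF ({self.supertype})"
        lines = [header + ";"]
        lines += [f"  {a.to_express()}" for a in self.attributes]
        lines.append("END_ENTITY;")
        return "\n".join(lines)


# The four classes of Figure 3.1; aggregate element types are assumed.
schema = [
    Entity("abstract_parent", abstract=True, subtypes=["child1", "child2"]),
    Entity("child1", supertype="abstract_parent", attributes=[
        Attribute("att1", "STRING", "SET"),
        Attribute("att2", "REAL", "BAG", lower=1),
        Attribute("att3", "INTEGER", "LIST", lower=1, upper=10),
    ]),
    Entity("child2", supertype="abstract_parent", attributes=[
        Attribute("att1", "STRING"),
        Attribute("att2", "REAL"),
        Attribute("att3", "INTEGER"),
        Attribute("rel_name", "related_class"),
    ]),
    Entity("related_class"),
]

express_text = "\n\n".join(entity.to_express() for entity in schema)
print(express_text)
```

Printing `express_text` gives one ENTITY block per class, with the bag and list bounds of child1 appearing in square brackets after the aggregate keyword.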

3.2 Schema Development in the EPE Environment

The EXPRESS Programming Environment (EPE) has been engineered to support the modelling requirements detailed above for schema evolution and management. In EPE this equates to multiple graphical and textual views of varying degrees of complexity at different stages of the project. During analysis, simple graphical views, which embody high level concepts and relationships between them, are mapped out and manipulated. During early design, these simple graphical representations are fleshed out: constraints specified; attributes of entities added; and inheritance hierarchies fully specified. During late design, more detailed information becomes available which is often best manipulated in free form textual representations of portions of the schema. During implementation, the developed schema is compiled, checked for syntax errors, and detailed models are loaded and checked for consistency. Throughout the iteration of these stages and during maintenance, modifications can be made at any of the levels described above and must be propagated to all dependent stages. EPE provides integrated support for each of these activities, using the MViews consistency mechanism to provide the required inter-view consistency.

3.2.1 Functionality offered by the EPE environment

This section describes the EPE tools and view types available at each stage of development, and the methods of keeping views consistent with one another. These are illustrated using an example from the initial IDM of COMBINE. The COMBINE IDM comprises around 400 entities and 600 relationships; only a small portion, associated with technical systems, is shown here. The placement of technical systems in relationship to the building class in the IDM is shown in Figure 3.2.

Figure 3.2 Use of technical_system in the COMBINE IDM

Analysis

Figure 3.3 shows the types of graphical views commonly used at the analysis stage. Two analysis views show portions of the inheritance tree for the technical_system entity (those entities dealing with ventilation and air conditioning) specified using EXPRESS-G (thick lines represent inheritance links in EXPRESS-G). Each view is constructed by direct manipulation using tools selected from a tool palette, to the left of the view.

The user is free to create as many views as is desired, and may freely lay out and populate each view, either with new information or with information entered into other views. The information in each view is mapped through to the canonical representation of the schema as the view data is entered, and any similarities or conflicts with the existing data are resolved as it is created. This ability to construct multiple views permits both general purpose and specialised views to be constructed. The former may be used to obtain an overview of the system under construction, the latter to focus on more detailed parts of the system. The proliferation of views means that navigation tools are needed to quickly access desired information. EPE provides inter-view navigation using both menu-based search facilities and automatically constructed hypertext links.

Figure 3.3 Two high-level inheritance specifications for technical_system

Design

Attributes are often specified at the design stage, as shown for the technical_system example in Figure 3.4. This can be done, as shown in this figure, with all entities in one graphical view, or by using two or more views. For example, the attributes of basic types may be presented in a separate view to those which define relationships to other entities, thus adding clarity. Again, there is no limit to the number of design views that can be constructed, and the hypertext navigation facilities are available to navigate both between design views, and between analysis and design views. As in the analysis views, all information in the design view of Figure 3.4 is mapped back to the canonical representation of the schema and all dependent views made consistent with its contents.


Figure 3.4 Design stage specification of attributes of an entity

Late design

At the late design stage more detailed information may need to be entered. This necessitates a textual representation of the entity in the EXPRESS language, as EXPRESS-G represents only a subset of what can be modelled in EXPRESS. For example, UNIQUE clauses, WHERE clauses, rules, and type information have no EXPRESS-G representation. Figure 3.5 shows an EXPRESS textual view which has been generated from the canonical information on the entity. This view encompasses all information found in all the graphical views which define the technical_system entity, such as the information in Figures 3.2 and 3.3. This textual view is editable, with the user free to make changes to any parts of the textual description in the view. Modified textual views are parsed and compiled to ensure they represent valid EXPRESS descriptions, and their information passed back to the canonical representation.

Figure 3.5 Textual view derived from graphical views of an entity

Consistency between views

When changes are made to an EPE view, other views that share information with the updated view may become inconsistent and must be updated to keep the schema consistent across all views. All views affected by a change are notified and, in many cases, are automatically modified to reflect the change. This consistency mechanism works both between views of one phase of development and between views of different development phases.

As an example, Figure 3.6 shows a modification being made to a graphical design view. The modification tightens a constraint on an entity's value; the lower bound on the number of values in the SET definition is now known to be 1 and is entered in the definition. The information entered in this manner is checked as to whether it is valid EXPRESS syntax before being accepted and allowed to modify the schema being developed. The change is propagated through to the canonical form of the schema which is updated, then all dependent views are identified and notified of the change which has been made.

EPE propagates the change to the other affected views in the form of an update_record (described in Section 3.2.2). This record provides a complete description of any single change. How views react upon receipt of an update_record depends on both the view type and the nature of the change.

In the design view the modification updates the graphical representation of the design view according to the definition of EXPRESS-G syntax, as can be observed in the graphical design view at the rear of Figure 3.7. The modification is not propagated through to the analysis views, as the modified attribute is not seen in these high level views. However, the attribute does appear in the textual late design view and it must be updated to be kept consistent.

Figure 3.6 Constraining the cardinality of an attribute at late design stage


Figure 3.7 The propagation of an update_record to a dependent textual view

In the EPE system update_records propagated to textual views are not applied automatically, although many of them could be. Instead the update_records are displayed in the view and the user has control over which updates are applied at which time. As can be seen in Figure 3.7, the graphical update to the technical_system entity generates an update_record in the entity's textual view. If the user instructs EPE to apply the update, the resulting view of Figure 3.8 is generated, where the notification of outstanding updates on this view has been removed, and the attribute definition has been automatically rewritten.

Figure 3.8 The automatic application of an update in a textual view
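The behaviour described above, where a textual view displays incoming update_records and applies them only on request, can be sketched as follows. This is an illustrative Python model rather than EPE code: the names `UpdateRecord` and `TextualView`, the attribute `parts`, and the idea of representing an automatable change as an old/new text pair are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UpdateRecord:
    entity: str
    description: str    # text shown to the user in the view
    old_text: str = ""  # empty when the change cannot be applied automatically
    new_text: str = ""

    @property
    def automatic(self) -> bool:
        return bool(self.old_text)


@dataclass
class TextualView:
    entity: str
    text: str
    pending: List[UpdateRecord] = field(default_factory=list)

    def receive(self, record: UpdateRecord) -> None:
        """Queue a record for this entity; the text is not touched yet."""
        if record.entity == self.entity:
            self.pending.append(record)

    def apply(self, index: int) -> bool:
        """Apply one pending record on user request, if it is automatable."""
        record = self.pending[index]
        if not record.automatic or record.old_text not in self.text:
            return False  # stays pending for manual resolution
        self.text = self.text.replace(record.old_text, record.new_text, 1)
        del self.pending[index]
        return True


# Hypothetical entity text; the real attribute in Figure 3.6 is not named here.
view = TextualView(
    "technical_system",
    "ENTITY technical_system;\n  parts : SET OF part;\nEND_ENTITY;",
)
view.receive(UpdateRecord(
    "technical_system", "lower bound of parts set to 1",
    old_text="parts : SET OF part;", new_text="parts : SET [1:?] OF part;",
))
view.receive(UpdateRecord("technical_system", "parts marked as constrained"))

applied = view.apply(0)  # the cardinality change is rewritten in the text
manual = view.apply(0)   # the constraint flag has no automatic rewrite
```

In this sketch the second record remains in `view.pending` after the attempt to apply it, which mirrors a change description that stays visible until the user edits the text by hand.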


This strategy of allowing the user to apply updates to textual views exists to handle the problems caused by the specification of a constraint on an attribute or entity in EXPRESS-G. For example, when the user specifies that an attribute is constrained (see the attribute dialogue box in Figure 3.6) the user is denoting that the attribute takes part in either a unique clause or in a where rule.

However, there is no way in EXPRESS-G to specify which one the attribute takes part in, or in what form. Therefore, an update of this form cannot be automatically applied to a textual view and must be manually implemented by the user. The change description thus serves to specify an inconsistency that requires manual resolution.

Documentation

In addition to providing a consistency mechanism between views, update_records are retained in a persistent form in the EPE system. An update_record browser and editor gives the user the ability to browse the changes that have occurred to an entity in the evolution of the schema and add further documentation to each update_record. In this manner a portion of the documentation of the history of development of the system is automatically built up as work progresses. Having this update history on-line also allows system developers to trace back through previous design decisions while entities are further refined.

Figure 3.9 The persistent update_record viewer with documentation facility

Figure 3.9 shows the update_record browser displaying a list of changes that have been made to the technical_system entity based on the update_records generated by the changes in various views. They include a renaming of the entity, changing the entity from an abstract supertype to a normal entity and the cardinality constraint imposed on the SET declaration of an attribute. The full details of the modification to the attribute are displayed in the top window, highlighting the comment field that can be filled in by the system developer.

Other documentation support in EPE includes the ability to create textual documentation views (accessible via the hypertext navigation facilities) for entities. In such views, the various experts working on a schema can document the reasoning behind decisions made and other information relevant to a particular entity and its attributes in a central and managed fashion. A useful feature of documentation views is that update_records relating to the entity are automatically added as textual comments to the view as the entity changes (see Figure 3.10 for a documentation view taken at an early stage of the schema specification).

Figure 3.10 A textual documentation view
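A persistent change history that developers can annotate, and that feeds a documentation view, might be modelled along the following lines. This Python sketch is an assumption-laden illustration: `LoggedUpdate` and `UpdateHistory` are hypothetical names, the change descriptions paraphrase the three changes listed for Figure 3.9, and rendering each record as an EXPRESS `(* ... *)` remark is one plausible reading of "textual comments", not a statement of what EPE emits.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class LoggedUpdate:
    entity: str
    change: str
    comment: str = ""  # documentation added afterwards by a developer


class UpdateHistory:
    """Keeps every update per entity, in the order the changes were made."""

    def __init__(self) -> None:
        self._by_entity: Dict[str, List[LoggedUpdate]] = {}

    def record(self, entity: str, change: str) -> LoggedUpdate:
        update = LoggedUpdate(entity, change)
        self._by_entity.setdefault(entity, []).append(update)
        return update

    def changes(self, entity: str) -> List[LoggedUpdate]:
        return list(self._by_entity.get(entity, []))

    def annotate(self, entity: str, index: int, comment: str) -> None:
        self._by_entity[entity][index].comment = comment

    def documentation(self, entity: str) -> str:
        """Render the history as remarks for a documentation view."""
        lines = []
        for update in self.changes(entity):
            note = f" ({update.comment})" if update.comment else ""
            lines.append(f"(* {update.change}{note} *)")
        return "\n".join(lines)


history = UpdateHistory()
history.record("technical_system", "entity renamed")
history.record("technical_system", "changed from abstract supertype to normal entity")
history.record("technical_system", "lower bound 1 set on SET attribute")
history.annotate("technical_system", 2, "at least one value is always required")

doc_text = history.documentation("technical_system")
```

The comment passed to `annotate` is invented for the example; in practice it would hold the developer's reasoning for the change.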

Figure 3.11 The view navigator invoked for the technical_system entity

View navigation

As can be seen from the preceding example, the EPE modelling environment captures a large amount of information about a schema. This includes many graphical and textual views specifying various properties of the schema, entities, and documentation of changes made to the schema. In large systems this can lead to problems in finding a particular view or detailed information that has been entered. In the EPE environment the views and entities themselves act as a navigation and search facility for the project. Mouse clicks on graphical view components allow rapid access to other views containing that component. Figure 3.11 shows an example of this process for the technical_system entity. After clicking on the technical_system icon in the graphical view a list of all the graphical and textual views that index the technical_system is displayed and the user can navigate to these views in a hypertext-like fashion.
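The navigation facility described above amounts to an index from each entity to the views that contain it. A minimal Python sketch of such an index is given below; the class `ViewIndex` and the view names are hypothetical and chosen only to make the example readable.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


class ViewIndex:
    """Maps an entity name to the names of all views that show it."""

    def __init__(self) -> None:
        self._views_for: Dict[str, Set[str]] = defaultdict(set)

    def register(self, view_name: str, entities: Iterable[str]) -> None:
        for entity in entities:
            self._views_for[entity].add(view_name)

    def views_containing(self, entity: str) -> List[str]:
        # .get avoids creating an empty entry for unknown entities
        return sorted(self._views_for.get(entity, set()))


index = ViewIndex()
index.register("building overview (graphical)", ["building", "technical_system"])
index.register("technical_system attributes (graphical)", ["technical_system"])
index.register("technical_system (textual)", ["technical_system"])

targets = index.views_containing("technical_system")
```

Clicking an entity icon would then correspond to calling `views_containing` and presenting the returned list for the user to choose from.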

3.2.2 Using the EPE environment

The EPE system offers two types of display views to its users. The first type of display view is a graphical view which can contain any EXPRESS-G specification of a schema (see Figure 3.2 for an example of a graphical view). In the graphical view the user has a palette of tools available, as seen on the left hand side of the view. These tools offer the following functionality.

This tool is used to specify an entity in a graphical view. Clicking in an empty portion of the drawing window after selecting this tool will bring up an entity dialogue requesting the name of the entity and its type (abstract or normal). If a new entity name is specified it will be added to the canonical representation of the schema. If the name of an existing entity is specified then the icon will be connected through to the canonical form of that entity. If an existing entity icon is selected while this tool is current then the name dialogue is retrieved, allowing the entity information to be modified (any modification will be seen in all views which reference the entity).

This tool is used to specify inheritance between entities in a graphical view. To specify an inheritance link the user clicks on an entity icon which is the super-class and drags to the entity icon of the entity which inherits from it.

This tool allows the specification of attributes for an entity. To describe an attribute the user clicks in the entity to which the attribute should be attached, and drags out to the position the attribute icon should occupy. This invokes an attribute dialogue to specify the name and type information for the attribute (an example of this dialogue is shown in Figure 3.6). If a new attribute name is specified for the entity then it is added to the canonical representation of the entity. If the name of an existing attribute is specified then the icon is connected through to the canonical form of the attribute. If an existing attribute is selected while this tool is current then the attribute dialogue is retrieved, allowing the attribute information to be modified (any modification is seen in all views which reference the attribute).

This tool is used to create a new graphical view of the specified entity. When an entity is selected the user is asked for the name of the view to create and a new graphical window with the entity will be created. If several entities and attributes are selected when this tool is used then the user has the option of copying all selected icons to the new window.


This tool hides an attribute icon, or an inheritance link, or an entity and all its attachments (i.e., attributes and inheritance connections). The hidden item is not removed from the canonical representation of the schema, merely hidden in the current graphical view.

This tool deletes an attribute, or an inheritance link, or an entity and all its attachments (i.e., attributes and inheritance connections). The items are removed from the canonical representation of the schema as well as from all views which contain the item.

This tool allows icons in the graphical view to be selected and repositioned in the current view. Double clicking on an attribute icon brings up the dialogue for that item. Double clicking on an entity icon has two functions depending upon where in the icon the double click occurs. Double clicking on the left hand side of the icon brings up a dialogue listing all views that this entity is specified in to allow navigation to other views (see Figure 3.11). Double clicking on the right hand side of the icon makes the textual view of the entity visible (see Figure 3.8).
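The tool descriptions above distinguish between operations that touch only one view and operations that change the canonical representation. The following Python sketch illustrates that distinction for the entity, hide, and delete tools. It is a simplified model under stated assumptions: `CanonicalSchema` and its method names are hypothetical, and attributes and inheritance links are left out.

```python
from typing import Dict, Set


class CanonicalSchema:
    """One shared set of entities plus, per view, the entities it displays."""

    def __init__(self) -> None:
        self.entities: Dict[str, dict] = {}
        self.views: Dict[str, Set[str]] = {}

    def place_entity(self, view: str, name: str, abstract: bool = False) -> bool:
        """Entity tool: create a new entity, or link the icon to an existing one.

        Returns True if the entity was newly added to the canonical form.
        """
        created = name not in self.entities
        if created:
            self.entities[name] = {"abstract": abstract}
        self.views.setdefault(view, set()).add(name)
        return created

    def hide(self, view: str, name: str) -> None:
        """Hide tool: remove the icon from one view only."""
        self.views.get(view, set()).discard(name)

    def delete(self, name: str) -> None:
        """Delete tool: remove the entity from the schema and from every view."""
        self.entities.pop(name, None)
        for visible in self.views.values():
            visible.discard(name)


schema = CanonicalSchema()
first = schema.place_entity("analysis", "technical_system", abstract=True)
second = schema.place_entity("design", "technical_system")  # links, no new entity

schema.hide("analysis", "technical_system")
still_defined = "technical_system" in schema.entities        # hiding keeps it
shown_in_design = "technical_system" in schema.views["design"]

schema.delete("technical_system")
remaining_views = [v for v, names in schema.views.items() if "technical_system" in names]
```

The same create-or-link rule applies to the attribute tool in the text; it is omitted here to keep the sketch short.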

The second type of view offered is a textual view. Textual views allow free-form textual editing and manipulation of the canonical definition of entities using EXPRESS notation. Textual views are re-parsed at the termination of a textual editing stage and all modifications propagated through to the canonical form of the edited entity.

3.2.3 Implementation of EPE in the MViews framework

EPE is constructed by further specialising SPE, a specialisation of the MViews object-oriented framework (Grundy 1993; Grundy and Hosking 1993a). This framework provides a set of abstractions for constructing software development environments that support multiple graphical and textual views with in-built consistency between views (Grundy and Hosking 1993b). SPE integrates, in a single environment, tools to assist in systems analysis, design, implementation and maintenance of programs in Snart, an object-oriented logic language (Grundy et al. 1994). Other environments developed using MViews include an ER (Entity-Relationship) modeller for the database domain, and a graphical forms builder for specifying form layout and semantics for GUI applications (Grundy and Hosking 1994).

MViews utilises a three-layered architecture to present and maintain multiple views of an underlying entity (see Figure 3.12). The base layer contains the canonical representation of the schema being developed in the MViews environment. This canonical representation takes the form of a directed graph of components, representing entities and attributes, connected by some set of relationships, representing generalisation and containment (in an entity). Views in the view layer provide a subset of the canonical representation in the base layer. A view represents the information required for a displayed view. Elements in a view need not have a one-to-one correspondence to elements in the base layer, rather a view relationship provides the connection between view elements and base layer elements. These view relationships handle the mapping of data from the base layer through to the view layers and vice-versa. View relationships may aggregate base layer elements to form new elements in a view providing mappings between elements of any combination of arities. The display layer contains tools which visualise the elements provided by a single view. The types of tools that can be used in the display layer are: graphical and textual editing tools which display and allow direct manipulation of elements in a view; or external tools which utilise a batch mode access to elements provided through a view.


Figure 3.12 MViews three-layer multiple view architecture as used in SPE (from Grundy 1994)

In an MViews based system additions, modifications and deletions are initiated from the tools in the display layer. The actions initiated by these tools result in the creation of an update_record describing the particular action that was performed. The update_record created for an action is propagated from the display layer to its view in the view layer and down to the base layer following the links between the layers. From the base layer the update_record is propagated to all connected views to handle, as well as to all connected components in the base view and so onto their connected views, etc. At each stage in the propagation elements in the system can react to the update_record to modify their status or ignore the update_record. Individual elements can also propagate the update_record to all connected elements, or stop the propagation.

Figure 3.13 shows an example of the effect of an update in an MViews based application like EPE. In this example an action is performed by a tool in the display layer (1), an update_record is propagated to all dependents of the class_icon (2), the view relationship translates the update_record into operations on the class component (3), which in turn generate update_records (4). These update_records are propagated through to all of the dependents of the class (5), the view relationships translate the update_records into operations on components in the view layers (6), and modified view components re-render themselves in the tools in the display layer (7).

Figure 3.13 Change propagation in an MViews environment (from Grundy 1994)
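The seven propagation steps described for Figure 3.13 can be imitated with a small observer-style model. The Python below is a sketch of the idea only and does not reproduce the MViews implementation: the class names, the single `set` operation, and the rule that an unchanged value stops propagation are all assumptions chosen to keep the example short and terminating.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class UpdateRecord:
    attribute: str
    value: str


class Component:
    """A base-layer or view-layer element that notifies its dependents."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.values: Dict[str, str] = {}
        self.dependents: List["ViewRelationship"] = []

    def set(self, attribute: str, value: str) -> None:
        if self.values.get(attribute) == value:
            return                                # nothing changed: stop here
        self.values[attribute] = value
        self.changed(attribute)
        record = UpdateRecord(attribute, value)   # describes the single change
        for relationship in list(self.dependents):
            relationship.handle(self, record)

    def changed(self, attribute: str) -> None:
        """Hook for subclasses; base-layer components do nothing extra."""


class ViewComponent(Component):
    """A view-layer element that re-renders itself when it is modified."""

    def __init__(self, name: str, render_log: List[str]) -> None:
        super().__init__(name)
        self.render_log = render_log

    def changed(self, attribute: str) -> None:
        self.render_log.append(self.name)


class ViewRelationship:
    """Translates records between one base component and one view component."""

    def __init__(self, base: Component, view: Component) -> None:
        self.base, self.view = base, view
        base.dependents.append(self)
        view.dependents.append(self)

    def handle(self, source: Component, record: UpdateRecord) -> None:
        target = self.view if source is self.base else self.base
        target.set(record.attribute, record.value)


render_log: List[str] = []
base_class = Component("class")
class_icon = ViewComponent("class_icon", render_log)
class_text = ViewComponent("class_text", render_log)
ViewRelationship(base_class, class_icon)
ViewRelationship(base_class, class_text)

# Steps 1 and 2: a display-layer tool edits the icon, which emits a record.
class_icon.set("name", "technical_system")
```

After the call, the base component and the textual view component both hold the new name, and `render_log` lists the view components that redrew themselves. A real framework would carry richer records and let each element decide whether to forward them, as the text describes.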

As EXPRESS is a partially object-oriented language, it requires almost the same graphical modelling capabilities as the object-oriented language Snart requires in its integrated software development environment, SPE. It was therefore a natural step to specialise the SPE environment (Grundy 1993; Grundy and Hosking 1993a) for use with EXPRESS. This specialisation was able to use most of the same class and feature representations, along with the relationships between modelled objects. Figure 3.14 shows the inheritance hierarchy for the EPE environment, with all EPE classes prefixed with ‘exp_’. This structure is very similar to that of SPE, and the majority of the SPE code could be reused for the EPE environment. The modifications to the SPE environment fell into the following categories.

• Rendering modifications: the visual appearance of EXPRESS-G information is quite different from that of Snart, even though their concepts are closely related. Therefore, all class and attribute representation code needed to be modified to allow the correct EXPRESS-G representation. A major part of this change was a set of additions to the SPE system to allow the EXPRESS-G attributes to be correctly handled. In Snart all attributes, except relationships, are shown inside the class icon. In EXPRESS-G all attributes are independent icons and therefore must be modelled through a relationship with the class representation.


Figure 3.14 Class inheritance for EPE from MViews

• Environment modifications: though the set of operations which can be performed on an EXPRESS-G diagram is similar to that in Snart, the language used to describe them, and the information requested from the user, had to be tailored to EXPRESS terminology. This required changes to toolbox icons in the graphical views, and rewriting of all class and attribute dialogues.

• EXPRESS language parsing and generation: the underlying representation of information in the EPE system was in Snart form, as this allowed for compiling and testing of the defined schemas. This required all graphical and textual views to be translated into Snart, and the underlying canonical form to be translated from Snart to EXPRESS for textual and graphical view updates.

• update_record labelling: the range of update_records used in SPE was examined and modified to suit the structure and terminology of EXPRESS.


• Class hierarchy management: since EXPRESS maintains both subclass and superclass links, as distinct from Snart which only specifies subtype relationships, class hierarchy management was extended to enable bidirectional tracking of updates to class hierarchies.

• Removal of method handling: EXPRESS has no notion of methods associated with classes so all code for handling methods was stripped from the translated SPE system. Though EXPRESS allows function and procedure specification these are not handled in EPE.
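To illustrate what bidirectional tracking of a class hierarchy involves, the following Python sketch keeps supertype and subtype links in step with each other. The `Hierarchy` class, its method names and the entity names are hypothetical and are not drawn from the EPE or SPE code; the sketch only shows that each update must be applied to both directions of a link.

```python
from collections import defaultdict


class Hierarchy:
    """Keeps supertype and subtype links consistent in both directions."""

    def __init__(self):
        self.supertypes = defaultdict(set)   # entity -> its supertypes
        self.subtypes = defaultdict(set)     # entity -> its subtypes

    def add_link(self, sub, sup):
        self.supertypes[sub].add(sup)
        self.subtypes[sup].add(sub)

    def remove_link(self, sub, sup):
        self.supertypes[sub].discard(sup)
        self.subtypes[sup].discard(sub)

    def rename(self, old, new):
        """Rename an entity and repair every link that mentions it."""
        for sup in self.supertypes.pop(old, set()):
            self.subtypes[sup].discard(old)
            self.subtypes[sup].add(new)
            self.supertypes[new].add(sup)
        for sub in self.subtypes.pop(old, set()):
            self.supertypes[sub].discard(old)
            self.supertypes[sub].add(new)
            self.subtypes[new].add(sub)


# Hypothetical entities: two subtypes of one supertype, which is then renamed.
h = Hierarchy()
h.add_link("wall", "building_element")
h.add_link("door", "building_element")
h.rename("building_element", "element")
```

After the rename, both subtypes point at the new name and the new name lists both subtypes, so either end of a link can be used as the starting point for an update.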

3.2.4 Internal schema representation in EPE

In the EPE environment, an EXPRESS schema has an internal representation of its canonical form. EPE is based upon SPE, which utilises Snart as its base representation, so all EXPRESS constructs are translated into Snart definitions in the canonical form. As Snart is an implemented language this provides the ability to compile EXPRESS schemas and populate the resultant compiled schema with data to create models of the domain of the schema.

The amount of the EXPRESS language which is supported in this translation process is limited. Snart is a full object-oriented language based around Prolog, while EXPRESS is a more imperative language with a style similar to C. All entities, inheritance structures, attributes, unique attribute specifications and some forms of WHERE clauses can be translated to Snart. Procedures and functions, and WHERE clauses which use them, are not translated to Snart. In effect, this restriction means that a translation to Snart reduces the EXPRESS definitions in EPE to basic class interfaces with uniqueness and range checking on attribute values. While this is a major reduction in the expressiveness of the original language, this type of definition forms the major part of the principal IDMs that have been developed to date, and even covers the majority of schemas that have evolved from the STEP standard.
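As a rough illustration of this restriction, the Python sketch below reduces a simplified entity description to the subset said to survive translation. The `Entity` record, the `uses_call` heuristic and the example entity are hypothetical and are not taken from EPE. Treating any identifier followed by an opening parenthesis as a function call is an assumption made for the sketch, and no Snart is generated.

```python
import re
from dataclasses import dataclass

# Word operators that may directly precede "(" without being a call.
_OPERATORS = {"AND", "OR", "XOR", "NOT", "IN", "LIKE", "MOD", "DIV"}
_CALL = re.compile(r"([A-Za-z_]\w*)\s*\(")


def uses_call(expression):
    """Heuristic: does the expression appear to call a function or procedure?"""
    return any(name.upper() not in _OPERATORS
               for name in _CALL.findall(expression))


@dataclass
class Entity:
    name: str
    supertypes: list
    attributes: dict    # attribute name -> type name
    unique: list        # attribute names declared UNIQUE
    where: dict         # rule label -> expression text


def translatable_subset(entity):
    """Return the entity without call-based WHERE rules, plus the dropped labels."""
    kept = {label: expr for label, expr in entity.where.items()
            if not uses_call(expr)}
    dropped = sorted(set(entity.where) - set(kept))
    reduced = Entity(entity.name, entity.supertypes, entity.attributes,
                     entity.unique, kept)
    return reduced, dropped


# Hypothetical entity with one range check, one call-based rule and one negation.
wall = Entity(
    name="wall",
    supertypes=["building_element"],
    attributes={"id": "STRING", "height": "REAL", "layers": "LIST OF layer"},
    unique=["id"],
    where={
        "wr1": "height > 0.0",
        "wr2": "SIZEOF(layers) > 0",
        "wr3": "NOT (height > 100.0)",
    },
)
reduced, dropped = translatable_subset(wall)
```

Here the entity, its supertype, its attributes, the uniqueness rule and the two range checks are retained, and only the rule that calls a function is reported as dropped.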

As EXPRESS is translated into Snart syntax in the canonical form, so must it be translated back into EXPRESS for use in textual views (see Figure 3.5) or for type definition in the attribute dialogues (see Figure 3.6). To achieve this the EPE environment contains two translators, one to translate from Snart to EXPRESS, and the other from EXPRESS to Snart. The EXPRESS to Snart translator is implemented through the use of the DCG system available in LPA Prolog (LPA 1995) and further described in Appendix F.1. The Snart to EXPRESS translator is entirely hand-coded.

The development of the EXPRESS parser was based on the EXPRESS grammar in the ISO standard. However, when the completed parser was being tested, many of the existing ISO and EU project schemas failed to parse correctly. This is due to the many changes made to the EXPRESS language before it became a standard. As there were many changes to the syntax, the parsers developed alongside the EXPRESS language development have allowed for all previous forms of syntax to be treated as correct. This leads to a situation where, though there is now an ISO standard for EXPRESS, many modellers write incorrect models due to their recollection of past specifications, and these syntax errors are not picked up by current parsers. Though the EPE parser may fail to read existing schemas, it does guarantee that all schemas developed in its environment are correct to the ISO standard.

As Snart is the internal language representing EXPRESS schemas, it is clear that any specification environment which represents its model in Snart form will be able to be incorporated fully into this framework. To this extent the SPE environment which allows the specification of Snart schemas can be used to model schemas which can be used interchangeably with the EXPRESS schemas from EPE described in this section. Modelling environments which do not produce EXPRESS or Snart schemas are not precluded from use in this framework, but they will not be able to be tested interactively during development with the schema development and browsing tools highlighted in Chapter 9.

3.2.5 Summary of EPE functionality

EPE offers a wide range of functionality for a schema designer. In a single environment the schema designer has the ability to model at many levels of detail, in both graphical and textual notations and with automated consistency management between all the views. The main functionality offered by EPE is:

• provision of multiple overlapping views of portions of a schema

• views ranging from high-level design views through to low-level implementation views

• support for both graphical and textual notations

• changes to a schema in one view are seen in all other views containing the structures that were modified

• automatic maintenance of change documentation which can be augmented by the users

• support for schema navigation with hypertext-like links

3.3 Generic Schema Database Definition

Although a modelling environment must produce EXPRESS or Snart schemas to work fully inside this development framework, the mapping system (described in Chapters 5 and 10) makes no assumptions about the nature of the modelling environment or language that a schema was developed in. The only assumption made by the mapping system is that a schema type is of either relational or object-oriented form. To support schemas of either of these forms a generalised schema specification language (defined in Appendix G) is used to describe schemas developed at the modelling stage to the mapping system.

The schema specification language works at the atomic level of a schema description. Single modifications to a schema are recorded in the order they occur, to be able to reconstruct a schema in the same progression as it was developed. The atomic definitions record the previous state of the schema as well as the change so that individual modifications may be undone to reverse the effect of a modification. Atomic changes are keyed to a version number for the schema and schema version information is defined as part of the database. Versions are represented as a directed acyclic graph with a single root node. Through the in-order application of all modifications that represent the chain of versions leading to the required version number, the corresponding schema can be created.
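The replay and undo behaviour described above can be sketched as follows. This is a minimal Python model under stated assumptions, not the schema specification language of Appendix G: the `Change` record, the flat key naming and the `parents` table are hypothetical, and each version is assumed to have a single parent, so merges within the version graph are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One atomic modification, keyed to the version it belongs to."""
    version: int
    key: str          # e.g. "wall.height" (hypothetical naming)
    before: object    # previous state, kept so the change can be undone
    after: object     # None means the item was removed


def apply_change(schema, change):
    if change.after is None:
        schema.pop(change.key, None)
    else:
        schema[change.key] = change.after


def undo_change(schema, change):
    if change.before is None:
        schema.pop(change.key, None)
    else:
        schema[change.key] = change.before


def chain(parents, version):
    """Versions from the root to `version`, following parent links."""
    path = []
    while version is not None:
        path.append(version)
        version = parents[version]
    return path[::-1]


def reconstruct(parents, log, version):
    """Replay, in recorded order, every change on the chain to `version`."""
    schema = {}
    for v in chain(parents, version):
        for change in log:
            if change.version == v:
                apply_change(schema, change)
    return schema


# Hypothetical history: version 1 is the root; versions 2 and 3 both branch from it.
parents = {1: None, 2: 1, 3: 1}
log = [
    Change(1, "wall", None, "ENTITY"),
    Change(1, "wall.height", None, "REAL"),
    Change(2, "wall.height", "REAL", "INTEGER"),
    Change(3, "wall.width", None, "REAL"),
]
v2 = reconstruct(parents, log, 2)
v3 = reconstruct(parents, log, 3)
```

Because each change stores its previous state, calling `undo_change(v2, log[2])` returns the version 2 schema to the state of version 1. A version with several parents would additionally need a rule for ordering the chains being merged, which is left open here.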

A schema defined in any notation which can be translated into the format used in the schema database definition can be used with a View Mapping Language (VML) mapping specification. Schema databases are used during the VML parsing stage to create a checked schema mapping with all references validated and ready to be used by a mapping system. Schema databases can also be modified at the VML parsing stage to incorporate modifications (new entities and attributes) required by the mapping specification (see Chapter 5).

3.4 Appraisal of Schema Modelling

The EPE system and schema database format create an environment which supports the development, communication and integration of DT and IDM schemas as well as the documentation requirements of an integrated modelling environment, as detailed in Section 3.1.1 for COMBINE. EPE supports these requirements in the following manner:

Communication and integration of DT schemas: EPE allows the development of a schema of a particular domain with total consistency from the earliest stages of the project right through to the implementation. This consistency is maintained by communicating all changes through to all views which can possibly see the affected entities. Multiple DT schemas are supported in this environment, allowing the modeller to develop multiple graphical and textual representations of a domain during analysis and allowing these to be expanded at the detailed design stage with new or updated views. These developing schemas, and their views, are available to all who use the EPE system, allowing DT schemas to be easily shared among the integration team. Many of these developing DT schemas have overlapping information and need to be merged with the IDM schema. By developing multiple associated DT views and one canonical, consistent representation of the IDM schema, the consistency between the DT schemas and the IDM can be maintained (by hand). The schemas are also utilised to provide hypertext-like navigation facilities around the various views of related information.

Documentation: EPE meets many documentation needs by tracking all updates made to the schema and recording them against the entities that were modified. These update records can be annotated by the user to record justifications and decisions, and provide on-line documentation for the developing schema. One documentation feature which would be useful, but is not provided, is the ability to group multiple dispersed update records to represent a single update or documentation record. For example, this would record a session of changes as a single documented change to the schema.

Mapping definitions: these definitions are not considered in the EPE environment, but an environment to support mapping definitions has been developed and is described in Chapter 6.

Support for multiple modelling paradigms: EPE provides multiple paradigms as it supports both the EXPRESS and EXPRESS-G paradigms. Although one is the subset of the other, they render their information in different styles. The modelling environments developed and used in this project also show that, through the use of a generic model of a schema (in this case in Snart format), multiple modelling tools (in this case EPE and SPE) can be used to work on a single schema. Although Snart is not particularly generic as an underlying model for schemas, a schema representation language such as that used for the database representation of schemas would provide such a model on which most schema notations could be based. Other environments developed from the MViews platform have shown support for more disparate modelling paradigms through the use of an underlying representation which subsumes the paradigms used in the different display views (Grundy and Venable 1995 for OOA/D and EER; Venable and Grundy 1995 for ER and NIAM).

Support for multiple views: although the EPE environment can manipulate multiple independent schemas, there is no direct support for the specification of correspondences between schemas as an aid to the development of an integrated schema (i.e., schema integration for IDM development). Schema integration has previously been performed to a small extent (see Mugridge and Hosking 1995 for an example of DT schema model integration using SPE), but this does not illustrate the support that could potentially be offered in such an environment to perform integration. The difficult problem in supporting schema integration is the ability to define, track and support the specification of mapping definitions between various DT schemas in a tool like EPE. Chapter 5 details a first step towards an environment for defining mappings that relate two schemas together. However, this latter environment is not yet integrated with the environment that manages the schema, leaving a semantic gap between the maintenance of information in the various environments. This problem and the ability to manage concurrent schema development in the MViews, SPE and EPE environments will be an area of continued research (Amor and Hosking 1993; Amor and Hosking 1995; Grundy and Venable 1995). A similar, though simpler, problem to that of supporting multiple views is supporting multiple versions of a schema. The schema database representation allows for the definition of versions of modelled schemas.

Any version defined in the schema can be reconstituted for use by any application utilising the schema database, which is of particular use when defining a mapping between versions of a schema. However, the versioning ability of the schema representation is not supported in the EPE environment, which is the point at which this would be best specified and managed. Extensions to MViews to support versions would be relatively simple, especially if the underlying representation were modified to utilise a schema database representation as described above.

Though EPE meets the majority of the requirements for a schema modelling and development environment, there are some areas which would benefit from further work. These include:

Collaborative design support: although large modelling projects may have a single coordinator in charge of schema development, there are likely to be several modellers working on different aspects of the schema. Coordinating the work of several modellers, and maintaining the consistency of the underlying schema under change from several sources concurrently, has to be a goal of any real schema development environment. This is currently an area of intense research work in the computer-supported collaborative working (CSCW) area. Recent research results in this area indicate that when these CSCW environments mature they will be easy to incorporate into existing modelling tools. The Serendipity system (Grundy et al. 1996) has already been shown with links to MViews-based environments and commercial Microsoft products, providing collaborative design between several modellers working through defined process models. Though the mapping implementation of Chapter 10 provides a transaction-based approach to coordinating between multiple modellers, this is not supported by version merging and conflict resolution systems, as can now be found in systems such as Serendipity.

Comprehensive multi-paradigm modelling support: in exactly the same manner that an integrated building design system requires an IDM to integrate design tools, a multi-paradigm modelling environment requires an IDM covering schema, mapping and project definitions in order to manage multiple paradigms in one environment. Though some work has been done in this area (Venable 1993) there are no IDMs which cover the wide range of schema modelling paradigms currently in use (e.g., ER, DFD, EER, NIAM, EXPRESS, EXPRESS-G, OOA/D, IDEF0, IDEF1X), let alone the underlying requirements of mapping and project definition languages.

Expanded modelling support concepts: the range of support concepts required by modellers when developing large schemas is not clearly understood, and hence not well supported. Currently, documentation of schema modification and construction is made at the atomic level. Methods allowing for grouping of higher level concepts are imperative (including multiple viewpoints of change sets). The manner of notification of modifications in a collaborative environment needs attention to provide efficient methods of expressing these changes to other designers (rather than at the atomic change level); an example of developments in this area can be seen in Grundy et al. 1995. Navigation and summary features tend to be primitive. In large schemas with hundreds of entities and thousands of views, the set of views which reference a particular entity could be very large. Classification of view types, or of the role particular entities play in a view, could well help navigation around the schema and guide novice users through the various conceptual levels of schema specification. Notions of schema versions and private workspaces need to be considered in a collaborative environment. These allow the management of multiple design paths, and allow incomplete work to be hidden until it is fit to be used by other participants.

In summary, this chapter introduces the requirements for a schema modelling and development environment and, through the development of EPE and a schema database format, demonstrates that an environment meeting these requirements is possible. The development of individual schemas is, however, a small part of the development of an integrated design system. Schemas for design tools need to be integrated to form an IDM, or checked to confirm that they map into an existing IDM. This process of schema integration, or checking, through the definition of mappings between schemas is described in Chapter 5. Schemas which represent design tools and actors are also required to define flows of control in a real project. The use of these schemas in project definition is described in Chapter 7.

