The project aims to develop formal models for such heterogeneous systems, a software environment for the design of such systems, and synthesis technologies for implementation of such systems. In the latter category, we are concentrating on problems not already addressed well elsewhere, such as the synthesis of embedded software and the partitioning and scheduling of heterogeneous parallel systems.
The focus of our work on formal methods is to understand models of computation that can be applied to system-level design of embedded signal processing systems. The major focus, therefore, is on concurrent models and models that coexist well with huge computational loads and real-time constraints.
In what could be a significant breakthrough, we have followed up on a suggestion by Gerard Berry of INRIA to develop a semantic model of discrete-event systems (such as that used in VHDL, Verilog, and the discrete-event domain in Ptolemy). This model provides a complete metric space for signals in such systems, thereby enabling the use of standard, well-established mathematical methods (most notably the Banach fixed point theorem) to study issues such as determinacy. A paper is forthcoming.
We have organized a new graduate class, EE290N, "Specification and Modeling of Reactive Real-Time Systems." This class incorporates recent results obtained under this project, and may become a regular graduate class. The description of the class follows:
"This research seminar studies models of computation and programming language semantics used for the specification and modeling of real-time and reactive electronic systems. It begins with a review of the theory of partially ordered sets, particularly as applied to prefix orders and Scott orders. It develops a framework for models of computation for concurrent systems that uses partially ordered tags associated with events. Discrete-event models, synchronous/reactive languages, and dataflow models are studied in this context. Basic issues of Turing completeness and lambda computability, boundedness, determinacy, reachability, and liveness are studied, with emphasis on decidability and efficiency of verification and synthesis algorithms. Classes of functions over partial orders, including continuous, monotonic, stable, and sequential functions are considered. A hierarchy of increasingly specialized asynchronous models, including process networks, Kahn process networks, dataflow process networks, the Boolean dataflow model, and the synchronous dataflow are covered. Timed models, including discrete-event systems (as embodied for example in the VHDL and Verilog languages) and the synchronous/reactive languages Signal, Lustre, Esterel, and Statecharts are studied. Throughout, applications to signal processing, real-time, and reactive systems are emphasized, as are synthesis and compilation techniques amenable to such modern approaches as embedded system design, hardware/software codesign and formal verification."
More information about this class can be found at: http://www.eecs.berkeley.edu/~eal/ee290n/
Signal processing systems perform intensive numeric computation, but they typically also have sophisticated control functionality for sequencing the computation tasks, switching among operation modes, coordination, and configuration. Dataflow models are suitable for describing numeric computations. The finite-state machine (FSM) is an intuitive model for describing control with a formal, well-studied mathematical theory. But the basic FSM model, which is flat and sequential, is not suitable for describing complex concurrent control. A common solution to this problem is hierarchical FSMs, which extend the basic FSM model with hierarchy and concurrency. The Statecharts visual formalism is an example of this approach.
We observe that FSM semantics, hierarchy, and concurrency are orthogonal semantic properties of Statecharts. If we take away from Statecharts the transitions that cross hierarchy boundaries, we get a simpler model in which FSM semantics can be cleanly separated from concurrency semantics. This means that the basic FSM model can be mixed with the various concurrency models to get many models that are only slightly weaker than Statecharts. We call this new computational model "*charts", where the "*" is a wildcard representing various possible concurrency models. We have created a preliminary implementation of this model in Ptolemy. Systems can be built by hierarchically nesting FSMs and concurrency models. The synchronous dataflow model is particularly attractive because when it is combined hierarchically with FSMs in certain ways, the combination is far more expressive than either SDF or FSMs alone, even though the resulting system remains finite state. Verification, synthesis, and optimization questions all remain decidable. We have developed a preliminary visual editor for state transition diagrams, which is integrated into the Ptolemy GUI so that a user can seamlessly traverse a hierarchical design that combines FSMs with dataflow block diagrams. At present we can simulate such a mixed-model system description. We plan to add the capability to generate code from such systems.
The synchronous/reactive model of computation is popular (mostly in Europe) for the design of real-time embedded systems. Examples of languages that use this model are Esterel, Lustre, Signal, and Argos. A key property of the model is that events in concurrent modules are totally ordered with respect to one another. This means that any two events are either simultaneous, or one unambiguously precedes the other. This contrasts the dataflow approach, where events are partially ordered. A second key property of SR languages is that simultaneous events are defined by a fixed-point equation. Fixed-point theory guarantees the existence of a least fixed point under certain technical conditions.
SR languages have been used in control-intensive, safety-critical embedded system design such as aircraft and nuclear power-plant control. Their formal properties ensure determinacy and bounded memory, and enable extensive verification. They appear to be an attractive model for certain kinds of signal processing systems.
We have constructed an SR domain in Ptolemy that differs from standard SR languages by allowing modules to be designed in some foreign model of computation. This is consistent with the "hierarchical heterogeneity" principle of Ptolemy. This domain has a number of practical and theoretical challenges that result from this heterogeneity. In particular, the information-hiding principle used in Ptolemy occludes certain important information about modules that is normally exploited in compiling these languages. We have had to adapt the theory and compilation techniques to avoid violating this information hiding.
We have developed a dynamic execution policy for the SR domain in Ptolemy and proved that it always converges to the minimal fixed point. We have a theoretical bound on the number of steps required to reach this fixed point (order N^2, where N is the number of actors in the graph) and have been developing heuristics that fall well below this bound.
We have prototyped C++ and Tcl interfaces to the dynamic higher-order functions mechanism, in which we dynamically switch in a replacement block. This can be used to implement hierarchical state machines (with no cross-hierarchy state transitions), and dynamically evaluated higher-order functions. For example, we can implement conditionals (like if-then-else) within a dataflow actor as a HOF by using the C++ interface.
In the area of system-level design, we have improved the synthesis of VHDL from dataflow graphs and developed a VHDL synthesis mechanism that translates dataflow block diagrams into one of two different styles of VHDL code, one style optimized for synthesis and one for simulation. We have also developed a model of computation for multidimensional multirate signal processing.
We have developed a method for the partitioning of a single application specified in synchronous dataflow (SDF) into multiple independently-synthesizable, communicating VHDL hardware modules. Either self-timed (asynchronous) or fully-static (synchronous) hardware implementations are allowed, and the clock timing and control are automatically generated. We have shown that this method guarantees the preservation of correct functional behavior as specified in the original SDF graph, and that many choices of partitioning into multiple hardware modules are possible. The ability to break up a larger application into smaller synthesizable hardware modules can lead to efficiencies in hardware synthesis, which is faster when performed on smaller VHDL specifications. At the same time, the communication between the multiple modules is sufficiently specified by the method so as to ensure that the correct functional behavior is preserved when the separate modules are executed concurrently. Previous work has been done for partitioning SDF graphs onto heterogeneous simulation engines and DSP processors, including non-synthesized sequential VHDL code for simulation.
Work has continued in the development of a new VHDL domain for Ptolemy. This effort has progressed on three fronts. The first is the use of Design Compiler from Synopsys to synthesize VHDL code generated from dataflow graphs in Ptolemy into netlist form. There is a basic ability to control the code generation in order to affect the quality of the synthesis result. The second advance has come in the ability to generate VHDL code from dataflow graphs for efficient simulation in VHDL, and to pass that code from Ptolemy to simulators from Synopsys and Model Technology. This enables VHDL simulation from designs generated in Ptolemy.
The third area of progress is the most innovative (and most speculative). Using our hierarchical scheduling framework we were able to get VHDL simulations to interact with Ptolemy simulations in the SDF domain (synchronous dataflow) and, more interestingly, to interact with synthesized embedded software running in C on the host processor or in assembly code on a Motorola DSP. Their first demonstration system is an analysis/synthesis filterbank where the signal stimulus and analysis of the results are done in the CGC domain (code generation in C), the analysis half of the filterbank is done on a Motorola DSP56002, and the synthesis half of the filterbank is done in the Synopsys VHDL simulator. Both the DSP and the VHDL simulator are running code generated by Ptolemy from dataflow graphs. We believe that this is a major milestone in heterogeneous system-level design.
Previously, the VHDL domain could generate either sequential VHDL code for simulation or structural VHDL code for synthesis, starting from the same dataflow graph. Now the domain has been extended so that the structural VHDL code can also be simulated. The structural VHDL code naturally simulates more slowly than the sequential VHDL code because the structural VHDL code communicates data values over VHDL signals, which is less efficient for simulation than using only VHDL variables as in the sequential VHDL code case. However, the simulation of a structural VHDL model, which is also synthesizable and is closer to an actual implementation, increases the confidence level in validating by simulation.
We completed a demonstration of a scalable beamforming application in the retargetable VHDL domain. This uses higher-order functions to control the number of sensors. The application also has multiple sample rates. When the code generator is set to generate sequential VHDL, simulations run reasonably quickly.
Just as was already the case with sequential VHDL code, the structural VHDL code now simulates under two modes. The first is as a standalone simulation of a design described entirely in VHDL. Data can be visualized using standard Ptolemy graphing capabilities. The second mode is generating structural VHDL code within Jose Pino's CompileCGSubsystems target, where a portion of a heterogeneous system is implemented in structural VHDL code and the overall system is co-simulated (with, for example, C code on the workstation or DSP56000 code on an s56x card on the workstation bus). In this second case, the Xgraph stars and other nonsynthesizable stars can be moved out of VHDL and into C so that the VHDL subsystem remains synthesizable.
We have outlined the design of a tool for mapping a control and dataflow representation of a hard real-time signal processing application onto a Mercury RACEway multicomputing system, and are continuing development. Low-level programming details would be hidden from the programmer thereby shifting the design focus to performance issues. Graphical visualization and manipulation capabilities will enable study of architectural tradeoffs and optimizations. Tasks can be scheduled and partitioned among the processors either manually or automatically. The tool will handle most of the details involved in generating multiprocessor code, downloading the code to the target system, and initializing the system for execution. Additional capabilities of the tool will allow extension of the default routine library by the programmer and will allow interfacing with other hardware synthesis and codesign tools. The tool will be implemented as a code generation target in Ptolemy.
We have developed a preliminary "retargeting tool" to be used to guide migration of Ptolemy-based designs from one implementation technology to another. We have demonstrated an interface for studying differences in block libraries, and shown how it could be used to make the code generators for the Motorola DSP56000 family processors and the Texas Instruments C50 family processors more compatible. We have also developed a program that recursively changes the domain of hierarchical designs. The Ptolemy 0.6 Web page (http://ptolemy.eecs.berkeley.edu/pt0.6dist.html) contains a patch that installs this program into the Ptolemy 0.6 sources.
We have developed and released a set of heuristic search techniques written in the Mathematica programming language. They implemented breadth-first search, depth-first search, hill climbing, and simulated annealing techniques for applying a set of equivalence relationships to an algebraic expression to minimize implementation cost. One goal is to use the heuristic searches to apply the equivalence relationships in the Signal Processing Packages for Mathematica to optimize the implementations of Ptolemy systems. Both the Heuristic Search Packages and the Signal Processing Packages are available on the Ptolemy Web site.
We have rethought code generation wormholes in the context of a new hierarchical scheduling framework. The objective is to more effectively mix synthesized software and VHDL models with simulations built in other Ptolemy domains. The problem with the previous version is that it would respect the user-specified hierarchy, thereby making it hard for a system to have multiple Wormholes into the same target. For example, if we have simulation-SDF on the outside, it used to be impossible to have two non-directly connected Wormholes into a single S56XTarget (this is a target that integrates with a SparcStation an S-bus card with a Motorola DSP56000 and a Xilinx FPGA).
The idea is to have the inter-SDF wormholes flattened, just as they are currently done in heterogeneous multiprocessor targets. Then we use the hierarchical multiprocessor scheduler to schedule the system. We treat the entire hierarchy as a multiprocessor CG target. A pair of virtual methods will return a pair of send/receive actors which will usually provide a generic C interface. For optimality, they may also be specifically tuned to support communication within a particular target or between a pair of targets. In the case that this system is being mixed with simulation-SDF, clusters are created around the code generation components. The scheduler attempts to build the largest possible CG clusters that meet the SDF Composition Theorem criteria, a conservative test that ensures that deadlocks are not introduced. For each of these clusters, an SDFStar will be created and dynamically linked in as is currently done with the CGCTargetWH and S56XTargetWH targets.
The default clustering classes do not use wormholes. We believe these are only necessary when interfacing schedulers that were not designed to directly interact. Our new hierarchical scheduler, by contrast, is designed to manage the interface between different schedulers.
We have completed a new loop scheduler in Ptolemy that works on acyclic SDF (synchronous dataflow) graphs and constructs single appearance schedules that are optimized for buffer memory consumption. The algorithms and heuristics it implements are all described in our new book: "Software Synthesis from Dataflow Graphs", Kluwer 1996. It invokes both the RPMC and APGAN heuristics and picks the better schedule. The scheduler makes extensive use of the Cluster hierarchy which was only being used in our hierarchical scheduler until now.
In collaboration with Shuvra Bhattacharyya (now of Hitachi America), we have developed a new, general, latency constrained resynchronization (LCR) algorithm for 2 processor systems that is capable of handling delays. This algorithm is provably optimal, and the proof is relatively simple.
The Ptolemy software serves as both a laboratory for experimentation and a mechanism for disseminating results. During this reporting period, we have made a large number of enhacements to the software, and have completed one major release, numbered 0.6, completed on April 12, 1996. Below is a summary of the major enhancements to the software.
The Ptolemy 0.6 release consists of approximately 3000 files containing 400,000 lines and 9 Mb of source code (compressed). See our World Wide Web server http://ptolemy.eecs.berkeley.edu or finger ptolemy@ptolemy.eecs.berkeley.edu for complete information.
The documentation for Ptolemy0.6 is available in PostScript, HTML and PDF, together with updated summary sheets, answers to frequency asked questions, a quick tour, and a tutorial. We have set up a Usenet read news group called comp.soft-sys.ptolemy. Postings to our mailing list ptolemy-hackers@ptolemy.eecs.berkeley.edu are cross-posted to the comp.soft-sys.ptolemy and visa-versa. Postings to the news group and the e-mail group are searchable from our World Wide Web site.
Tycho is written primarily in Itcl, also called [incr Tcl], developed by by Michael McLennan of AT&T. Itcl is an object-oriented extension of Tcl, a "tool command language" written by John Ousterhout of U.C. Berkeley. The window toolkit Tk and its object-oriented extension Itk are also used extensively.
The Tycho0.1 release is the first public release of Tycho as a
stand-alone system. It runs under the vanilla Itcl 2.0 and 2.1 with
no changes to the executable. It also runs under the Ptolemy system,
a design environment distributed by the same group that developed
Tycho.
Tycho0.1 can be obtained from
http://ptolemy.eecs.berkeley.edu/tycho/Tycho.html
The key objectives of the Tycho project are:
One of the key principles in Tycho is that anything can have a hyperlink to anything else. Documentation will have links to source code, and vice versa. Visual editors will have links to textual editors. And specialized displays can be created for any form of data. These displays, of course, are also connected by hyperlinks.
The interface to Ptolemy kernel will eventually be entirely through ptcl, the Tcl extensions that provide an interpreted command language for the Ptolemy kernel. An interim mechanism is provided where Tycho forms a subsystem within the much older visual editor for Ptolemy called "pigi" (which stands for Ptolemy interactive graphical interface).
Tycho 0.1 was released with Ptolemy 0.6. It was still a very preliminary system, with some syntax-sensitive editors for Itcl, HTML, C, C++, and Ptlang (Ptolemy star) files. It has progressed considerably since then, and a future release is anticipated that will include at least the following features:
The policy in the Ptolemy project has been to release software with a very liberal copyright agreement and to widely disseminate the ideas embodied in the software through publications. This strategy has once again proven effective in getting research results quickly into the marketplace in the form of serious, supported products. On October 23, 1995, The Alta Group of Cadence Design Systems announced SPW 3.5, which contains three key technologies from Ptolemy: mixing of discrete-event and dataflow models of computation, and synchronous and dynamic dataflow scheduling technology.
The subtitle of Alta's press release is:
"New SPW* Simulation Technology for Convergence Applications Leverages Berkeley's Ptolemy Project Research"
In the body of the press release:
"The new simulation architecture is based on research from the renowned Ptolemy research project at the University of California at Berkeley. ... [It] utilized the Ptolemy team's results to uniquely implement Ptolemy's advanced simulation algorithms in Alta Group's leading SPW solution."We expect to continue to work with Cadence and others to ensure that the best results of the project make their way into self-sustaining commercial products. The full press release is available on the Ptolemy Web site.
Lockheed-Martin, Sanders division, has been using Ptolemy to develop tools for architectural evaluation and tradeoff analysis. Their work leverages the SDF and DE domains in Ptolemy, enhanced with their own user interface and visualization tools. Motivated by our interactions with them, we have begun a serious effort to ease the process of building and releasing third-party extensions to Ptolemy. Rather than having an ad hoc series of extensions with incompatible ways of integrating into Ptolemy, we are working on a more uniform solution.
The group of Prof. Alberto Sangiovanni-Vincentelli at Berkeley has released a Ptolemy-based co-design environment for control-dominated embedded systems, called POLIS. POLIS offers an integrated interactive environment for specification, co-simulation, formal verification, and synthesis of embedded systems implemented as a mix of hardware and software components. It uses and significantly extends the discrete-event (DE) domain in Ptolemy. See: http://www-cad.eecs.berkeley.edu/Respep/Research/hsc/abstract.html