1996 Annual RASSP Report

September 15, 1995 through September 15, 1996

SYSTEM-LEVEL DESIGN METHODOLOGY
FOR EMBEDDED SIGNAL PROCESSORS

Contract Number: F33615-93-C-1317
Principal Investigator: Edward A. Lee
Organization: University of California at Berkeley

Project Participants at Berkeley:

Raza Ahmed
Sunil Bhave
Wan-Teh Chang
Stephen Edwards
Brian L. Evans
Christopher Hylands
Ron Galicia
Alain Girault
Luis Gutierrez
Joel King
Bilung Lee
Edward A. Lee
Yu Kee Lim
Xiao Mei
David G. Messerschmitt
Siamak Modjtahedi
Praveen K. Murthy
Thomas M. Parks
John Reekie
José Luis Pino
Farhana Sheikh
S. Sriram
Matthew Tavis
Patrick J. Warner
Michael C. Williamson

1. Overview

The focus of this project is on design methodology for complex real-time systems where a variety of design methodologies and implementation technologies must be combined. Design methodologies are encapsulated in one or more models of computation, while implementation technologies are implemented as synthesis tools. Applications that use more than one model of computation and/or more than one synthesis tool are said to be heterogeneous. Hardware/software codesign is one example of heterogeneous design.

The project aims to develop formal models for such heterogeneous systems, a software environment for the design of such systems, and synthesis technologies for implementation of such systems. In the latter category, we are concentrating on problems not already addressed well elsewhere, such as the synthesis of embedded software and the partitioning and scheduling of heterogeneous parallel systems.

2. Details of Accomplishments

The major accomplishments of the project in this reporting period include the following topics, each of which is explained in more detail below. Refer to the Ptolemy web site, http://ptolemy.eecs.berkeley.edu/ for more information. All publications resulting from the project are posted on the web site.

Formal methods
Control and signal processing
System-level design
Scheduling and code generation
Ptolemy software
Tycho software
- Status of Tycho

2.1 Formal methods

The focus of our work on formal methods is to understand models of computation that can be applied to system-level design of embedded signal processing systems. The major focus, therefore, is on concurrent models and models that coexist well with huge computational loads and real-time constraints.

2.1.1 A semantic framework for comparing models of computation

In collaboration with Professor Alberto Sangiovanni-Vincentelli, we have developed a denotational framework (a "meta model") within which certain properties of models of computation can be understood and compared. It describes concurrent processes in general terms as sets of possible behaviors. Compositions of processes are given as intersections of their behaviors. The interaction between processes is through signals, which are collections of events. A system is determinate if given the constraints imposed by the inputs there are exactly one or exactly zero behaviors. Each event is a value-tag pair, where the tags can come from a partially ordered or totally ordered set. Timed models are where the set of tags is totally ordered. Synchronous events share the same tag, and synchronous signals contain events with the same set of tags. Synchronous systems contain synchronous signals. Strict causality (in timed systems) and continuity (in untimed systems) ensure determinacy under certain technical conditions. The framework is used to compare certain essential features of various models of computation, including Kahn process networks, dataflow, sequential processes, concurrent sequential processes with rendezvous, Petri nets, concrete data structures, and discrete-event systems. Details are reported in [20-22].

2.1.2 Semantics of discrete-event systems

In what could be a significant breakthrough, we have followed up on a suggestion by Gerard Berry of INRIA to develop a semantic model of discrete-event systems (such as that used in VHDL, Verilog, and the discrete-event domain in Ptolemy). This model provides a complete metric space for signals in such systems, thereby enabling the use of standard, well-established mathematical methods (most notably the Banach fixed point theorem) to study issues such as determinacy. A paper is forthcoming.

2.1.3 A new graduate class on modeling of systems

We have organized a new graduate class, EE290N, "Specification and Modeling of Reactive Real-Time Systems." This class incorporates recent results obtained under this project, and may become a regular graduate class. The description of the class follows:

"This research seminar studies models of computation and programming language semantics used for the specification and modeling of real-time and reactive electronic systems. It begins with a review of the theory of partially ordered sets, particularly as applied to prefix orders and Scott orders. It develops a framework for models of computation for concurrent systems that uses partially ordered tags associated with events. Discrete-event models, synchronous/reactive languages, and dataflow models are studied in this context. Basic issues of Turing completeness and lambda computability, boundedness, determinacy, reachability, and liveness are studied, with emphasis on decidability and efficiency of verification and synthesis algorithms. Classes of functions over partial orders, including continuous, monotonic, stable, and sequential functions are considered. A hierarchy of increasingly specialized asynchronous models, including process networks, Kahn process networks, dataflow process networks, the Boolean dataflow model, and the synchronous dataflow are covered. Timed models, including discrete-event systems (as embodied for example in the VHDL and Verilog languages) and the synchronous/reactive languages Signal, Lustre, Esterel, and Statecharts are studied. Throughout, applications to signal processing, real-time, and reactive systems are emphasized, as are synthesis and compilation techniques amenable to such modern approaches as embedded system design, hardware/software codesign and formal verification."

More information about this class can be found at: http://www.eecs.berkeley.edu/~eal/ee290n/

2.2 Control and signal processing

In previous years this project concentrated on the computational aspects of signal processing systems, and thus focused on models of computation such as dataflow that are particularly well suited. More recently, our attention has broadened to include control and sequential decision-making aspects of system design. We are pursuing three approaches for combining control-oriented computation with data-oriented computation: hierarchical concurrent finite-state machines, the synchronous/reactive model of computation, and dynamically evaluated higher-order functions.

2.2.1 The FSM domain

Signal processing systems perform intensive numeric computation, but they typically also have sophisticated control functionality for sequencing the computation tasks, switching among operation modes, coordination, and configuration. Dataflow models are suitable for describing numeric computations. The finite-state machine (FSM) is an intuitive model for describing control with a formal, well-studied mathematical theory. But the basic FSM model, which is flat and sequential, is not suitable for describing complex concurrent control. A common solution to this problem is hierarchical FSMs, which extend the basic FSM model with hierarchy and concurrency. The Statecharts visual formalism is an example of this approach.

We observe that FSM semantics, hierarchy, and concurrency are orthogonal semantic properties of Statecharts. If we take away from Statecharts the transitions that cross hierarchy boundaries, we get a simpler model in which FSM semantics can be cleanly separated from concurrency semantics. This means that the basic FSM model can be mixed with the various concurrency models to get many models that are only slightly weaker than Statecharts. We call this new computational model "*charts", where the "*" is a wildcard representing various possible concurrency models. We have created a preliminary implementation of this model in Ptolemy. Systems can be built by hierarchically nesting FSMs and concurrency models. The synchronous dataflow model is particularly attractive because when it is combined hierarchically with FSMs in certain ways, the combination is far more expressive than either SDF or FSMs alone, even though the resulting system remains finite state. Verification, synthesis, and optimization questions all remain decidable. We have developed a preliminary visual editor for state transition diagrams, which is integrated into the Ptolemy GUI so that a user can seamlessly traverse a hierarchical design that combines FSMs with dataflow block diagrams. At present we can simulate such a mixed-model system description. We plan to add the capability to generate code from such systems.

2.2.2 The SR domain

The synchronous/reactive model of computation is popular (mostly in Europe) for the design of real-time embedded systems. Examples of languages that use this model are Esterel, Lustre, Signal, and Argos. A key property of the model is that events in concurrent modules are totally ordered with respect to one another. This means that any two events are either simultaneous, or one unambiguously precedes the other. This contrasts the dataflow approach, where events are partially ordered. A second key property of SR languages is that simultaneous events are defined by a fixed-point equation. Fixed-point theory guarantees the existence of a least fixed point under certain technical conditions.

SR languages have been used in control-intensive, safety-critical embedded system design such as aircraft and nuclear power-plant control. Their formal properties ensure determinacy and bounded memory, and enable extensive verification. They appear to be an attractive model for certain kinds of signal processing systems.

We have constructed an SR domain in Ptolemy that differs from standard SR languages by allowing modules to be designed in some foreign model of computation. This is consistent with the "hierarchical heterogeneity" principle of Ptolemy. This domain has a number of practical and theoretical challenges that result from this heterogeneity. In particular, the information-hiding principle used in Ptolemy occludes certain important information about modules that is normally exploited in compiling these languages. We have had to adapt the theory and compilation techniques to avoid violating this information hiding.

We have developed a dynamic execution policy for the SR domain in Ptolemy and proved that it always converges to the minimal fixed point. We have a theoretical bound on the number of steps required to reach this fixed point (order N^2, where N is the number of actors in the graph) and have been developing heuristics that fall well below this bound.

2.2.3 Dynamically evaluated higher-order functions

We have prototyped C++ and Tcl interfaces to the dynamic higher-order functions mechanism, in which we dynamically switch in a replacement block. This can be used to implement hierarchical state machines (with no cross-hierarchy state transitions), and dynamically evaluated higher-order functions. For example, we can implement conditionals (like if-then-else) within a dataflow actor as a HOF by using the C++ interface.

2.3 System-level design

In the area of system-level design, we have improved the synthesis of VHDL from dataflow graphs and developed a VHDL synthesis mechanism that translates dataflow block diagrams into one of two different styles of VHDL code, one style optimized for synthesis and one for simulation. We have also developed a model of computation for multidimensional multirate signal processing.

2.3.1 Partitioning SDF applications into multiple VHDL hardware modules

We have developed a method for the partitioning of a single application specified in synchronous dataflow (SDF) into multiple independently-synthesizable, communicating VHDL hardware modules. Either self-timed (asynchronous) or fully-static (synchronous) hardware implementations are allowed, and the clock timing and control are automatically generated. We have shown that this method guarantees the preservation of correct functional behavior as specified in the original SDF graph, and that many choices of partitioning into multiple hardware modules are possible. The ability to break up a larger application into smaller synthesizable hardware modules can lead to efficiencies in hardware synthesis, which is faster when performed on smaller VHDL specifications. At the same time, the communication between the multiple modules is sufficiently specified by the method so as to ensure that the correct functional behavior is preserved when the separate modules are executed concurrently. Previous work has been done for partitioning SDF graphs onto heterogeneous simulation engines and DSP processors, including non-synthesized sequential VHDL code for simulation.

2.3.2 Synthesizing multiple styles of VHDL in Ptolemy

Work has continued in the development of a new VHDL domain for Ptolemy. This effort has progressed on three fronts. The first is the use of Design Compiler from Synopsys to synthesize VHDL code generated from dataflow graphs in Ptolemy into netlist form. There is a basic ability to control the code generation in order to affect the quality of the synthesis result. The second advance has come in the ability to generate VHDL code from dataflow graphs for efficient simulation in VHDL, and to pass that code from Ptolemy to simulators from Synopsys and Model Technology. This enables VHDL simulation from designs generated in Ptolemy.

The third area of progress is the most innovative (and most speculative). Using our hierarchical scheduling framework we were able to get VHDL simulations to interact with Ptolemy simulations in the SDF domain (synchronous dataflow) and, more interestingly, to interact with synthesized embedded software running in C on the host processor or in assembly code on a Motorola DSP. Their first demonstration system is an analysis/synthesis filterbank where the signal stimulus and analysis of the results are done in the CGC domain (code generation in C), the analysis half of the filterbank is done on a Motorola DSP56002, and the synthesis half of the filterbank is done in the Synopsys VHDL simulator. Both the DSP and the VHDL simulator are running code generated by Ptolemy from dataflow graphs. We believe that this is a major milestone in heterogeneous system-level design.

Previously, the VHDL domain could generate either sequential VHDL code for simulation or structural VHDL code for synthesis, starting from the same dataflow graph. Now the domain has been extended so that the structural VHDL code can also be simulated. The structural VHDL code naturally simulates more slowly than the sequential VHDL code because the structural VHDL code communicates data values over VHDL signals, which is less efficient for simulation than using only VHDL variables as in the sequential VHDL code case. However, the simulation of a structural VHDL model, which is also synthesizable and is closer to an actual implementation, increases the confidence level in validating by simulation.

We completed a demonstration of a scalable beamforming application in the retargetable VHDL domain. This uses higher-order functions to control the number of sensors. The application also has multiple sample rates. When the code generator is set to generate sequential VHDL, simulations run reasonably quickly.

Just as was already the case with sequential VHDL code, the structural VHDL code now simulates under two modes. The first is as a standalone simulation of a design described entirely in VHDL. Data can be visualized using standard Ptolemy graphing capabilities. The second mode is generating structural VHDL code within Jose Pino's CompileCGSubsystems target, where a portion of a heterogeneous system is implemented in structural VHDL code and the overall system is co-simulated (with, for example, C code on the workstation or DSP56000 code on an s56x card on the workstation bus). In this second case, the Xgraph stars and other nonsynthesizable stars can be moved out of VHDL and into C so that the VHDL subsystem remains synthesizable.

2.3.3 Mercury Raceway architecture

We have outlined the design of a tool for mapping a control and dataflow representation of a hard real-time signal processing application onto a Mercury RACEway multicomputing system, and are continuing development. Low-level programming details would be hidden from the programmer thereby shifting the design focus to performance issues. Graphical visualization and manipulation capabilities will enable study of architectural tradeoffs and optimizations. Tasks can be scheduled and partitioned among the processors either manually or automatically. The tool will handle most of the details involved in generating multiprocessor code, downloading the code to the target system, and initializing the system for execution. Additional capabilities of the tool will allow extension of the default routine library by the programmer and will allow interfacing with other hardware synthesis and codesign tools. The tool will be implemented as a code generation target in Ptolemy.

2.3.4 Guided migration: a retargeting tool

We have developed a preliminary "retargeting tool" to be used to guide migration of Ptolemy-based designs from one implementation technology to another. We have demonstrated an interface for studying differences in block libraries, and shown how it could be used to make the code generators for the Motorola DSP56000 family processors and the Texas Instruments C50 family processors more compatible. We have also developed a program that recursively changes the domain of hierarchical designs. The Ptolemy 0.6 Web page (http://ptolemy.eecs.berkeley.edu/pt0.6dist.html) contains a patch that installs this program into the Ptolemy 0.6 sources.

2.3.5 Automated rearrangement of signal processing systems

We have developed and released a set of heuristic search techniques written in the Mathematica programming language. They implemented breadth-first search, depth-first search, hill climbing, and simulated annealing techniques for applying a set of equivalence relationships to an algebraic expression to minimize implementation cost. One goal is to use the heuristic searches to apply the equivalence relationships in the Signal Processing Packages for Mathematica to optimize the implementations of Ptolemy systems. Both the Heuristic Search Packages and the Signal Processing Packages are available on the Ptolemy Web site.

2.4 Scheduling and code generation

2.4.1 Major improvements in heterogenous scheduling and code generation

We have rethought code generation wormholes in the context of a new hierarchical scheduling framework. The objective is to more effectively mix synthesized software and VHDL models with simulations built in other Ptolemy domains. The problem with the previous version is that it would respect the user-specified hierarchy, thereby making it hard for a system to have multiple Wormholes into the same target. For example, if we have simulation-SDF on the outside, it used to be impossible to have two non-directly connected Wormholes into a single S56XTarget (this is a target that integrates with a SparcStation an S-bus card with a Motorola DSP56000 and a Xilinx FPGA).

The idea is to have the inter-SDF wormholes flattened, just as they are currently done in heterogeneous multiprocessor targets. Then we use the hierarchical multiprocessor scheduler to schedule the system. We treat the entire hierarchy as a multiprocessor CG target. A pair of virtual methods will return a pair of send/receive actors which will usually provide a generic C interface. For optimality, they may also be specifically tuned to support communication within a particular target or between a pair of targets. In the case that this system is being mixed with simulation-SDF, clusters are created around the code generation components. The scheduler attempts to build the largest possible CG clusters that meet the SDF Composition Theorem criteria, a conservative test that ensures that deadlocks are not introduced. For each of these clusters, an SDFStar will be created and dynamically linked in as is currently done with the CGCTargetWH and S56XTargetWH targets.

The default clustering classes do not use wormholes. We believe these are only necessary when interfacing schedulers that were not designed to directly interact. Our new hierarchical scheduler, by contrast, is designed to manage the interface between different schedulers.

2.4.2 Scheduling of dataflow graphs

We have completed a new loop scheduler in Ptolemy that works on acyclic SDF (synchronous dataflow) graphs and constructs single appearance schedules that are optimized for buffer memory consumption. The algorithms and heuristics it implements are all described in our new book: "Software Synthesis from Dataflow Graphs", Kluwer 1996. It invokes both the RPMC and APGAN heuristics and picks the better schedule. The scheduler makes extensive use of the Cluster hierarchy which was only being used in our hierarchical scheduler until now.

In collaboration with Shuvra Bhattacharyya (now of Hitachi America), we have developed a new, general, latency constrained resynchronization (LCR) algorithm for 2 processor systems that is capable of handling delays. This algorithm is provably optimal, and the proof is relatively simple.

2.4.3 Multidimensional multirate signal processing

2.4.3.1 Modelling and semantics

In this work, we have concentrated on developing a dataflow model for expressing multidimensional multirate signal processing systems sampled on arbitrary lattices. A multidimensional signal can be sampled in many different ways. A straightforward extension of one-dimensional sampling results in the so-called rectangular sampling structure, where the samples lie on a rectangular grid. However, a more general sampling structure is a geometrical lattice; sampling lattices that are not rectangular can have many advantages in certain applications. For example, a signal sampled on a non-rectangular lattice can have a lower sampling density than one sampled on an equivalent rectangular lattice. For real-time processing of multidimensional signals, a lower sampling density means fewer samples to process in a given time interval. The standard MDSDF model suffers from the inability to model multidimensional systems sampled on arbitrary sampling lattices; hence, we give an extension of MDSDF that is capable of modeling such systems. The model we give preserves the property of static, compile-time schedulability. However, constructing such schedules requires the solution to some challenging problems. In particular, we show that an augmented set of balance equations have to be solved simultaneously in the extended model. The additional equations are quite different from the usual balance equations in SDF and MDSDF; they involve computing so-called "integer volumes" of parallelepipeds. This computation turns out to be an interesting number-theoretic problem, and we present several approaches for solving it. Finally, we present a practical example of a video sampling structure conversion system to show the usefulness of the generalized MDSDF model.

2.4.3.2 Filter design issues

The design of multidimensional multirate signal processing systems. e.g. systems that change video formats in non-separable ways, often require application-specific design tools. For example, much of the design work in computing system parameters in multidimensional multirate systems can be simplified with a combination of computational geometry, integer matrix algebra, and state-space formulations. In multiple dimensions, rate-changing operations are defined by a change in sampling grids. Sampling grids can be represented as a set of basis vectors, which can be considered as the column vectors that make up a sampling matrix. Mapping one sampling matrix onto another is a linear mapping represented by a rational matrix, called a resampling matrix. We have shown how to design two-dimensional rate changing systems (upsampler, filter, and downsampler in cascade) based on a geometric sketch of the passband to retain. From the sketched region, we use computational geometric techniques to find the minimal enclosing parallelogram using a linear time and linear space algorithm we have developed. We then use the minimal enclosing parallelogram to compute the resampling matrix to perform the sampling conversion using Chen and Vaidyanathan's approach. Then, we factor the resampling matrix into the upsampling and downsampling matrices for the rate changer. The procedure will find the best compression rate based on a parallelogram-shaped passband. The only other admissible geometry is a hexagonal-shaped passband, which will always do at least as well as the parallelogram-shaped passband. Generalizing this approach to multiple channels will enable the graphical design of two-dimensional filter banks and wavelets.

2.5 Ptolemy software

The Ptolemy software serves as both a laboratory for experimentation and a mechanism for disseminating results. During this reporting period, we have made a large number of enhacements to the software, and have completed one major release, numbered 0.6, completed on April 12, 1996. Below is a summary of the major enhancements to the software.

The Ptolemy 0.6 release consists of approximately 3000 files containing 400,000 lines and 9 Mb of source code (compressed). See our World Wide Web server http://ptolemy.eecs.berkeley.edu or finger ptolemy@ptolemy.eecs.berkeley.edu for complete information.

The documentation for Ptolemy0.6 is available in PostScript, HTML and PDF, together with updated summary sheets, answers to frequency asked questions, a quick tour, and a tutorial. We have set up a Usenet read news group called comp.soft-sys.ptolemy. Postings to our mailing list ptolemy-hackers@ptolemy.eecs.berkeley.edu are cross-posted to the comp.soft-sys.ptolemy and visa-versa. Postings to the news group and the e-mail group are searchable from our World Wide Web site.

2.5.1 New Domains

Multidimensional synchronous dataflow, MDSDF.
New functional VHDL domain.
Finite state machines, FSM.
Process networks, PN.
Synchronous reactive, SR.
Code generation for the TI C50 DSP, CGC50.
Extension of Boolean dataflow (BDF) to integer-controlled dataflow (IDF).

2.5.2 Schedulers

Loop scheduler.
Dynamic dataflow scheduler that maintains bounded memory.
Partitioning SDF applications into multiple VHDL hardware modules.
Buffer-optimal loop scheduler for acyclic SDF graphs.
Latency constrained resynchronization (LCR) algorithm for 2 processor systems that is capable of handling delays.

2.5.3 Automatic code generation

Synthesis of parallel code for network of workstations (NOW).
Multi-lingual code generation (C-based wormholes).
Tunable VHDL code generation (sequential or synthesizable).
Synthesis of code for the Ultrasparc visual instruction set (VIS).
Incrementally compiled code generation subsystems.
Real-time CD-quality audio on Ultrasparc workstations.

2.5.4 Visualization

Interfaces to a freely available graph visualization program: printDot - outputs a galaxy hierarchy in dotty format, printClusterDot - outputs the galaxy in it's clustered form.
Tcl/Tk versions of Gantt chart and logic analyzer.
Interface to Tycho (see below).

2.5.5 Ptolemy Infrastructure

Tcl parameter expression parser.
Ptolemy Makefile redesign.
Matlab Tcl interface.
Tcl/Mathematica interface.
First-cut at design-methodology management.
Code generation wormholes.
Portable scheduler file format.
Heterogeneous code generation(VHDL, 56K, C).
C++ documentation generation system.
HTML documentation of Ptolemy stars.
File datatype.
Automatic generation of Ptcl scripts from block diagrams.
New iterator classes for various kernel classes.
Itcl incorporation, an object-oriented extension to Tcl/Tk.
Script for creating custom versions of Ptolemy.

2.5.6 Platforms that Ptolemy runs on

Platforms that we distribute binaries for:

Solaris2.4
HPUX-10.01
SunOS4.1.3

Platforms that Ptolemy has been compiled for:

IBM AIX3.2.5
DEC Alpha OSF/1 V3.2
FreeBSD 2.1-Stable
Irix5.3
Irix6.x
HPUX9.x
Linux Slackware3.0
Solaris2.5
HPUX10.01, HP CC

2.6 Tycho software

Tycho is an object-oriented syntax manager with an underlying heterogeneous technical rationale. It provides a number of editors and graphical widgets in an extensible, reusable framework. The editors for textual syntaxes are modeled after emacs in the sense the emacs key bindings are used whenever possible. However, they make more extensive use of menus, windows, and dialogs than emacs. Also, the intent is that visual editors and visualization tools will be fully integrated, something that would be difficult to accomplish with emacs in its current form. Editors for visual syntaxes will be more diverse. The system documentation is integrated, using a hypertext system compatible with the worldwide web. Tycho was originally conceived for use with the Ptolemy system, a heterogeneous design environment from U.C. Berkeley, but it has grown into a system that is useful on its own. Tycho has been used extensively in the development of the Tycho software itself.

Tycho is written primarily in Itcl, also called [incr Tcl], developed by by Michael McLennan of AT&T. Itcl is an object-oriented extension of Tcl, a "tool command language" written by John Ousterhout of U.C. Berkeley. The window toolkit Tk and its object-oriented extension Itk are also used extensively.

The Tycho0.1 release is the first public release of Tycho as a stand-alone system. It runs under the vanilla Itcl 2.0 and 2.1 with no changes to the executable. It also runs under the Ptolemy system, a design environment distributed by the same group that developed Tycho. Tycho0.1 can be obtained from http://ptolemy.eecs.berkeley.edu/tycho/Tycho.html

The key objectives of the Tycho project are:

To build a genuinely object-oriented user interface, where multiple visual syntaxes can be combined, and application-specific visual syntaxes can be constructed.
To provide an extensible framework for experimentation with visual syntaxes, where mundane tasks such as documentation, font management, color management, and dialogs with the user are built using a shared, common infrastructure.
To extend the non-dogmatic nature of the Ptolemy kernel (which supports multiple semantic models) to the user interface (which will support multiple syntactic models).
To experiment with design visualization, broadening the perspective beyond a schematic or block-diagram perspective of Ptolemy, and exploring new visual and mixed visual/textual syntaxes for design representation and understanding.
To leverage off work in the Tcl/Tk community to get portable (Unix, Microsoft Windows PCs and Apple Macintoshes) code.
To design a sophisticated, extensible, interactive documentation system.

One of the key principles in Tycho is that anything can have a hyperlink to anything else. Documentation will have links to source code, and vice versa. Visual editors will have links to textual editors. And specialized displays can be created for any form of data. These displays, of course, are also connected by hyperlinks.

The interface to Ptolemy kernel will eventually be entirely through ptcl, the Tcl extensions that provide an interpreted command language for the Ptolemy kernel. An interim mechanism is provided where Tycho forms a subsystem within the much older visual editor for Ptolemy called "pigi" (which stands for Ptolemy interactive graphical interface).

2.6.1 Status of Tycho

Tycho 0.1 was released with Ptolemy 0.6. It was still a very preliminary system, with some syntax-sensitive editors for Itcl, HTML, C, C++, and Ptlang (Ptolemy star) files. It has progressed considerably since then, and a future release is anticipated that will include at least the following features:

View/Displayer architecture.
A graphical toolkit that manages composite objects.
Visual editors and displays for various types of graphs.
Gantt charts and related visualization tools.
Textual editors for Java and Esterel.
Interactive shells communicating with Tcl, Matlab, Mathematica.
Graphical editors
Integrated, HTML-based documentation.
Indexes and index browsers.

3. Technology transfer

3.1 Cadence uses Ptolemy in SPW3.5

The policy in the Ptolemy project has been to release software with a very liberal copyright agreement and to widely disseminate the ideas embodied in the software through publications. This strategy has once again proven effective in getting research results quickly into the marketplace in the form of serious, supported products. On October 23, 1995, The Alta Group of Cadence Design Systems announced SPW 3.5, which contains three key technologies from Ptolemy: mixing of discrete-event and dataflow models of computation, and synchronous and dynamic dataflow scheduling technology.

The subtitle of Alta's press release is:

"New SPW* Simulation Technology for Convergence Applications Leverages Berkeley's Ptolemy Project Research"

In the body of the press release:

"The new simulation architecture is based on research from the renowned Ptolemy research project at the University of California at Berkeley. ... [It] utilized the Ptolemy team's results to uniquely implement Ptolemy's advanced simulation algorithms in Alta Group's leading SPW solution."

We expect to continue to work with Cadence and others to ensure that the best results of the project make their way into self-sustaining commercial products. The full press release is available on the Ptolemy Web site.

3.2 Lockheed-Sanders develops architectural tradeoff analysis tool

Lockheed-Martin, Sanders division, has been using Ptolemy to develop tools for architectural evaluation and tradeoff analysis. Their work leverages the SDF and DE domains in Ptolemy, enhanced with their own user interface and visualization tools. Motivated by our interactions with them, we have begun a serious effort to ease the process of building and releasing third-party extensions to Ptolemy. Rather than having an ad hoc series of extensions with incompatible ways of integrating into Ptolemy, we are working on a more uniform solution.

3.3 Other commercial interactions

The following companies are known to be influenced by Ptolemy technology in their commercial activities:

Berkeley Design Technology
Cadence
Hewlett-Packard
Mitsubishi
Motorola
Structured Software Systems
Synopsys
Thomson CSF
White Eagle Systems

3.4 New Book: Software Synthesis from Dataflow Graphs

This book studies the problem of synthesizing software for embedded signal processing systems starting from applications expressed as synchronous dataflow (SDF) graphs. After a comprehensive review of the theory behind SDF, techniques are given to optimize primarily the program memory size and secondarily the data memory size. To accomplish this, SDF graphs describing multirate signal processing applications are scheduled into nested loops. A formal theory for constructing and manipulating these loops is developed, and a class of looping structures, called single appearance schedules, is shown to be the most efficient with respect to code size. The existence of such structures is studied, and algorithms for optimally constructing them are given. Extensive experimental data is presented, demonstrating the efficacy of the techniques.

3.5 Graduate Class

The new graduate class at Berkeley, "Specficication and Modeling of Reactive Real-Time Systems" is based on methods developed in part under this RASSP project. The class is described in more detail above.

3.6 POLIS -- A codesign system based on Ptolemy

The group of Prof. Alberto Sangiovanni-Vincentelli at Berkeley has released a Ptolemy-based co-design environment for control-dominated embedded systems, called POLIS. POLIS offers an integrated interactive environment for specification, co-simulation, formal verification, and synthesis of embedded systems implemented as a mix of hardware and software components. It uses and significantly extends the discrete-event (DE) domain in Ptolemy. See: http://www-cad.eecs.berkeley.edu/Respep/Research/hsc/abstract.html

4. Next Period Activities

The top priority topics that we will be addressing over the next year are:

Wrapping up with the Ptolemy0.7 release, which will be the final Ptolemy release for this project. The 0.7 release may include the following:
New domains: Finite State Machine, Synchronous/Reactive and Texas Instruments C50.
Extensible datatypes.
Integrated documentation system based on Tycho.
Ultrasparc code generation.
Buffer-optimal loop scheduler for acyclic SDF graphs.

5. Acknowledgments

We would like to give special thanks to the following people outside our group who contributed in significant ways to the results reported above:

Neal Becker (Comsat)
Shuvra S. Bhattacharyya (Hitachi)
Joseph T. Buck (Synopsys)
Tom Lane (Structured Software Systems)
Eric Pauer (Lockheed Sanders)
Xavier Warzee (Thomson CSF)
Joseph M. Winograd (Boston University)
Fritz Heinrichmeyer (FernUniverstitat in Hagen, Germany)

6. Publications

Papers

Raza Ahmed and Brian L. Evans, "Optimization of Signal Processing Algorithms", accepted by the IEEE Asilomar Conference on Signals, Systems, and Computers, to be held November 3-5, 1996, in Pacific Grove, CA.
G. Arslan, B. L. Evans, F. A. Sakarya, and J. L. Pino, "Performance Evaluation and Real-Time Implementation of Subspace, Adaptive, and DFT Algorithms for Multi-Tone Detection," Proc. Int. Conf. on Telecommunications, Istanbul, Turkey, April 15-17, 1996.
S. S. Bhattacharyya, P. K. Murthy and E. A. Lee, "Software Synthesis from Dataflow Graphs., Kluwer Academic Publishers, Norwell, Mass, 1996.
S. S. Bhattacharyya, P. K. Murthy, and E. A. Lee, "APGAN and RPMC: Complimentary Heuristics for Translating DSP Block Diagrams into Efficient Software Implementations", to appear in the Design Automation for Embedded Systems Journal.
S. S. Bhattacharyya, S. Sriram, and E. A. Lee, "Latency-Constrained Resynchronization for Multiprocessor DSP Implementation," Proc. ASAP Conference, Chicago, August 19-21, 1996.
S. S. Bhattacharyya, S. Sriram, and E. A. Lee, "Self-Timed Resynchronization: A Post-Optimization for Static Multiprocessor Schedules," Proc. IPPS, April 1996.
S. S. Bhattacharyya, P. K. Murthy, and E. A. Lee, "Optimal Parenthesization of Lexical Orderings for DSP Block Diagrams," in Proc. IEEE Workshop on VLSI Signal Processing, Osaka, Japan, October 16-18, 1995. .
S. S. Bhattacharyya, S. Sriram, and E. A. Lee, "Resynchronization for Embedded Multiprocessors," ERL Technical Report UCB/ERL M95/70, University of California, Berkeley, CA 94720, September, 1995.
W.-T. Chang, S.-H. Ha, and E. A. Lee, "Heterogeneous Simulation -- Mixing Discrete-Event Models with Dataflow," invited paper, RASSP special issue of the Journal on VLSI Signal Processing.
William Chen, H. John Reekie, and Edward A. Lee, "An Assessment of the UltraSparc VIS Architecture for Native Signal Processing in the Ptolemy Environment", accepted by the IEEE Asilomar Conference on Signals, Systems, and Computers, to be held November 3-5, 1996, in Pacific Grove, CA.
K. Chiang, B. L. Evans, W. T. Huang, F. Kovac, E. A. Lee, H. J. Reekie, D. G. Messerschmitt, and S. Sastry, "Real-Time DSP for Sophomores," in Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Atlanta, GA, May 7-10, 1996, vol. 2, pp. 1097-1100.
Michael Goodwin, "A Nonuniform Filterbank for Auditory Modeling," accepted by the IEEE Asilomar Conference on Signals, Systems, and Computers, to be held November 3-5, 1996, in Pacific Grove, CA.
S. Ha and E. A. Lee, "Compile-Time Scheduling of Dynamic Constructs in Dataflow Program Graphs,," to appear in IEEE Trans. on Computers.
A. Kalavade and E. A. Lee, "The Extended Partitioning Problem: Hardware/Software Mapping and Implementation-Bin Selection," Journal of Design Automation of Embedded Systems, special issue on System Partitioning Issues, to appear.
Asaware Kalavade and Pratyush Moghe, "Architecting Local-Host Media Processors for Multiple Applications," accepted by the IEEE Asilomar Conference on Signals, Systems, and Computers, to be held November 3-5, 1996, in Pacific Grove, CA.
A. Kalavade and E. A. Lee, "Complexity Management in System-Level Design," Journal of VLSI Signal Processing, to appear.
Asawaree Kalavade, System Level Codesign of Mixed Hardware-Software Systems, Tech. Report UCB/ERL 95/88, Ph.D. Dissertation, Dept. of EECS, University of California, Berkeley, CA 94720, September, 1995.
Phil Lapsley, Jeff Bier, Amit Shoham, and Edward A. Lee, "DSP Processor Fundamentals: Architectures and Features," Berkeley Design Technology, Inc., Fremont, California, 1996. To be published by the IEEE Press.
P. Lapsley, J. Bier, A. Shoham, and E. A. Lee, DSP Processor Fudamentals -- Architectures and Features, published by Berkeley Design Technology, Inc.,, 39355 California St., Suite 206, Freemont, CA, 1996.
E. A. Lee and A. Sangiovanni-Vincentelli, "Comparing Models of Computation," accepted as an embedded tutorial for ICCAD, 1996.
E. A. Lee and A. Sangiovanni-Vincentelli, "The Tagged Signal Model --A Preliminary Version of a Denotational Framework for Comparing Models of Computation," ERL Memorandum UCB/ERL M96/33, University of California, Berkeley, CA 94720, June 4, 1996.
E. A. Lee, and A.Sangiovanni-Vincentelli, "Comparing Models of Computation," to appear in Proc. of ICCAD, San Jose, CA, Nov. 10-14, 1996.
R. Mani, H. S. Nawab, J. M. Winograd, and B. L. Evans, "Integrated numeric and symbolic signal processing in a heterogeneous design environment," Proc. SPIE Int. Sym. on Advanced Signal Processing Algorithms, Architectures, and Implementations, Denver, CO, August 12-15, 1996.
T. Miyazaki and E. A. Lee, "Code Generation by Using Integer- Controlled Dataflow Graphs," submitted to ICASSP '97.
P. K. Murthy, S. S. Bhattacharyya, and E. A. Lee, "Joint Minimization of Code and Data for Synchronous Dataflow Programs", accepted by the Formal Methods in System Design journal.
P. K. Murthy and E. A. Lee, "An Extension of Multidimensional Synchronous Dataflow to Handle Arbitrary Sampling Lattices," in Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Atlanta, GA, May 7-10, 1996, vol. 6, pp. 3306-3309.
T. M. Parks, Bounded Scheduling of Process Networks, Technical Report UCB/ERL-95-105. PhD Dissertation. EECS Department, University of California. Berkeley, CA 94720, December 1995.
T. M. Parks, J. L. Pino, and E. A. Lee, "A Comparison of Synchronous and Cyclo-Static Dataflow," Proc. IEEE Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, October 29 - November 1, 1995.
J. L. Pino, M. C. Williamson, and E. A. Lee, "Interface Synthesis in Heterogeneous System-Level DSP Design Tools," in Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Atlanta, GA, May 7-10, 1996, vol. 2, pp. 1268-1271.
J. L. Pino, S. S. Bhattacharyya and E. A. Lee, "A Hierarchical Multiprocessor Scheduling System for DSP Applications,"Proc. IEEE Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, October 29 - November 1, 1995.
The Ptolemy Team, "System-Level Design Methodology for Embedded Signal Processors," RASSP Digest Newsletter, July, 1996.
F. Sheikh, "Visualizing Architecture and Algorithm Interaction in Embedded Systems,," MS Thesis, EECS Department, University of California. Berkeley, CA 94720, September 17, 1996.
Sundararajan Sriram, Minimizing Communication and Synchronization Overhead in Multiprocessors for Digital Signal Processing, Tech. Report UCB/ERL 95/90, Ph.D. Dissertation, Dept. of EECS, University of California, Berkeley, CA 94720, November 7, 1995.
S. Sriram and E. A. Lee, "Determining the Order of Processor Transactions in Statically Scheduled Multiprocessors," Journal of VLSI Signal Processing, to appear.
Mike C. Williamson and Edward Lee, "Synthesis of Parallel Hardware Implementations from Synchronous Dataflow Graph Specifications," accepted by the IEEE Asilomar Conference on Signals, Systems, and Computers, to be held November 3-5, 1996, in Pacific Grove, CA.

Last updated 10/05/97. Send comments to www@ptolemy.eecs.berkeley.edu.