PUBLICATIONS of the DSP DESIGN GROUP and the PTOLEMY PROJECT

Part I: Fundamental Concepts and Unbounded Latency Analysis

Part II: Latency Constrained Resynchronization

Sundararajan Sriram, Texas Instruments, and

Edward A. Lee, U. C. Berkeley

This paper introduces a technique, called resynchronization, for reducing synchronization overhead in embedded multiprocessor implementations. The technique exploits the well-known observation [35] that in a given multiprocessor implementation, certain synchronization operations may be redundant in the sense that their associated sequencing requirements are ensured by other synchronizations in the system. The goal of resynchronization is to introduce new synchronizations in such a way that the number of original synchronizations that become redundant exceeds the number of new synchronizations that are added, and thus the net synchronization cost is reduced.
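To make the notion of redundancy concrete, the following is a minimal sketch (not the paper's algorithm) that treats synchronizations as directed edges and flags an edge as redundant when its endpoint is already reachable through the remaining edges; the paper's full model also accounts for dataflow delays, which are ignored here.

```python
def reachable(edges, start, goal):
    """Depth-first search: is `goal` reachable from `start` via `edges`?"""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(dst for src, dst in edges if src == node)
    return False

def redundant_syncs(sync_edges):
    """Return the sync edges whose ordering is implied by the other edges."""
    redundant = []
    for edge in sync_edges:
        others = [e for e in sync_edges if e != edge]
        if reachable(others, edge[0], edge[1]):
            redundant.append(edge)
    return redundant

# (A, C) is redundant: its ordering is enforced by the path A -> B -> C.
syncs = [("A", "B"), ("B", "C"), ("A", "C")]
print(redundant_syncs(syncs))  # [('A', 'C')]
```

Resynchronization exploits exactly this effect in reverse: a well-chosen new edge can make several existing edges redundant at once, for a net reduction in synchronization cost.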

Our study is based in the context of self-timed execution of iterative dataflow programs. An iterative dataflow program consists of a dataflow representation of the body of a loop that is to be iterated infinitely; dataflow programming in this form has been employed extensively, particularly in the context of software for digital signal processing applications. Self-timed execution refers to a combined compile-time/run-time scheduling strategy in which processors synchronize with one another only based on inter-processor communication requirements, and thus, synchronization of processors at the end of each loop iteration does not generally occur.
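The self-timed discipline can be illustrated with a small sketch, assuming two "processors" modeled as threads that run the loop body repeatedly and synchronize only at their inter-processor communication point (a bounded FIFO); there is deliberately no end-of-iteration barrier. The names and structure are illustrative, not the paper's formal model.

```python
import threading
import queue

NUM_ITERATIONS = 100
channel = queue.Queue(maxsize=4)  # bounded IPC buffer: the only synchronization
results = []

def producer():
    for i in range(NUM_ITERATIONS):
        # Loop-body work for this iteration, then send to the other processor.
        channel.put(i * i)  # blocks only when the buffer is full

def consumer():
    for _ in range(NUM_ITERATIONS):
        results.append(channel.get())  # blocks only when the buffer is empty

threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because the producer may run several iterations ahead of the consumer (up to the buffer bound), the two processors never wait for each other at iteration boundaries, only at the communication operations themselves.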

After reviewing our model for the analysis of synchronization overhead, we define the general form of our resynchronization problem; we show that optimal resynchronization is intractable by establishing a correspondence to the set covering problem; and based on this correspondence, we develop an efficient heuristic for resynchronization. Also, we show that for a broad class of iterative dataflow graphs, optimal resynchronizations can be computed by means of an efficient polynomial-time algorithm. We demonstrate the utility of our resynchronization techniques through a practical example of a music synthesis system.

Introduction

The companion paper [7] introduced the concept of resynchronization, a post-optimization for static multiprocessor schedules in which extraneous synchronization operations are introduced in such a way that the number of original synchronizations that consequently become redundant significantly exceeds the number of additional synchronizations. Redundant synchronizations are synchronization operations whose corresponding sequencing requirements are enforced completely by other synchronizations in the system. The amount of run-time overhead required for synchronization can be reduced significantly by eliminating redundant synchronizations [5, 32].

Thus, effective resynchronization reduces the net synchronization overhead in the implementation of a multiprocessor schedule, and improves the overall throughput. However, since additional serialization is imposed by the new synchronizations, resynchronization can produce a significant increase in latency. The companion paper [7] develops fundamental properties of resynchronization and studies the problem of optimal resynchronization under the assumption that arbitrary increases in latency can be tolerated ("unbounded-latency resynchronization"). Such an assumption is valid, for example, in a wide variety of simulation applications. This paper addresses the problem of computing an optimal resynchronization among all resynchronizations that do not increase the latency beyond a prespecified upper bound. Our study is based in the context of self-timed execution of iterative dataflow programs, which is an implementation model that has been applied extensively for digital signal processing systems.
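The correspondence to set covering mentioned above can be sketched as follows: each candidate new synchronization is modeled as the set of original synchronizations it would make redundant, and the classical greedy set-cover heuristic repeatedly picks the candidate that subsumes the most not-yet-covered originals. This hypothetical sketch illustrates only the correspondence; the paper's actual heuristic must additionally respect the latency bound and the structure of the schedule.

```python
def greedy_resynchronize(original_syncs, candidates):
    """original_syncs: set of original synchronization names.
    candidates: dict mapping each candidate new synchronization to the
    set of original synchronizations it would make redundant.
    Returns the new synchronizations chosen, in greedy order."""
    uncovered = set(original_syncs)
    chosen = []
    while uncovered:
        # Pick the candidate that makes the most uncovered originals redundant.
        best = max(candidates, key=lambda c: len(candidates[c] & uncovered))
        if not candidates[best] & uncovered:
            break  # remaining originals cannot be subsumed; keep them as-is
        chosen.append(best)
        uncovered -= candidates[best]
    return chosen

originals = {"s1", "s2", "s3", "s4"}
candidates = {
    "r1": {"s1", "s2"},
    "r2": {"s2", "s3", "s4"},
    "r3": {"s4"},
}
print(greedy_resynchronize(originals, candidates))  # ['r2', 'r1']
```

In this toy instance, two new synchronizations (r2 then r1) subsume all four originals, for a net reduction of two synchronization operations, which is precisely the kind of trade the resynchronization post-optimization seeks.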

Send comments to Shuvra S. Bhattacharyya at shuvra@halsrl.com.