EE382C Embedded Software Systems - Notes on Cyclo-Static Dataflow

Prof. Brian L. Evans

Introduction

In Synchronous Dataflow, each actor consumes and produces the same number of tokens on each firing. "Cyclo-Static Dataflow generalizes SDF by allowing the number of tokens consumed and produced by an actor to vary from one firing to the next in a cyclic pattern." [1] Over a given period of firings, a CSDF actor is allowed to have a different but static behavior each time it fires. The behavior is periodic, so the graph can still be statically scheduled. CSDF has been implemented in the Graphical Rapid Prototyping Environment (GRAPE) by Prof. Rudy Lauwereins' research group at the Catholic University of Leuven in Belgium. GRAPE was commercialized in 1995 under another name, and unfortunately, freely distributed versions of GRAPE are no longer available.

Consider an actor that downsamples by 3. In SDF, the actor cannot execute until at least 3 tokens are available at the input. In CSDF, we can describe the behavior in three phases. In the first phase, it accepts one token and outputs it. In the second and third phases, it accepts one token but outputs nothing. The SDF and CSDF representations for a downsampler are shown below. The CSDF model can yield smaller buffers on the arcs: on the input arc, the SDF buffer must hold at least three tokens, whereas the CSDF buffer need hold only one.

---->  D  ---->               ---->  D  ---->
      3 1                    [1,1,1]  [1,0,0]

      (a)                           (b)      
Figure 1: Modeling of a downsample-by-3 operation in (a) Synchronous Dataflow and (b) Cyclo-Static Dataflow.
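
The buffer comparison can be checked with a small simulation. The following is a minimal sketch, not taken from [1]: the function name simulate and the policy of delivering input tokens one at a time and firing the actor as soon as its current phase is enabled are assumptions made for illustration.

```python
from collections import deque

def simulate(consume, produce, n_inputs):
    """Simulate one actor whose rates cycle through the given phases.

    consume[k] and produce[k] are the token counts for phase k.
    Assumes every consume[k] >= 1 (true for both downsampler models).
    Returns (output tokens, maximum input-buffer occupancy).
    """
    buffer = deque()
    outputs = []
    max_occupancy = 0
    phase = 0
    for token in range(n_inputs):
        # The upstream source delivers one token at a time.
        buffer.append(token)
        max_occupancy = max(max_occupancy, len(buffer))
        # Fire as soon as the current phase has enough input tokens.
        while len(buffer) >= consume[phase]:
            taken = [buffer.popleft() for _ in range(consume[phase])]
            # A downsampler keeps the first token it takes, if it outputs any.
            outputs.extend(taken[:produce[phase]])
            phase = (phase + 1) % len(consume)
    return outputs, max_occupancy

# (a) SDF: a single phase that consumes 3 and produces 1.
sdf_out, sdf_buf = simulate([3], [1], 9)
# (b) CSDF: three phases with rates [1,1,1] in and [1,0,0] out.
csdf_out, csdf_buf = simulate([1, 1, 1], [1, 0, 0], 9)

print(sdf_out, sdf_buf)
print(csdf_out, csdf_buf)
```

Both models emit the same output stream; only the peak occupancy of the input buffer differs.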

Higher-Order Functions

The operation of dataflow actors can be described by firing rules, which determine when enough data tokens are available on the inputs to enable the actor for execution. When the actor executes, it consumes and produces a finite number of tokens. We can represent this operation by expressing the actor as a function that maps input streams to output streams. Consider a homogeneous actor with one input port and one output port that is represented by a firing function f:

f ( [ x1, x2, x3, ... ] ) = f ( x1 )

The execution of f consumes one token from the infinite input stream [ x1, x2, x3, ... ] and leaves the rest of the infinite stream intact. We produce an infinite output stream by firing the actor repeatedly, which forms a dataflow process. We can represent this functionally with the higher-order function map, using either an iterative definition

map(f) [ x1, x2, x3, ... ] = [ f ( x1 ), f ( x2 ), f ( x3 ), ... ]
or a recursive definition
map(f) [ x1, x2, x3, ... ] = cons( f ( x1 ), map(f) [ x2, x3, ... ] )

where cons(t, s) = [t, s], so cons constructs a stream from a token t and a stream s.
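
As an illustration, both definitions of map can be written with lazy generators, which stand in for infinite streams. This is a minimal sketch: the names map_iterative, map_recursive and square are hypothetical, and the use of Python generators to model streams is an assumption, not something the notes or [1] prescribe.

```python
from itertools import count, islice

def map_iterative(f, stream):
    """Iterative definition: fire f once per input token."""
    for x in stream:
        yield f(x)

def map_recursive(f, stream):
    """Recursive definition: cons(f(x1), map(f)(rest of the stream)).

    `stream` must be an iterator. The nesting depth grows by one per
    token, so this form is only suitable for short demonstrations.
    """
    try:
        head = next(stream)      # consume one token, leave the rest intact
    except StopIteration:
        return                   # a finite stream has been exhausted
    yield f(head)                          # the token t in cons(t, s)
    yield from map_recursive(f, stream)    # the stream s in cons(t, s)

def square(x):
    return x * x

# count(1) models the infinite input stream [1, 2, 3, ...].
first_iterative = list(islice(map_iterative(square, count(1)), 5))
first_recursive = list(islice(map_recursive(square, count(1)), 5))
print(first_iterative)
print(first_recursive)
```

Because the generators are lazy, only the tokens actually requested are ever computed, which mirrors firing the actor once per output token.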

Composition of Dataflow Graphs

The lecture on Multiprocessor Scheduling of SDF graphs shows that SDF graphs do not compose in general. The lecture describes an SDF Composition Theorem that gives the conditions under which one can cluster SDF graphs. Once we cluster an SDF graph, we have to define a firing of the new actor (supernode). For SDF, we adjust the repetitions vector for the supernode to maintain load balance.

Process Networks can be arbitrarily composed. When dataflow actors, such as SDF actors, are composed, we must define what it means to fire the composition (supernode). If we have a homogeneous acyclic graph in which the output of node A is connected to B, then there is only one way to fire the cluster of A and B, which is A followed by B. This unique firing order exists because the graph has only one topological sort; such a graph is called "well-ordered". The dataflow graph in Figure 3 of [1], however, is not well-ordered because ABC and ACB are both valid sequential schedules, and A followed by a simultaneous execution of B and C is also valid.
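
One way to test whether an acyclic homogeneous graph is well-ordered is to enumerate its topological sorts and check that exactly one exists. The sketch below does this by brute force. The function name topological_sorts is hypothetical, and the edge list for the three-node graph (A feeding both B and C) is an assumption reconstructed from the schedules ABC and ACB named in the text, since Figure 3 of [1] is not reproduced here.

```python
def topological_sorts(nodes, edges):
    """Return every topological sort of an acyclic graph as a string."""
    preds = {n: set() for n in nodes}
    for src, dst in edges:
        preds[dst].add(src)

    def extend(prefix, remaining):
        if not remaining:
            yield "".join(prefix)
            return
        for n in sorted(remaining):
            # A node may fire only after all of its predecessors have fired.
            if preds[n] <= set(prefix):
                yield from extend(prefix + [n], remaining - {n})

    return list(extend([], set(nodes)))

# A -> B: a single topological sort, so the graph is well-ordered.
chain = topological_sorts("AB", [("A", "B")])

# Assumed shape of the non-well-ordered example: A -> B and A -> C.
fork = topological_sorts("ABC", [("A", "B"), ("A", "C")])

print(chain, len(chain) == 1)
print(fork, len(fork) == 1)
```

Enumeration is exponential in the worst case, so this is only meant for graphs as small as the ones discussed here.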

We can utilize CSDF semantics to impose a unique order on the execution of a cluster which avoids the possibility of deadlock. Consider Figure 4 in [1]. Figure 4a shows an acyclic SDF graph. It is well-ordered: only one topological sort is possible, ACB. If we cluster nodes A and B and maintain SDF semantics, then we introduce deadlock, as shown in Figure 4b. However, if we apply CSDF semantics to maintain the unique topological sort of ACB, then we avoid deadlock and preserve the behavior of the cluster, as shown in Figure 4c.

Cyclo-Static Dataflow Scheduling

Section 2 of [1] reviews the definition of a topology matrix and the use of a topology matrix in computing the repetitions vector for an SDF graph. We can apply the same concepts to CSDF graphs. A CSDF topology matrix is a matrix whose elements can be either scalars or vectors. For the example in Figure 6 of [1], we label its two actors, C and D, as vertices V1 and V2 to construct the vector topology matrix:
     -               - 
_   |   -1      1     |
G = | [1, 0]  [-1, 0] |
    | [0, 1]  [0, -1] |
     -               - 
This representation of a vector topology matrix is not particularly useful. If we flatten the topology matrix by duplicating the scalar elements an appropriate number of times, so that each actor contributes one column per phase, then we obtain a 3 x 4 matrix of rank 2. Because the matrix has 4 columns and rank 2, the dimension of its null space is 4 - 2 = 2. Therefore, there would be two linearly independent solutions for the repetitions vector, which is not allowed: the repetitions vector must be unique up to scaling.
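
The rank argument can be checked numerically. The sketch below is illustrative only: the flattened matrix is an assumed reading of the flattening step (each scalar entry repeated once per phase, so each actor contributes two columns), and matrix_rank is a hypothetical helper that uses exact rational Gaussian elimination.

```python
from fractions import Fraction

def matrix_rank(matrix):
    """Rank of an integer matrix via exact Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0  # number of pivots found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

# Assumed flattening: columns are (V1 phase 1, V1 phase 2, V2 phase 1, V2 phase 2).
flattened = [
    [-1, -1,  1,  1],   # scalar entries -1 and 1, each repeated per phase
    [ 1,  0, -1,  0],   # [1, 0] and [-1, 0]
    [ 0,  1,  0, -1],   # [0, 1] and [0, -1]
]

rank = matrix_rank(flattened)
nullity = len(flattened[0]) - rank
print(rank, nullity)

# Two linearly independent vectors that the flattened matrix maps to zero.
candidates = [[1, 0, 1, 0], [0, 1, 0, 1]]
in_null_space = [
    all(sum(a * b for a, b in zip(row, v)) == 0 for row in flattened)
    for v in candidates
]
print(in_null_space)
```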

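As an alternative, we model the CSDF graph as an SDF graph in which one firing of each actor corresponds to one complete cycle of its phases, and solve for the repetitions vector [1]. Following the procedure in [1], where P holds the period (number of phases) of each entry of the vector topology matrix: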

     -      - 
    |  1  1  |
P = |  2  2  |
    |  2  2  |
     -      - 

P1 = lcm(P*1) = lcm(1, 2, 2) = 2
P2 = lcm(P*2) = lcm(1, 2, 2) = 2

where P*j denotes column j of P.

     -       - 
    |  2  -2  |
G = | -1   1  |
    | -1   1  |
     -       - 

G r = 0
The smallest positive integer solution for r for the augmented topology matrix G is r = [1 1]^T. This repetitions vector indicates the number of cycles of each actor to invoke in one periodic schedule. The true repetitions vector is [r1 P1  r2 P2]^T, which is [2 2]^T. Hence, in each period of the schedule for Figure 6 in [1], each actor is fired twice. The only admissible schedule is DCDC.
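
The arithmetic above can be reproduced in a few lines. This is a minimal sketch under stated assumptions: P and G are copied from the matrices above, while the helpers lcm, column_lcms and smallest_repetitions are hypothetical names, and the rate-propagation method assumes a connected graph in which every arc joins exactly two actors.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

def column_lcms(P):
    """Pj = lcm of column j of the period matrix P."""
    return [reduce(lcm, column) for column in zip(*P)]

def smallest_repetitions(G):
    """Smallest positive integer r with G r = 0 (connected graph assumed)."""
    n = len(G[0])
    r = [None] * n
    r[0] = Fraction(1)
    changed = True
    while changed:
        changed = False
        for row in G:
            nonzero = [j for j, entry in enumerate(row) if entry != 0]
            if len(nonzero) != 2:
                continue
            a, b = nonzero
            if r[a] is None and r[b] is not None:
                a, b = b, a
            if r[a] is not None and r[b] is None:
                # Balance the arc: row[a] * r[a] + row[b] * r[b] = 0.
                r[b] = -r[a] * row[a] / row[b]
                changed = True
    if any(x is None for x in r):
        raise ValueError("graph is not connected")
    scale = reduce(lcm, (x.denominator for x in r), 1)
    ints = [int(x * scale) for x in r]
    common = reduce(gcd, ints)
    ints = [x // common for x in ints]
    # Verify that every arc is balanced.
    for row in G:
        if sum(entry * x for entry, x in zip(row, ints)) != 0:
            raise ValueError("sample rates are inconsistent")
    return ints

P = [[1, 1], [2, 2], [2, 2]]
G = [[2, -2], [-1, 1], [-1, 1]]

periods = column_lcms(P)              # [P1, P2]
cycles = smallest_repetitions(G)      # r, in cycles per schedule period
firings = [r_j * p_j for r_j, p_j in zip(cycles, periods)]
print(periods, cycles, firings)
```

Summing the phase entries of the vector topology matrix over one cycle gives rows with the opposite sign from G as printed. Negating a row does not change the null space, so r is the same either way.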

References

  1. Thomas M. Parks, Jose Luis Pino, and Edward A. Lee, "A Comparison of Synchronous and Cyclo-Static Dataflow", Proc. IEEE Asilomar Conference on Signals, Systems, and Computers, Nov. 1995.


Updated 04/26/04.