EE382C Embedded Software Systems - Notes on Cyclo-Static Dataflow
Prof. Brian L. Evans
Introduction
In Synchronous Dataflow, each actor consumes and produces the same
number of tokens on each firing.
"Cyclo-Static Dataflow generalizes SDF by allowing the number of tokens
consumed and produced by an actor to vary from one firing to the next
in a cyclic pattern." [1]
Over a given period of firings, a CSDF actor is allowed to have a different
but static behavior each time it fires.
The behavior is periodic, so the graph can still be statically
scheduled.
CSDF has been implemented in the Graphical Rapid Prototyping
Environment (GRAPE) by Prof. Rudy Lauwereins' research group at
the Catholic University of Leuven in Belgium.
GRAPE was commercialized in 1995 under another name, and unfortunately,
freely distributed versions of GRAPE are no longer available.
Consider an actor that downsamples by 3.
In SDF, the actor cannot execute until at least 3 tokens are available
at the input.
In CSDF, we can describe the behavior in three phases.
On the first phase, it accepts one token and outputs it.
On the second and third phases, it accepts one token but outputs
nothing.
The SDF and CSDF representations for a downsampler are shown below.
The CSDF representation can yield much smaller buffers on the arcs.
On the input arc, the SDF buffer must hold at least three tokens, whereas
the CSDF buffer needs to hold only one.
   ----> D ---->            ----> D ---->
      3     1            [1,1,1]   [1,0,0]
        (a)                      (b)
Figure 1: Modeling of a downsample-by-3 operation in
(a) Synchronous Dataflow and (b) Cyclo-Static Dataflow.
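The buffer comparison can be checked with a small simulation. The sketch below is illustrative only: the function name run_downsampler and the assumption that the upstream actor writes one token at a time are not from the notes. It fires the downsampler as soon as its current phase is enabled and records the peak occupancy of the input arc.

```python
from collections import deque

def run_downsampler(samples, consume, produce):
    """Simulate a downsampler given per-phase consumption/production rates.

    consume and produce are lists of equal length (one entry per phase);
    an SDF actor is the special case of a single phase.
    Assumption: every phase consumes at least one token, and a producing
    phase outputs the first token it consumed.
    Returns (outputs, peak number of tokens held on the input arc).
    """
    buf = deque()
    outputs = []
    peak = 0
    phase = 0
    for s in samples:
        buf.append(s)                      # upstream writes one token
        peak = max(peak, len(buf))
        while len(buf) >= consume[phase]:  # firing rule for this phase
            taken = [buf.popleft() for _ in range(consume[phase])]
            outputs.extend(taken[:produce[phase]])
            phase = (phase + 1) % len(consume)
    return outputs, peak

# Figure 1(a): SDF, consume 3 and produce 1 on every firing
sdf_out, sdf_peak = run_downsampler(range(9), [3], [1])
# Figure 1(b): CSDF, consume [1,1,1] and produce [1,0,0] over three phases
csdf_out, csdf_peak = run_downsampler(range(9), [1, 1, 1], [1, 0, 0])
```

Both runs produce the same output stream, but the SDF version needs room for three tokens on the input arc while the CSDF version needs room for one.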
Higher-Order Functions
The operation of dataflow actors can be described by firing rules
which determine when enough data tokens are available on the inputs
to enable the actor for execution.
When the actor executes, it consumes and produces a finite number
of tokens.
We can represent this operation by expressing the actor as a
function that maps input streams to output streams.
Consider a homogeneous actor with one input port and one output port
that is represented by a firing function f:
f( [ x1, x2, x3, ... ] ) = f( x1 )
The execution of f consumes one token from the infinite
input stream [ x1, x2, x3, ... ] and leaves the rest of the infinite
stream intact.
We produce an infinite output stream by firing the actor repeatedly,
which forms a dataflow process.
We can represent this functionally using higher-order functions,
using either an iterative definition
map(f) [ x1, x2, x3, ... ] = [ f( x1 ), f( x2 ), f( x3 ), ... ]
or a recursive definition
map(f) [ x1, x2, x3, ... ] = cons( f( x1 ), map(f) [ x2, x3, ... ] )
where cons(t, s) = [t, s], so cons
constructs a stream from a token t and a stream s.
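As a rough illustration, both definitions of map can be written with Python generators standing in for infinite streams. The names map_iterative and map_recursive are made up for this sketch; cons follows the definition above.

```python
from itertools import count, islice

def map_iterative(f, stream):
    """Iterative definition: fire f once per input token."""
    for x in stream:
        yield f(x)

def cons(t, s):
    """Construct a stream from a token t followed by a stream s."""
    yield t
    yield from s

def map_recursive(f, stream):
    """Recursive definition: cons(f(x1), map(f) applied to the rest)."""
    stream = iter(stream)
    try:
        x1 = next(stream)          # consume one token, leave the rest intact
    except StopIteration:
        return                     # finite stream exhausted (not needed for infinite streams)
    # Note: each output token adds one level of generator nesting,
    # so this form is only suitable for short demonstrations.
    yield from cons(f(x1), map_recursive(f, stream))

def square(x):
    return x * x

first_iter = list(islice(map_iterative(square, count(1)), 5))
first_rec = list(islice(map_recursive(square, count(1)), 5))
```

Here count(1) plays the role of the infinite input stream [ 1, 2, 3, ... ], and islice takes a finite prefix of the output stream so the two definitions can be compared.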
Composition of Dataflow Graphs
The lecture on Multiprocessor Scheduling of SDF graphs shows that SDF
graphs do not compose in general.
The lecture describes an SDF Composition Theorem that gives the
conditions under which one can cluster SDF graphs.
Once we cluster an SDF graph, we have to define a firing of the new
actor (supernode).
For SDF, we adjust the repetitions vector for the supernode to
maintain load balance.
Process Networks can be arbitrarily composed.
When dataflow actors, such as SDF actors, are composed, we must define
what it means to fire the composition (supernode).
If we have a homogeneous acyclic graph in which the output of node A
is connected to B, then there is only one way to fire the cluster
of A and B, which would be A followed by B.
This firing order is unique because the graph has only one topological
sort; such a graph is called "well-ordered".
The dataflow graph in Figure 3 of [1], however, is not well-ordered
because ABC and ACB are valid sequential schedules and A followed by a
simultaneous execution of B and C is also valid.
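One way to test whether a small acyclic graph is well-ordered is to enumerate its topological sorts by brute force. The sketch below assumes, as a guess consistent with the schedules named above, that the graph in Figure 3 of [1] has arcs from A to B and from A to C; the figure itself is not reproduced in these notes, and the function names are hypothetical.

```python
from itertools import permutations

def topological_sorts(nodes, arcs):
    """Return every ordering of nodes in which each arc (u, v) has u before v.

    Brute force over all permutations, so only suitable for tiny graphs.
    """
    sorts = []
    for order in permutations(nodes):
        pos = {n: i for i, n in enumerate(order)}
        if all(pos[u] < pos[v] for u, v in arcs):
            sorts.append("".join(order))
    return sorts

def is_well_ordered(nodes, arcs):
    """A graph is well-ordered when it has exactly one topological sort."""
    return len(topological_sorts(nodes, arcs)) == 1

# Homogeneous chain A -> B: a single valid order
chain_sorts = topological_sorts("AB", [("A", "B")])
# Assumed shape of Figure 3 in [1]: A feeds both B and C
fork_arcs = [("A", "B"), ("A", "C")]
fork_sorts = topological_sorts("ABC", fork_arcs)
```

The chain yields only AB, so it is well-ordered. The assumed fork yields both ABC and ACB, so it is not.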
We can utilize CSDF semantics to impose a unique order on the execution
of a cluster which avoids the possibility of deadlock.
Consider Figure 4 in [1].
Figure 4a shows an acyclic SDF graph.
It is well-ordered: only one topological sort is possible, ACB.
If we cluster nodes A and B and maintain SDF semantics, then we
introduce deadlock, as shown in Figure 4b.
However, if we apply CSDF semantics to maintain the unique topological
sort of ACB, then we avoid deadlock and preserve the behavior of
the cluster, as shown in Figure 4c.
Cyclo-Static Dataflow Scheduling
Section 2 of [1] reviews the definition of a topology matrix and the
use of a topology matrix in computing the repetitions vector for an
SDF graph.
We can apply the same concepts to CSDF graphs.
A CSDF topology matrix is actually a matrix whose elements can be
either scalars or vectors.
For the example in Figure 6 of [1], we label actor C as vertex
V1 and actor D as vertex V2 to construct
the vector topology matrix:
       -                    -
  _    |   -1         1     |
  G =  | [1, 0]    [-1, 0]  |
       | [0, 1]    [0, -1]  |
       -                    -
This representation of a vector topology matrix is not particularly
useful.
If we flatten the topology matrix by duplicating the scalar elements an
appropriate number of times (here, twice, to match the two-phase vectors),
then we obtain a 3 x 4 matrix of rank 2.
With four columns and rank 2, the dimension of the null space is 4 - 2 = 2.
Therefore, there would be two linearly independent solutions for the
repetitions vector, which is not allowed.
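The rank argument can be checked numerically. The sketch below assumes a particular flattening: each vector entry contributes one column per phase, and each scalar entry is repeated once per phase, giving a matrix with three rows and four columns. The helper names flatten and rank are illustrative, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction

def flatten(vector_matrix, period):
    """Expand scalars to `period` copies and splice vector entries in place."""
    flat = []
    for row in vector_matrix:
        flat_row = []
        for entry in row:
            if isinstance(entry, int):
                flat_row.extend([entry] * period)   # duplicate the scalar
            else:
                flat_row.extend(entry)              # one column per phase
        flat.append(flat_row)
    return flat

def rank(matrix):
    """Rank by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# Vector topology matrix from the text
vector_G = [[-1, 1],
            [[1, 0], [-1, 0]],
            [[0, 1], [0, -1]]]

flat_G = flatten(vector_G, period=2)
flat_rank = rank(flat_G)
null_dim = len(flat_G[0]) - flat_rank   # rank-nullity theorem
```

Under this flattening the first row equals the negated sum of the other two, which is why the rank drops to 2 and the null space has dimension 2.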
As an alternative, we model the CSDF as an SDF graph, and
solve for the repetitions vector [1].
Following the procedure in [1]:
      -       -
      |  1  1 |
  P = |  2  2 |
      |  2  2 |
      -       -
P1 = lcm(P*1) = lcm(1, 2, 2) = 2
P2 = lcm(P*2) = lcm(1, 2, 2) = 2
where P*j denotes column j of P.
      -         -
      |  2  -2  |
  G = | -1   1  |
      | -1   1  |
      -         -
G r = 0
The solution for r for the augmented topology matrix G
is r = [1 1]^T.
This repetitions vector indicates the number of cycles of each
actor to invoke in one periodic schedule.
The true repetitions vector is [r1 P1  r2 P2]^T, which is [2 2]^T.
Hence, in each period of the schedule for Figure 6 in [1], each
actor is fired twice.
The only admissible schedule is DCDC.
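The arithmetic above can be reproduced with a short script. This is a sketch rather than the procedure of [1] verbatim: it takes P and the augmented G exactly as given in these notes, computes each actor's period as the lcm of its column of P, solves the balance equations G r = 0 by propagating along arcs, and scales the result. All function names are hypothetical.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

def column_periods(P):
    """Period of each actor: lcm over its column of P."""
    return [reduce(lcm, col) for col in zip(*P)]

def repetitions(G):
    """Smallest positive integer r with G r = 0, for a connected graph
    in which every row (arc) has exactly two nonzero entries."""
    n = len(G[0])
    r = [None] * n
    r[0] = Fraction(1)
    changed = True
    while changed:
        changed = False
        for row in G:
            cols = [j for j, g in enumerate(row) if g != 0]
            if len(cols) != 2:
                continue
            i, j = cols
            if r[i] is None and r[j] is not None:
                i, j = j, i
            if r[i] is not None and r[j] is None:
                # balance equation: row[i]*r[i] + row[j]*r[j] = 0
                r[j] = -Fraction(row[i]) * r[i] / row[j]
                changed = True
    if any(x is None for x in r):
        raise ValueError("graph is not connected")
    if any(sum(g * x for g, x in zip(row, r)) != 0 for row in G):
        raise ValueError("balance equations are inconsistent")
    scale = reduce(lcm, (x.denominator for x in r))
    ints = [int(x * scale) for x in r]
    common = reduce(gcd, ints)
    return [x // common for x in ints]

P = [[1, 1], [2, 2], [2, 2]]        # as given in the text
G = [[2, -2], [-1, 1], [-1, 1]]     # augmented topology matrix from the text

periods = column_periods(P)                          # [P1, P2]
cycles = repetitions(G)                              # r
firings = [r * p for r, p in zip(cycles, periods)]   # [r1 P1, r2 P2]
```

With these inputs the script gives one cycle per actor and two firings per actor per schedule period, matching the values derived above. It does not construct the schedule itself, since that depends on the initial tokens shown in Figure 6 of [1].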
References
[1] Thomas M. Parks, Jose Luis Pino, and Edward A. Lee,
"A Comparison of Synchronous and Cyclo-Static Dataflow",
Proc. IEEE Asilomar Conference on Signals, Systems, and Computers,
Nov. 1995.
Updated 04/26/04.