Introduction to Synchronous Dataflow
Synchronous Dataflow (SDF) is a model first proposed by Edward A. Lee in 1986.
In SDF, all computation and data communication is scheduled statically.
That is, algorithms expressed as SDF graphs can always be converted into
an implementation that is guaranteed to take finite time to complete all
tasks and use finite memory.
Thus, an SDF graph can be executed over and over again in a periodic
fashion without requiring additional resources as it runs.
This type of operation is well-suited to digital signal processing and
communications systems, which often process an endless supply of data.
An SDF graph consists of nodes and arcs.
Nodes represent operations which are called actors.
Arcs represent data values called tokens, which are stored in
first-in first-out (FIFO) queues.
The word token is used because each data value can represent any
data type (e.g. integer or real) or any data structure (e.g. matrix
or image).
SDF graphs obey the following rules:
- An actor is enabled for execution when enough tokens are available
on all of the inputs, and source actors are always enabled.
- When an actor executes, it always produces and consumes the same
fixed number of tokens.
- The flow of data through the graph may not depend on values of the data.
- Delay is a property of an arc, and a delay of n samples means
that n tokens are initially in the queue of that arc.
Because of the second rule, once an actor finishes executing, the
tokens it consumed are removed from the FIFO queues (circular buffers)
on its input arcs and are not restored.
The consequence of the third rule is that an SDF graph may not contain
data-dependent branching, such as an if-then-else construct, or
data-dependent iteration, such as a for loop whose bounds depend on
data values.
However, the actors themselves may contain these constructs because
the scheduling of an SDF graph is independent of what tasks the
actors perform.
Example
This example is taken from Figure 1.5 of [1].
Consider the feedforward (acyclic) synchronous dataflow graph shown below:
A --20----10--> B --20----10--> C
The notation means that when A executes, it produces 20 tokens.
When B executes, it consumes 10 tokens and produces 20 tokens.
When C executes, it consumes 10 tokens.
The first step in scheduling an SDF graph for execution is that we must
figure out how many times to execute each actor so that all of the
intermediate tokens that are produced get consumed.
This process is known as load balancing.
Load balancing is implemented by an algorithm that is linear in time
and memory in the size of the SDF graph.
The size of an SDF graph, as we will discover in more detail later, is
#nodes + #arcs * (1 + log2 delayPerArc +
log2 inputTokensPerArc +
log2 outputTokensPerArc)
where log2 gives the number of bits used to represent
the integer argument.
In the example SDF graph above, we must
- Fire A 1 time
- Fire B 2 times
- Fire C 4 times
to balance the number of tokens produced and consumed.
However, load balancing does not tell us the order in which to schedule
the firings.
If there were no constraints on the order, then the number of possible
schedules would be combinatorial in the total number of firings (seven
in this case).
If at least one valid schedule exists, then the SDF graph is
called consistent.
The next step is to schedule the firings required by load balancing so
as to resolve the data dependencies.
Due to rate changes in the graph, the total number of firings can be
exponential in the size of the SDF graph, so the worst case for
scheduling is a polynomial function of an exponential function of the
size of the SDF graph.
The polynomial function is shown next for two scheduling algorithms:
- list scheduling - quadratic algorithm
- looped scheduling - cubic algorithm
Two variants on looped schedulers, the complementary
algorithms called pairwise grouping of adjacent nodes [2] and recursive
partitioning based on minimum cuts [2], avoid the exponential penalty.
Instead, they are cubic in the size of the SDF graph.
These scheduling algorithms are discussed in [1] and will be covered
later in the class.
Possible schedules for the above SDF graph are ABCCBCC for the list
scheduler and A (2 B(2 C)) for the looped scheduler.
The generated code to execute the schedule A (2 B(2 C)) would be
the following:
code block for A
for (i = 0; i < 2; i++) {
    code block for B
    for (j = 0; j < 2; j++) {
        code block for C
    }
}
The schedule A (2 B(2 C)) is an example of a single-appearance
schedule since each actor appears only once in the schedule.
When generating code that is "stitched" together, a single-appearance
schedule requires the minimal amount of program memory because the
code for each actor only appears once.
The scheduling algorithms could actually return several different valid
schedules, such as those shown below.
#  Scheduler         Schedule        Buffer memory (tokens)
1  List Scheduler    ABCBCCC         50
2  Looped Scheduler  A (2 B(2 C))    40
3  Looped Scheduler  A(2 B)(4 C)     60
4  Looped Scheduler  A(2 BC)(2 C)    50
The smallest amount of buffer memory possible is 40 tokens, which is
met by schedule #2, so it is optimal in terms of data memory usage.
The list scheduler could also have produced a data-optimal schedule,
ABCCBCC, which is just the expanded version of schedule #2.
Because schedule #2 is a single-appearance schedule, we know that it
is optimal in terms of program memory usage.
References
[1] Shuvra S. Bhattacharyya, Praveen K. Murthy, and Edward A. Lee,
Software Synthesis from Dataflow Graphs, Kluwer Academic Press,
Norwell, MA, ISBN 0-7923-9722-3, 1996.
[2] S. S. Bhattacharyya, P. K. Murthy, and E. A. Lee, "APGAN and
RPMC: Complementary Heuristics for Translating DSP Block Diagrams
into Efficient Software Implementations", Design Automation for
Embedded Systems Journal, to appear.
Updated 02/08/00.