Electronic System Design Laboratory
Royal Institute of Technology,
Stockholm, Sweden
In the paper we will:
- characterize CMISTs from the high-level synthesis viewpoint,
- describe memory access optimization and scheduling, which directly
affect the size and performance of the system to be synthesized,
- present a controller synthesis algorithm which extracts an explicit
controller, in contrast to the implicit controller extracted by most
high-level synthesis systems,
- present the results of applying the synthesis methodology to
various designs and compare them with results obtained by manual
design or with commercially available HLS tools.
The principal strategy for area optimization in most HLS tools is the reuse of RTL components. Among the RTL components, the optimization of arithmetic functional units seems to be the primary concern. Since there are very few arithmetic operations in CMISTs, this strategy does not work very well for them.
If memories are separated at an early stage of the synthesis, specialized logic-level synthesis tools can be used to obtain a better implementation of the different parts of the design. By changing the width of the memory words and rescheduling the memory accesses, the synthesis results can be made faster or smaller, depending on the requirements.
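The effect of the memory word width on the number of accesses can be illustrated with a minimal sketch. The cost model below is an assumption made for illustration only, not the optimization described in the paper: it supposes that the fields are packed consecutively in memory and that one access reads exactly one word. The function name `access_count` and the example field widths are hypothetical.

```python
import math


def access_count(field_bits, word_width):
    """Number of word reads needed to fetch all fields.

    Assumed model: fields are packed back to back, and each access
    transfers exactly one word of `word_width` bits.
    """
    total_bits = sum(field_bits)
    return math.ceil(total_bits / word_width)


# Hypothetical record with four fields (widths in bits).
fields = [8, 8, 16, 32]

# A wider word needs fewer accesses (faster) at the cost of a wider
# data path; a narrower word needs more accesses.
accesses_by_width = {w: access_count(fields, w) for w in (8, 16, 32)}
```

Under this model, rescheduling then amounts to deciding in which control steps the resulting accesses are placed.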
Our main concern is to extract the controller part in a manner that allows the best use of the back-end tools dedicated to FSM optimization. The proposed controller synthesis algorithm takes into account the potential parallelism of different parts of the algorithm and makes it possible to synthesize a hierarchical network of parallel FSMs. The activation order of the FSMs is defined by their master-slave relationship: a master activates its slaves and continues executing its own algorithm until the point where it must wait for the end signal from a slave. This approach guarantees parallel execution of independent parts of the algorithm, as well as the required synchronization at certain points of the algorithm.
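The master-slave activation scheme can be sketched as a small cycle-based simulation. This is only an illustration of the behaviour described above, not the synthesis algorithm itself; the class `SlaveFSM`, the function `run_master`, the operation names `start`, `work` and `wait`, and the latencies are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SlaveFSM:
    """A slave controller that runs for a fixed number of cycles (assumed)."""
    latency: int
    remaining: int = 0
    running: bool = False

    def start(self):
        self.remaining = self.latency
        self.running = True

    def step(self):
        """Advance one clock cycle; the slave stops when its count reaches 0."""
        if self.running:
            self.remaining -= 1
            if self.remaining == 0:
                self.running = False

    @property
    def done(self):
        """Models the end signal seen by the master."""
        return not self.running


def run_master(program, slaves):
    """Execute the master's program and return the number of cycles used.

    Each master state takes one cycle. 'start' activates a slave, 'work'
    is the master's own computation, and 'wait' holds the master in the
    same state until the named slave signals that it has finished.
    """
    pc = 0
    cycle = 0
    while pc < len(program):
        op, arg = program[pc]
        if op == "start":
            slaves[arg].start()
            pc += 1
        elif op == "work":
            pc += 1
        elif op == "wait":
            if slaves[arg].done:
                pc += 1
        else:
            raise ValueError(f"unknown operation: {op}")
        for slave in slaves.values():
            slave.step()  # slaves run in parallel with the master
        cycle += 1
    return cycle


program = [("start", "s0"), ("work", None), ("work", None),
           ("wait", "s0"), ("work", None)]
cycles_short = run_master(program, {"s0": SlaveFSM(latency=3)})
cycles_long = run_master(program, {"s0": SlaveFSM(latency=5)})
```

With the short slave the master never stalls, because the slave finishes while the master is doing its own work; with the longer slave the master spends extra cycles in the `wait` state, which is the synchronization point.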
The extracted controllers can later be subjected to composition or decomposition of FSMs to achieve smaller area, shorter synthesis times, better testability, etc. By merging the master controller with its slaves one can obtain smaller and faster designs, but because of the rapid increase in complexity, synthesizability and testability become worse.
The experiments with industrial examples show that the proposed
approach can give much better results than the commercial HLS
tools: designs about half the size at equal performance, and in some
cases even comparable with manual designs. For instance, a sub-block
of an ASIC that is part of a GSM base station had the following gate counts:
- manual design: 8000,
- CMIST-oriented tool: 13058,
- commercial HLS tool no. 1: 26566, and
- commercial HLS tool no. 2: 29977.
The overall timing characteristics were comparable because all three HLS tools were given timing constraints and asked to optimize area.
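The size ratios implied by the gate counts above can be checked with a few lines of arithmetic. The snippet uses only the four figures quoted in the text; the dictionary keys and variable names are chosen here for illustration.

```python
# Gate counts quoted above for the GSM base station sub-block.
gate_counts = {
    "manual design": 8000,
    "CMIST-oriented tool": 13058,
    "commercial HLS tool no. 1": 26566,
    "commercial HLS tool no. 2": 29977,
}

# Express every result relative to the CMIST-oriented tool.
reference = gate_counts["CMIST-oriented tool"]
ratios = {name: round(count / reference, 2)
          for name, count in gate_counts.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

The two commercial tools come out at roughly 2.03 and 2.30 times the gate count of the CMIST-oriented tool, while the manual design is at about 0.61 times.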