

## Multicycle-Path Challenges in Multi-Synchronous Systems

IC Design Research Laboratory

G. Engel<sup>1</sup>, J. Ziebold<sup>1</sup>, J. Cox<sup>2</sup>, T. Chaney<sup>2</sup>, M. Burke<sup>2</sup>, and Mike Gulotta<sup>3</sup>

<sup>1</sup>Department of Electrical and Computer Engineering, IC Design Research Laboratory, Southern Illinois University Edwardsville, IL, USA, 62026-1801

<sup>2</sup>Blendics, Inc., 10176 Corporate Square Drive, St. Louis, MO 63132-2924

<sup>3</sup>Independent FPGA Consultant

ICCAD 2012 (MSCAS Workshop)

San Jose, CA

November 8, 2012



## **Presentation Overview**

- Background
- Description of DANI
- DANI for FPGA applications
- DANI for ASIC SoCs
- Summary





## The Problem

In a recent survey conducted by Real Intent at the 2012 Design Automation Conference (DAC), 315 attendees were asked:

"How many clock domains do you expect it (i.e., your next design) will have?"

Here were the responses:

#### 1 in 3 of all new designs are expected to possess more than 50 clock domains!





The IC Design Research Laboratory at Southern Illinois University Edwardsville (SIUE) has been working closely with Blendics, Inc. over the past several years.

This collaboration has produced an ANoC innovation called **DANI**.

#### **Delay-tolerant Asynchronous Network Interface**



#### <u>DANI</u>

#### Cores appear asynchronous to one another!





### **DANI** Source





#### **DANI** Destination





### **DANI** in FPGA Designs



A data bus route, taken from a design supplied to us by a potential customer, was floor-planned in a Xilinx Virtex LX155 FPGA. The route was stretched to 10 ns to emulate the delays expected in larger and more modern chips. The bus was 256 bits wide.





- The initial design, using a single-cycle synchronous interconnect, was limited to 100 MHz. Both the design employing *DANI* and a comparable version with a single pipeline register operated at the specified 175 MHz rate.
- Moreover, the design employing *DANI* was capable of operating at 300 MHz, while the pipelined version reached 300 MHz only when a second pipeline register was added.
- *DANI* latency was measured at 2 clock cycles, while library asynchronous FIFOs (Xilinx and Altera) displayed latencies from 4 to 8 clock cycles.
- The design utilizing *DANI* required <u>30% less</u> time to build (MAP, PAR) than the single-pipeline-register version of the design.
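The register counts reported above for the pipelined version can be checked with a back-of-the-envelope calculation. The sketch below is an idealized model, not the authors' timing analysis: it assumes the 10 ns route is split into equal segments and ignores clock-to-Q delay, setup time, and clock skew. The function name `pipeline_registers_needed` is hypothetical.

```python
def pipeline_registers_needed(route_delay_ns: int, clock_mhz: int) -> int:
    """Minimum pipeline registers so that each route segment fits in one clock period.

    Idealized assumption: segments are equal in length and register overhead
    (clock-to-Q, setup, skew) is ignored.
    """
    if route_delay_ns <= 0 or clock_mhz <= 0:
        raise ValueError("route delay and clock frequency must be positive")
    # route_delay / period = route_delay_ns * clock_mhz / 1000.
    # Integer ceiling division avoids floating-point rounding at exact multiples.
    segments = -(-(route_delay_ns * clock_mhz) // 1000)
    return segments - 1


if __name__ == "__main__":
    for mhz in (100, 175, 300):
        print(mhz, "MHz:", pipeline_registers_needed(10, mhz), "register(s)")
    # 100 MHz: 0 register(s)
    # 175 MHz: 1 register(s)
    # 300 MHz: 2 register(s)
```

Under these assumptions the model agrees with the reported behavior: no register at 100 MHz, one at 175 MHz, and a second one at 300 MHz. A real design would need some margin on top of this for register overhead.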



## DANI in ASIC Designs

Currently exploring the use of *DANI* in SoC designs.

- Attempting to implement DANI using only standard cells and currently available clocked-CAD design tools/flows.
- Using Cadence's RTL Compiler<sup>®</sup> and Silicon Encounter<sup>®</sup> tools along with a generic 45 nm open-source standard cell library from Nangate.
- Using traditional SDCs augmented by a collection of Tcl scripts.
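As an illustration of how SDC constraints might be generated by script, the sketch below emits one `set_max_delay` / `set_min_delay` pair per bus bit so that every bit's delay falls inside a common window. This is an assumption about one possible approach, not the authors' actual scripts (which are written in Tcl and are not shown in the slides). The function name, the instance names `u_src/data_reg` and `u_dst/data_reg`, and the delay values are all hypothetical, and how a given tool honors these constraints during optimization is tool-dependent.

```python
def bus_skew_constraints(src_reg: str, dst_reg: str, width: int,
                         max_delay_ns: float, skew_ns: float) -> list[str]:
    """Return SDC lines bounding each bit's delay to [max_delay - skew, max_delay]."""
    if width <= 0:
        raise ValueError("bus width must be positive")
    if not 0 <= skew_ns <= max_delay_ns:
        raise ValueError("skew must lie between 0 and the maximum delay")
    min_delay_ns = max_delay_ns - skew_ns
    lines = []
    for bit in range(width):
        # Braces keep the Tcl interpreter from treating [bit] as a command.
        src = f"[get_cells {{{src_reg}[{bit}]}}]"
        dst = f"[get_cells {{{dst_reg}[{bit}]}}]"
        lines.append(f"set_max_delay {max_delay_ns:.3f} -from {src} -to {dst}")
        lines.append(f"set_min_delay {min_delay_ns:.3f} -from {src} -to {dst}")
    return lines


if __name__ == "__main__":
    # Hypothetical 16-bit bus with a 2.0 ns delay ceiling and a 0.2 ns skew window.
    for line in bus_skew_constraints("u_src/data_reg", "u_dst/data_reg", 16, 2.0, 0.2):
        print(line)
```

Generating the constraints programmatically keeps them consistent across bus widths, which matters when the same flow is reused for buses from 16 to 1024 bits.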





### ASIC Test Designs



- Able to reliably control skew on an extended data bus between a *DANI* source and a *DANI* destination while meeting signal integrity (SI) constraints. <u>Bus widths have ranged from 16 to 1024 bits</u>.
- The *DANI* source and *DANI* destination modules operate in excess of 1 GHz.
- We have successfully controlled skew in a design containing multiple (*i.e.*, 3) *DANI* sources and destinations.





#### **ASIC Test Designs**



We are currently working on a design which will contain 100 copies of a wrapped processor. The wrapper contains a *DANI* destination and *DANI* source. The wrapped processors are connected in a ring and the datapath between processors is 16 bits wide. Each wrapped processor occupies 0.35 mm<sup>2</sup>.
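The scale of this ring can be tallied with a short sketch. It assumes a unidirectional ring in which the *DANI* source of processor `i` drives the *DANI* destination of processor `(i + 1) mod 100`; the slides do not state the direction. It counts only the 16 data bits per link, since the number of control or handshake signals is not given. All names are hypothetical.

```python
NUM_PROCESSORS = 100      # wrapped processors in the ring (from the slides)
DATAPATH_BITS = 16        # data bits per link (from the slides)
AREA_PER_CORE_MM2 = 0.35  # area of one wrapped processor (from the slides)


def ring_links(n: int) -> list[tuple[int, int]]:
    """Return (source, destination) pairs for a unidirectional ring of n nodes."""
    if n < 2:
        raise ValueError("a ring needs at least two nodes")
    return [(i, (i + 1) % n) for i in range(n)]


links = ring_links(NUM_PROCESSORS)
data_wires = len(links) * DATAPATH_BITS               # data bits only
core_area_mm2 = NUM_PROCESSORS * AREA_PER_CORE_MM2    # excludes routing and I/O

if __name__ == "__main__":
    print(f"{len(links)} links, {data_wires} data wires, "
          f"{core_area_mm2:.1f} mm^2 of wrapped-processor area")
```

Under these assumptions the design has 100 source-to-destination links, 1600 skew-controlled data wires, and about 35 mm² of wrapped-processor area before routing overhead.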





## Summary

- ✓ An FPGA circuit with DANI met the required timing at 175 MHz, whereas a comparable design employing a single-cycle, synchronous route failed to implement because it could not meet timing. Bus and clock skew were controlled in the DANI version, making operation up to 300 MHz possible.
- ✓ DANI latency in the FPGA design was measured at 2 clock cycles while library asynchronous FIFOs displayed latencies from 4 to 8 clock cycles.
- ✓ The build time for the DANI-based circuit was 30% less than that of a comparable register-pipelined circuit.
- ✓ DANI on ASICs appears viable. Able to reliably control skew on the data bus between a DANI source and a DANI destination while meeting signal integrity (SI) constraints. <u>Bus widths have ranged from 16 to 1024 bits.</u>
- ✓ Additional ASIC experiments planned in near future.



## <u>Acknowledgements</u>

- This material is based upon work supported by the National Science Foundation under Grant No. 0924010.
- Additional support was provided by the National Innovation Fund, Omaha, NE.





# **QUESTIONS?**