A natively flexible 32-bit Arm microprocessor

Implementation

To take full advantage of the highly automated, fast turn-around implementation and verification offered by modern silicon integrated circuit design flows, we developed a small standard cell library. A standard cell library is a collection of small pre-verified building blocks from which much larger and more complex designs can be built quickly and easily using sophisticated electronic design automation tools such as synthesis, place and route.

Before the implementation of the standard cell library could start, some preliminary investigations were carried out to determine the most suitable standard cell architecture for the library given the constraints of the target technology. The cell architecture is the set of features that are common to every cell in the library, such as cell height, power strap size, routeing grid and so on, which allow the cells to be snapped together in a standard way to form larger structures. These common features are largely governed by the design rules of the manufacturing process but are also influenced by the performance and area requirements of the final design.

Once the cell architecture was established, the next step was to determine the content of the cell library, not only in terms of the variety of logic functions but also the number of drive-strength variants of each logic function. Because the effort involved in designing, implementing and characterizing each standard cell is considerable, it was decided to run some trials with a small prototype library and then to expand the library as needed. To assess the performance of this small prototype standard cell library, some simple representative circuits (such as ring oscillators, counters and shift arrays) were implemented, manufactured and tested.

We migrated from 1.0-μm design rules to the new FlexIC 0.8-μm design rules to reduce area and, hence, increase yield. As this meant redrawing every cell in the library with smaller transistors, we also took the opportunity to change the standard cell architecture to include MT1 (metal-tracking 1) pins to make it easier for the router to connect the cells. Improvements to the resistive material (higher sheet resistance, Rs) also enabled a 3× reduction in the size of the resistors.

This dramatic reduction in both transistor and resistor size reduced the area of most cells by about 50% (see Extended Data Fig. 1), which in turn increased the manufacturing yield by bringing down the overall size of the design. However, as there were still manufacturing yield issues that we could further mitigate by changes to the standard cell architecture, the library was redrawn again. This time we focused on things that would improve the overall yield of the final design, such as the inclusion of redundant vias and contacts, reducing the number of vertices in the source–drain polygons (where possible) and keeping the size of stacked transistors to a minimum. In addition, we reverted to a lower sheet resistance in order to improve the process spread, but we were able to maintain the area savings by using narrower resistors. To improve the overall quality of the logic synthesis, a number of complex AND-OR-INVERT and OR-AND-INVERT logic gates were added to the library, as well as some high-drive-strength simple logic gates, such as NAND2_X2 and NOR2_X2.
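The link between area and yield can be illustrated with a simple Poisson defect model. This model is an assumption made here for illustration and is not taken from the text; the defect density and die areas below are hypothetical, and the sketch further assumes that the whole design shrinks in proportion to the roughly 50% cell-area reduction.

```python
import math


def poisson_yield(area_mm2: float, defect_density_per_mm2: float) -> float:
    """Fraction of defect-free dies under a simple Poisson defect model.

    Assumed model (not from the source): Y = exp(-A * D).
    """
    return math.exp(-area_mm2 * defect_density_per_mm2)


# Hypothetical values, chosen only to show the trend.
defect_density = 0.02            # defects per mm^2 (assumed)
area_before = 100.0              # design area in mm^2 (assumed)
area_after = 0.5 * area_before   # about 50% smaller, as for most cells

yield_before = poisson_yield(area_before, defect_density)
yield_after = poisson_yield(area_after, defect_density)

print(f"yield before shrink: {yield_before:.3f}")
print(f"yield after shrink:  {yield_after:.3f}")
```

Under this assumed model, halving the area takes the square root of the yield, so the benefit is largest when the starting yield is low.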

The FlexLogIC process is an NMOS process and so relies on a resistive load to pull the cell output towards the power supply to drive a logic 1. As a consequence, the cell output rise times are significantly slower than the fall times, and this asymmetry can affect performance, especially for heavily loaded nets. To improve the timing on critical nets, such as the clock, we added buffers with an active transistor pull-up. Although these active pull-ups increase the area by a small amount, they have the added benefit of reducing the static power consumption. Layouts and simulated transfer characteristics of buffers with resistive pull-up and active transistor pull-up are shown in Extended Data Fig. 2.
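The rise/fall asymmetry can be sketched with a first-order RC model. This is a simplification: the resistances and load capacitance below are hypothetical, not FlexLogIC values, and the model ignores the fact that the pull-down transistor must also sink the current through the load resistor during a falling edge.

```python
import math


def transition_time_10_90(resistance_ohm: float, capacitance_f: float) -> float:
    """10%-90% transition time of a first-order RC stage: ln(9) * R * C."""
    return math.log(9) * resistance_ohm * capacitance_f


# Hypothetical values for illustration only.
R_PULLUP = 1.0e6     # resistive load pulling the output high (ohms, assumed)
R_PULLDOWN = 1.0e5   # on-resistance of the NMOS pull-down (ohms, assumed)
C_LOAD = 1.0e-12     # capacitance of a heavily loaded net (farads, assumed)

rise_time = transition_time_10_90(R_PULLUP, C_LOAD)
fall_time = transition_time_10_90(R_PULLDOWN, C_LOAD)

print(f"rise time: {rise_time * 1e6:.3f} us")
print(f"fall time: {fall_time * 1e6:.3f} us")
print(f"rise/fall ratio: {rise_time / fall_time:.1f}")
```

In this model the rise/fall ratio equals the ratio of the pull-up to pull-down resistance, which is why replacing the resistor with an active pull-up helps on heavily loaded nets such as the clock.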

This simple standard cell library was then successfully used as the target technology to implement the PlasticARM SoC using a typical silicon integrated circuit design flow based on industry-standard electronic design automation tools. The standard cell library contents and cell usage data are shown in Extended Data Table 1.

As we do not yet have a dedicated static random access memory FlexIC, we created a simple register file by carefully placing some modified standard cells in a tiled array that connected by abutment to form a 32 × 32 bit memory (this block can be seen in the chip layout in Fig. 1c).

The FlexLogIC technology (see Extended Data Table 2) has four routable metal layers, of which only the lower two were used within the standard cells. This left the top two metal layers free to be used for the interconnect between the standard cells, which could then be routed over the top of any neighbouring cells, leading to a much-improved overall gate density of about 300 gates per mm².

Fabrication

Process parameters and statistical variations of thin-film transistor (TFT) parameters are summarized in Extended Data Table 2. FlexLogIC is a proprietary 200-mm wafer semiconductor manufacturing process that creates patterned layers of metal-oxide thin-film transistors and resistors, with four routable (gold-free) metal layers deposited onto a flexible polyimide substrate according to the FlexIC design. Repeated instances of the FlexIC design are realized by running multiple sequences of thin-film material deposition, patterning and etching. For ease of handling, and to allow industry-standard process tools to be used and sub-micrometre patterned features to be achieved (down to 0.8 μm), the flexible polyimide substrate is spin-coated onto glass at the outset of production. The process has been optimized to ensure that the thickness variation is substantially less than 3% over a lateral distance of 20 mm. Thin-film material deposition is achieved via a combination of physical-vapour deposition, atomic-layer deposition and solution-processing (for example, spin-coating). Substrate processing conditions have been carefully optimized to minimise film stress and substrate bow. Feature patterning is achieved using a photolithographic 5× stepper tool, which images a shot that is repeated at multiple instances across the 200-mm-diameter wafer. Each shot is focused individually, which further compensates for any thickness variation in the spun-cast film. The technology measurements were performed using process control monitoring structures.

Simulation, test and validation

We captured the timing characteristics of the functional PlasticARM FlexIC using a test measurement setup, and compared the measured results with the results of its register-transfer level (RTL) simulation in order to validate the functionality.

The RTL simulation is shown in Extended Data Fig. 3. It starts by resetting the PlasticARM to a known state by setting a RESET input to ‘0’. Then, RESET is set to ‘1’, the processor is released from its reset state and starts executing the code from ROM. At first, the GPIO[0] output pin is toggled once before the three tests described in Fig. 2 are executed. In the first test, data are read from the ROM and added to an accumulator, and the sum is compared against an expected value (see Fig. 2a). If the values match, a short burst of two pulses is sent to GPIO[0], as shown in Extended Data Fig. 3a. If the values differ, the period and duty cycle of the pulses on GPIO[0] are increased, as shown in Extended Data Fig. 3b. In the second test (Fig. 2b), data are written to RAM, read back and compared. If the data have not been corrupted while writing to or reading from the RAM, a short burst of three pulses is sent to GPIO[0], as shown in Extended Data Fig. 3a. If the data were corrupted, the period and duty cycle of the pulses on GPIO[0] are increased as before. In the final test (Fig. 2c), the processor enters an infinite loop and measures the time for which a ‘1’ is applied on the GPIO[1] input pin. If GPIO[1] is held at ‘1’ without any glitches for long enough, GPIO[0] changes from ‘0’ to ‘1’. PlasticARM was implemented with a clock frequency of 20 kHz. Since it does not use any timers, a value was chosen in software to represent the GPIO[1] signal being held at ‘1’ for approximately 1 s when running at 20 kHz. In our simulations in Extended Data Fig. 3a, that value corresponds to 20,459 clock cycles, which at 20 kHz yields 1.02295 s.
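The three tests and the hold-time arithmetic can be sketched as a small behavioural model. This is not the PlasticARM test program: the function names, the 32-bit accumulator wrap-around and the per-clock-cycle sampling of GPIO[1] are assumptions made for illustration. Only the 20 kHz clock and the 20,459-cycle threshold come from the text.

```python
CLOCK_HZ = 20_000       # clock frequency stated in the text
HOLD_CYCLES = 20_459    # hold threshold in clock cycles, from the simulation


def hold_time_s(cycles: int = HOLD_CYCLES, clock_hz: int = CLOCK_HZ) -> float:
    """Time in seconds represented by a number of clock cycles."""
    return cycles / clock_hz


def rom_test(rom_words, expected_sum: int) -> bool:
    """Test 1: accumulate ROM data and compare with an expected value."""
    acc = 0
    for word in rom_words:
        acc = (acc + word) & 0xFFFFFFFF  # assumed 32-bit wrap-around
    return acc == expected_sum


def ram_test(patterns) -> bool:
    """Test 2: write data to RAM, read it back and compare."""
    ram = [0] * len(patterns)
    for addr, value in enumerate(patterns):
        ram[addr] = value
    return all(ram[addr] == value for addr, value in enumerate(patterns))


def hold_test(gpio1_samples, threshold: int) -> bool:
    """Test 3: pass once GPIO[1] has stayed at 1 for `threshold` samples.

    Any 0 sample (a glitch) restarts the count. One sample per clock
    cycle is a simplification of the real software loop.
    """
    count = 0
    for sample in gpio1_samples:
        count = count + 1 if sample else 0
        if count >= threshold:
            return True
    return False


print(f"hold time at 20 kHz: {hold_time_s():.5f} s")
```

In this model a passing first test would correspond to the two-pulse burst on GPIO[0], a passing second test to the three-pulse burst, and a passing third test to GPIO[0] changing from ‘0’ to ‘1’.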

After fabrication, PlasticARM was tested on a wafer probe station while still attached to a glass carrier. The input signals, including a clock signal, were generated externally with a ZC702 FPGA Evaluation Board from Xilinx. Both input and output signals were captured using a Saleae Logic Pro 16 logic analyser. Measurements were performed at 3 V and 4.5 V, with various clock frequencies. An experiment with the power supply set to 3 V and a clock frequency of 20 kHz is shown in Extended Data Fig. 4. The ZC702 I/O voltage caps the inputs and outputs to 2.5 V. The measured data waveform is shown in Extended Data Fig. 4a, and matches the waveform in the RTL simulation of all three tests in Extended Data Fig. 3a. PlasticARM is fully functional up to 29 kHz at 3 V and 40 kHz at 4.5 V.