[Hardware Design I] FPGA Technology

Fábio Magalhães
4 min read · Oct 26, 2020

Current systems integrate very complex software components while trying to keep the hardware to a minimum, which places extraordinary loads on the CPU and consequently causes performance problems.

Modern SoCs offer very high performance per watt and faster memory operations. With the integration of FPGA fabric, they also allow any algorithm, or part of one, to be implemented as a hardware component, which increases overall system performance.

FPGA

The FPGA fabric is essentially divided into CLBs (Configurable Logic Blocks), which can be combined to emulate almost any kind of digital circuit, limited only by the amount of resources available on the chip. Different chips might implement other components, such as IOBs (Input/Output Blocks), which interface the internal logic with the external pins of the device, and the Switch Matrix, the programmable routing tracks used to connect CLBs and IOBs.

Configurable Blocks

The FPGA fabric is built from configurable blocks, composed of flip-flops and look-up tables (LuTs), as shown in figure 1.

Figure 1 — FPGA fabric configurable block

The growing use of FPGA fabric in embedded systems created the need for faster designs and more efficient resource utilization, so modern FPGAs implement larger configurable blocks containing multiplexers and digital signal processing components, such as adders and multipliers, which are faster and more power efficient than a LuT implementation of the same operation.

Flip-Flop

A flip-flop holds a single bit value and is used to build the registers present in every computer architecture. The simplest and most widely used type is the D flip-flop, which has 2 inputs (data and clock) and 1 output, as can be seen in its block diagram below.

The D input is the data to be stored at the rising edge of the clock, since flip-flops are edge-triggered. The input value only appears at the output after that clock edge, so logic reading the output sees it one clock cycle later, as happens with all sequential logic circuits. The D flip-flop truth table can be summarized as follows: on a rising clock edge the output Q takes the value of D (D = 0 gives Q = 0, D = 1 gives Q = 1), and at any other time Q keeps its previous value.

Flip-flops are used whenever a signal needs to hold information between clock cycles. In some HDLs (Hardware Description Languages), such as Verilog, the keyword reg declares a variable that can hold a value, and it is synthesized as a flip-flop when it is assigned inside a clocked block. In others, like VHDL, a signal assigned inside a process controlled by the clock is synthesized as flip-flops.
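To make the edge-triggered behavior more concrete, here is a minimal behavioral sketch in Python. It is not HDL and not how a synthesis tool represents a flip-flop; the class name `DFlipFlop` and the helper `simulate` are hypothetical names chosen only for this illustration.

```python
class DFlipFlop:
    """Behavioral model of a rising-edge-triggered D flip-flop (illustrative only)."""

    def __init__(self):
        self.q = 0          # stored bit, visible on the output Q
        self._prev_clk = 0  # previous clock level, used to detect rising edges

    def tick(self, d, clk):
        """Apply the input levels d and clk, then return the output Q."""
        if self._prev_clk == 0 and clk == 1:  # rising edge: capture D
            self.q = d
        self._prev_clk = clk
        return self.q


def simulate(d_values):
    """Drive one full clock cycle per D value and record Q after each rising edge."""
    ff = DFlipFlop()
    outputs = []
    for d in d_values:
        ff.tick(d, 0)                   # clock low: Q holds its value
        outputs.append(ff.tick(d, 1))   # clock high: rising edge captures D
    return outputs
```

In this sketch, changing D while the clock stays high or low has no effect on Q; only a transition of the clock from 0 to 1 updates the stored bit.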

Look-Up Tables

LuT components emulate the behavior of logic gates through a custom truth table that maps each combination of inputs to the respective output.

Any digital circuit can be implemented with LuT components, although it will be slower and take more silicon area than its ASIC counterpart.

How do LuTs work?

The LuT component can be thought of as a memory block where each address stores the output value of the circuit described in the HDL (Hardware Description Language), and the address is formed by the input signals of the circuit.

In the example of figure 2, a simple circuit is implemented with 3 input signals and 1 output signal. The address is the concatenation of the A, B and C signals, resulting in a 3-bit address that ranges from 0 to 7. Since the circuit output consists only of Y, the data stored in memory is 1 bit wide.

figure 2 — Add and multiply circuit

A graphical representation of the LuT can be seen in figure 3, as well as the possible values of the Y signal. For example, setting A = 1, B = 1 and C = 0 translates to address 0x6 (binary 110, with A as the most significant bit) and an output value of 1.

figure 3 — LuT as memory block
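The memory-block view can be sketched in a few lines of Python. Since the exact circuit of figure 2 is not reproduced here, the function `example_circuit` below assumes Y = (A AND B) OR C, which is only a guess that agrees with the 0x6 example; `build_lut` and `lut_read` are hypothetical helper names.

```python
def build_lut(func, num_inputs):
    """Precompute the output of func for every input combination (the LuT contents)."""
    table = []
    for address in range(2 ** num_inputs):
        # Unpack the address into bits, most significant bit first (A is the MSB).
        bits = [(address >> shift) & 1 for shift in reversed(range(num_inputs))]
        table.append(func(*bits))
    return table


def lut_read(table, *inputs):
    """Concatenate the input bits into an address and return the stored output."""
    address = 0
    for bit in inputs:
        address = (address << 1) | bit
    return table[address]


# Assumed circuit, for illustration only: Y = (A AND B) OR C.
def example_circuit(a, b, c):
    return (a & b) | c


LUT = build_lut(example_circuit, 3)  # 8 entries of 1 bit each
```

With this assumed circuit, `lut_read(LUT, 1, 1, 0)` reads address 6 and returns 1. Once the table is filled, the logic function itself is never evaluated again: every output is a plain memory read, which is the idea behind the LuT.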

Large circuits can be split across several blocks, since each LuT has a maximum number of inputs, typically between 4 and 6 depending on the implementation of the FPGA fabric.
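As a rough illustration of this splitting, the sketch below builds an 8-input AND out of three smaller LuTs. The limit `K = 4` is an assumed value picked to keep the tables small, and `make_lut`, `read` and `and8` are hypothetical names; real tools perform this mapping automatically and far more cleverly.

```python
K = 4  # assumed maximum number of inputs per LuT (hypothetical value)


def make_lut(func, num_inputs):
    """Return the LuT contents for func as a list indexed by the input address."""
    if num_inputs > K:
        raise ValueError("function needs more inputs than one LuT provides")
    return [
        func(*[(addr >> s) & 1 for s in reversed(range(num_inputs))])
        for addr in range(2 ** num_inputs)
    ]


def read(table, bits):
    """Form the address from the input bits (first bit is the MSB) and look it up."""
    addr = 0
    for b in bits:
        addr = (addr << 1) | b
    return table[addr]


AND4 = make_lut(lambda a, b, c, d: a & b & c & d, 4)  # first-level LuT contents
AND2 = make_lut(lambda a, b: a & b, 2)                # second-level LuT contents


def and8(bits):
    """8-input AND built from three LuTs: two first-level AND4s feed one AND2."""
    low = read(AND4, bits[:4])
    high = read(AND4, bits[4:])
    return read(AND2, [low, high])
```

The price of the split is an extra level of logic: the signal now crosses two LuTs and the routing between them, which is one reason why wide functions tend to be slower on the fabric.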

Most modern FPGA chips implement common components, such as adders and multipliers, inside the configurable logic block to achieve better performance and power efficiency.
