VHDL hardware design - tips and tricks


The following page is a collection of my notes concerning hardware design using VHDL.

This is a work in progress. If you have any suggestions/comments/corrections, please let me know!

Table of contents:
1. FPGA architecture
2. Think like an architect, not a programmer!


 

1. FPGA architecture

Digital circuits can be broken down into two categories: combinational and sequential.

1a. Combinational logic

The output of combinational logic is dependent on the present input only. One example of this is a full-adder. This component adds three one-bit numbers: A, B and a carry-input. Its output is a two-bit sum, consisting of S and a carry-out. The carries can be cascaded to produce larger adders.

This can be realized with logic gates...


And can be described by a truth table...

      Inputs      |   Outputs
  A    B    C-in  |  C-out   Sum
  0    0     0    |    0      0
  1    0     0    |    0      1
  0    1     0    |    0      1
  1    1     0    |    1      0
  0    0     1    |    0      1
  1    0     1    |    1      0
  0    1     1    |    1      0
  1    1     1    |    1      1
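The truth table above can also be written directly in VHDL. The following is a minimal sketch rather than a definitive implementation; the entity name full_adder and the port names are chosen here for illustration only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- One-bit full adder described with the usual gate equations.
-- Entity and port names are illustrative assumptions.
entity full_adder is
    port (
        a, b, c_in : in  std_logic;
        sum, c_out : out std_logic
    );
end entity full_adder;

architecture gates of full_adder is
begin
    -- Sum is 1 when an odd number of the three inputs are 1.
    sum   <= a xor b xor c_in;
    -- Carry-out is 1 when at least two of the three inputs are 1.
    c_out <= (a and b) or (c_in and (a xor b));
end architecture gates;
```

Both assignments are concurrent statements, so each output is re-evaluated whenever any input changes, exactly as combinational logic requires.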


To realize combinational logic, an FPGA stores all of the possible outputs for the given function in a small memory called a look-up table (LUT). FPGA LUTs can have anywhere from 3 to 8 inputs. The inputs to the function act as the address into this memory, and each address holds one output bit. In effect, the FPGA emulates logic gates with memory!
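As a rough illustration of this idea, the Sum output of the full-adder could be modeled as an 8-entry, 1-bit-wide table addressed by the three inputs. This is only a behavioural sketch of the concept, not a description of any vendor's LUT primitive; the names lut3_sum, SUM_TABLE and addr are assumptions made for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Behavioural model of a 3-input LUT holding the full-adder Sum column.
-- All names here are illustrative assumptions.
entity lut3_sum is
    port (
        a, b, c_in : in  std_logic;
        sum        : out std_logic
    );
end entity lut3_sum;

architecture rom of lut3_sum is
    -- Eight stored output bits; address 0 is the rightmost bit.
    -- Address = c_in*4 + b*2 + a, contents taken from the truth table.
    constant SUM_TABLE : std_logic_vector(7 downto 0) := "10010110";
    signal addr : unsigned(2 downto 0);
begin
    -- The function inputs form the memory address.
    addr <= c_in & b & a;
    -- Reading the memory at that address yields the function output.
    sum  <= SUM_TABLE(to_integer(addr));
end architecture rom;
```

Changing the function only means changing the stored constant; the structure stays the same, which is what makes the LUT programmable.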

1b. Sequential logic

Sequential logic relies on both present and past inputs. To achieve this, sequential logic necessitates the use of storage elements. A common storage element is the edge-triggered D-flip-flop (DFF).

The notched input of the DFF represents a clock input. At the instant the clock changes from 0 to 1 (rising edge), the DFF captures the input at D, stores it, and outputs it to Q. The complement of Q is also available. S represents a force-set, while R represents a force-reset.

D and Q are referred to as synchronous. They change only in step with the clock.
R and S are referred to as asynchronous. They act independently of the clock.
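The behavior described above might be sketched in VHDL as follows. The entity name dff and the port names are illustrative; the active-high polarity of R and S and the priority of reset over set are assumptions, since the text does not specify them.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Rising-edge D-flip-flop with asynchronous set and reset.
-- Names, polarities and set/reset priority are assumptions.
entity dff is
    port (
        clk : in  std_logic;
        d   : in  std_logic;
        r   : in  std_logic;  -- force-reset, assumed active high
        s   : in  std_logic;  -- force-set, assumed active high
        q   : out std_logic;
        q_n : out std_logic   -- complement of q
    );
end entity dff;

architecture rtl of dff is
    signal state : std_logic := '0';
begin
    process (clk, r, s)
    begin
        if r = '1' then              -- asynchronous: no clock edge needed
            state <= '0';
        elsif s = '1' then           -- asynchronous: no clock edge needed
            state <= '1';
        elsif rising_edge(clk) then  -- synchronous: capture D on the edge
            state <= d;
        end if;
    end process;

    q   <= state;
    q_n <= not state;
end architecture rtl;
```

Note that r and s appear in the sensitivity list alongside clk. That is what lets them take effect immediately instead of waiting for the next rising edge.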

The DFF represents a 1-bit storage element. DFFs are not the only storage element used in digital circuits. There are asynchronous storage elements known as latches and other varieties of clocked flip-flops, but DFFs are the only storage available within the FPGA logic elements.

(Transparent latches can technically be implemented using LUTs, but this is bad design practice. It requires that a LUT's output feed back into itself. The FPGA design tools cannot properly analyze the timing properties of such a loop, so they cannot say with certainty whether or not a latch will behave properly.)

1c. FPGA logic elements

The FPGA contains arrays of logic elements (LEs) connected together with a programmable mesh. Each LE contains a LUT for combinational operations, a DFF for storage, and logic for the routing. This allows the FPGA to implement a wide variety of digital circuits.

Below is the architecture of a typical LE, from Altera's Cyclone II FPGA:

1d. Summary

                      Combinational Logic                     Sequential Logic

Dependent on          Present input only                      Present and past inputs

Used for              Arithmetic (Adders, Multipliers, etc)   Storage (Registers, Memories, etc)
                      Multiplexers/Demultiplexers             State Machines
                      Encoders/Decoders

FPGA implementation   Look-up tables (LUTs)                   D-flip-flops (DFFs)

 

2. Think like an architect, not a programmer!

When writing VHDL, the most important thing to keep in mind is that you are describing hardware. VHDL may syntactically resemble a programming language, but it isn't one!

You must think of yourself as an architect rather than a programmer. You are describing hardware structures that will be embodied on a chip, not a list of instructions for a chip to follow. Code that makes sense in a procedural language will often not work as expected in a hardware description language.

Suppose we want to count to 200 by 2's. Consider the following Java snippet:

int A = 0;

for (int i = 1; i <= 100; i++) {
     A = A + 2;
}

This code will execute serially, counting by two each time. The loop counter "i" keeps track of the number of iterations. After each iteration, the counter is incremented. This counter is checked against the run condition, "<= 100". While this condition is met, the innards of the for loop are executed, and 2 is added to A.

iteration 1: A = 2
iteration 2: A = 4
iteration 3: A = 6
......
iteration 99: A = 198
iteration 100: A = 200

(loop exits)

Next, consider a VHDL process that looks similar:

process(input)
   variable A : integer;
begin
   A := input;
   for i in 1 to 100 loop
      A := A + 2;
   end loop;

   output <= A;
end process;

This description creates a chain of one-hundred 32-bit adders!


This is a bad hardware design!
1. This is a considerable amount of logic. This will quickly fill, if not exceed, the capacity of a typical FPGA.
2. This is a purely combinational description. The additions will propagate through the whole chain every time that the input changes.
3. This logic has a very, very long critical path, so it cannot be operated quickly.

The following is a proper alternative:

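-- A is a signal of type integer
process(clk, rst)
begin
    if rst = '1' then
        -- asynchronous reset
        A <= 0;
    elsif clk'event and clk = '1' then
        -- rising edge of clk: count by two until 200 is reached
        if (A < 200) then
            A <= A + 2;
        end if;
    end if;
end process;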

This description creates the following hardware structure:

 

This design incorporates a register. When a rising edge of the clock occurs, the signal present at the "in" port is stored within the register A and becomes visible on the "out" port. The output of the register is incremented by two, then fed back into the register. Thus, at each successive rising edge of the clock, the value stored in the register is incremented by two. The register's value is compared against 200, and this comparison is used to enable the register. As configured, once the value in register A reaches 200, the register is disabled and holds its value. Additionally, a reset signal is available that will clear the register back to 0 and allow this sequence to occur again.

This hardware will operate in the following manner. This behavior is close to the original Java code!

reset: A = 0
rising-edge 1: A = 2
rising-edge 2: A = 4
rising-edge 3: A = 6
...
rising-edge 99: A = 198
rising-edge 100: A = 200
rising-edge 101: A = 200
rising-edge 102: A = 200
...
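To try this sequence in a simulator, the counting process needs to be wrapped in a complete design unit. The following is one possible sketch; the entity name count_by_two, the count output port and the constrained integer range are assumptions added for the example and are not part of the original process.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Counts 0, 2, 4, ... and holds at 200 until reset.
-- Entity name, port names and integer range are assumptions.
entity count_by_two is
    port (
        clk   : in  std_logic;
        rst   : in  std_logic;               -- asynchronous, active high
        count : out integer range 0 to 200
    );
end entity count_by_two;

architecture rtl of count_by_two is
    -- A constrained range lets synthesis size the register to fit
    -- 0..200 rather than a full 32-bit integer.
    signal A : integer range 0 to 200 := 0;
begin
    process (clk, rst)
    begin
        if rst = '1' then
            A <= 0;
        elsif rising_edge(clk) then
            if A < 200 then        -- comparison acts as the register enable
                A <= A + 2;
            end if;
        end if;
    end process;

    count <= A;
end architecture rtl;
```

Here rising_edge(clk) is used in place of clk'event and clk = '1'. For a std_logic clock both describe the same register, and rising_edge additionally ignores transitions such as 'U' to '1' in simulation.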

As you can see, hardware design requires a change of mindset. You cannot simply sit down and write code as you would for a procedural programming language. You'll get really crappy hardware! In particular, take note that for-loops, if-statements and case-statements all unwrap into logic! Remember: you are not telling a chip what to do. Rather, you are telling a chip what to become!
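As a small illustration of an if-statement turning into logic, the sketch below describes a 2-to-1 multiplexer. The entity name mux2, the port names and the 8-bit width are assumptions chosen for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- An if-statement in a combinational process describes a multiplexer.
-- Names and the 8-bit width are illustrative assumptions.
entity mux2 is
    port (
        sel  : in  std_logic;
        x, y : in  std_logic_vector(7 downto 0);
        z    : out std_logic_vector(7 downto 0)
    );
end entity mux2;

architecture rtl of mux2 is
begin
    process (sel, x, y)
    begin
        -- Neither branch is "skipped": both inputs are wired to the
        -- multiplexer at all times, and sel only steers which one
        -- reaches the output.
        if sel = '1' then
            z <= x;
        else
            z <= y;
        end if;
    end process;
end architecture rtl;
```

In software, only one branch of an if-statement runs. Here, the hardware for both branches always exists, which is why deeply nested conditions translate into more (and slower) logic.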

3. VHDL concurrent statements

4. VHDL sequential statements

5. Concurrency & avoiding multi-source errors

6. Process priorities