mirror of https://github.com/YosysHQ/yosys.git
Updated ABC info
Includes comparison of `abc` v `abc9`. Also creates a new subsection of the yosys internals for extending yosys (moving the previous extensions.rst into it). Co-authored-by: Lofty <dan.ravensloft@gmail.com>
This commit is contained in:
parent
e34a25ea27
commit
1733a76273
|
@ -9,7 +9,7 @@ yosys-config
|
||||||
|
|
||||||
The ``yosys-config`` tool (an auto-generated shell-script) can be used to query
|
The ``yosys-config`` tool (an auto-generated shell-script) can be used to query
|
||||||
compiler options and other information needed for building loadable modules for
|
compiler options and other information needed for building loadable modules for
|
||||||
Yosys. See :doc:`/yosys_internals/extensions` for details.
|
Yosys. See :doc:`/yosys_internals/extending_yosys/extensions` for details.
|
||||||
|
|
||||||
.. literalinclude:: /temp/yosys-config
|
.. literalinclude:: /temp/yosys-config
|
||||||
:start-at: Usage
|
:start-at: Usage
|
||||||
|
|
|
@ -1,39 +1,103 @@
|
||||||
The :cmd:ref:`abc` command
|
The ABC toolbox
|
||||||
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
---------------
|
||||||
|
|
||||||
.. TODO:: discuss abc, consider using https://github.com/Ravenslofty/yosys-cookbook/blob/master/misc/abc9.md
|
.. role:: yoscrypt(code)
|
||||||
|
|
||||||
The :cmd:ref:`abc` command provides an interface to ABC_, an open source tool
|
|
||||||
for low-level logic synthesis.
|
|
||||||
|
|
||||||
.. _ABC: http://www.eecs.berkeley.edu/~alanmi/abc/
|
|
||||||
|
|
||||||
The :cmd:ref:`abc` command processes a netlist of internal gate types and can
|
|
||||||
perform:
|
|
||||||
|
|
||||||
- logic minimization (optimization)
|
|
||||||
- mapping of logic to standard cell library (liberty format)
|
|
||||||
- mapping of logic to k-LUTs (for FPGA synthesis)
|
|
||||||
|
|
||||||
Optionally :cmd:ref:`abc` can process registers from one clock domain and
|
|
||||||
perform sequential optimization (such as register balancing).
|
|
||||||
|
|
||||||
ABC is also controlled using scripts. An ABC script can be specified to use more
|
|
||||||
advanced ABC features. It is also possible to write the design with
|
|
||||||
:cmd:ref:`write_blif` and load the output file into ABC outside of Yosys.
|
|
||||||
|
|
||||||
Example
|
|
||||||
^^^^^^^
|
|
||||||
|
|
||||||
.. todo:: describe ``abc`` images
|
|
||||||
|
|
||||||
.. literalinclude:: /code_examples/synth_flow/abc_01.v
|
|
||||||
:language: verilog
|
|
||||||
:caption: ``docs/source/code_examples/synth_flow/abc_01.v``
|
|
||||||
|
|
||||||
.. literalinclude:: /code_examples/synth_flow/abc_01.ys
|
|
||||||
:language: yoscrypt
|
:language: yoscrypt
|
||||||
:caption: ``docs/source/code_examples/synth_flow/abc_01.ys``
|
|
||||||
|
|
||||||
.. figure:: /_images/code_examples/synth_flow/abc_01.*
|
ABC_, from the University of California, Berkeley, is a logic toolbox used for
|
||||||
:class: width-helper
|
fine-grained optimisation and LUT mapping.
|
||||||
|
|
||||||
|
Yosys has two different commands, which both use this logic toolbox, but use it
|
||||||
|
in different ways.
|
||||||
|
|
||||||
|
The :cmd:ref:`abc` pass can be used for both ASIC (e.g. :yoscrypt:`abc
|
||||||
|
-liberty`) and FPGA (:yoscrypt:`abc -lut`) mapping, but this page will focus on
|
||||||
|
FPGA mapping.
|
||||||
|
|
||||||
|
The :cmd:ref:`abc9` pass generally provides superior mapping quality due to
|
||||||
|
being aware of combination boxes and DFF and LUT timings, giving it a more
|
||||||
|
global view of the mapping problem.
|
||||||
|
|
||||||
|
.. _ABC: https://github.com/berkeley-abc/abc
|
||||||
|
|
||||||
|
ABC: the unit delay model, simple and efficient
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
The :cmd:ref:`abc` pass uses a highly simplified view of an FPGA:
|
||||||
|
|
||||||
|
- An FPGA is made up of a network of inputs that connect through LUTs to a
|
||||||
|
network of outputs. These inputs may actually be I/O pins, D flip-flops,
|
||||||
|
memory blocks or DSPs, but ABC is unaware of this.
|
||||||
|
- Each LUT has 1 unit of delay between an input and its output, and this applies
|
||||||
|
for all inputs of a LUT, and for all sizes of LUT up to the maximum LUT size
|
||||||
|
allowed; e.g. the delay between the input of a LUT2 and its output is the same
|
||||||
|
as the delay between the input of a LUT6 and its output.
|
||||||
|
- A LUT may take up a variable number of area units. This is constant for each
|
||||||
|
size of LUT; e.g. a LUT4 may take up 1 unit of area, but a LUT5 may take up 2
|
||||||
|
units of area, but this applies for all LUT4s and LUT5s.
|
||||||
|
|
||||||
|
This is known as the "unit delay model", because each LUT uses one unit of
|
||||||
|
delay.
|
||||||
|
|
||||||
|
From this view, the problem ABC has to solve is finding a mapping of the network
|
||||||
|
to LUTs that has the lowest delay, and then optimising the mapping for size
|
||||||
|
while maintaining this delay.
|
||||||
|
|
||||||
|
This approach has advantages:
|
||||||
|
|
||||||
|
- It is simple and easy to implement.
|
||||||
|
- Working with unit delays is fast to manipulate.
|
||||||
|
- It reflects *some* FPGA families, for example, the iCE40HX/LP fits the
|
||||||
|
assumptions of the unit delay model quite well (almost all synchronous blocks,
|
||||||
|
except for adders).
|
||||||
|
|
||||||
|
But this approach has drawbacks, too:
|
||||||
|
|
||||||
|
- The network of inputs and outputs with only LUTs means that a lot of
|
||||||
|
combinational cells (multipliers and LUTRAM) are invisible to the unit delay
|
||||||
|
model, meaning the critical path it optimises for is not necessarily the
|
||||||
|
actual critical path.
|
||||||
|
- LUTs are implemented as multiplexer trees, so there is a delay caused by the
|
||||||
|
result propagating through the remaining multiplexers. This means the
|
||||||
|
assumption of delay being equal isn't true in physical hardware, and is
|
||||||
|
proportionally larger for larger LUTs.
|
||||||
|
- Even synchronous blocks have arrival times (propagation delay between clock
|
||||||
|
edge to output changing) and setup times (requirement for input to be stable
|
||||||
|
before clock edge) which affect the delay of a path.
|
||||||
|
|
||||||
|
ABC9: the generalised delay model, realistic and flexible
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
ABC9 uses a more detailed and accurate model of an FPGA:
|
||||||
|
|
||||||
|
- An FPGA is made up of a network of inputs that connect through LUTs and
|
||||||
|
combinational boxes to a network of outputs. These boxes have specified delays
|
||||||
|
between inputs and outputs, and may have an associated network ("white boxes")
|
||||||
|
or not ("black boxes"), but must be treated as a whole.
|
||||||
|
- Each LUT has a specified delay between an input and its output in arbitrary
|
||||||
|
delay units, and this varies for all inputs of a LUT and for all sizes of LUT,
|
||||||
|
but each size of LUT has the same associated delay; e.g. the delay between
|
||||||
|
input A and output is different between a LUT2 and a LUT6, but is constant for
|
||||||
|
all LUT6s.
|
||||||
|
- A LUT may take up a variable number of area units. This is constant for each
|
||||||
|
size of LUT; e.g. a LUT4 may take up 1 unit of area, but a LUT5 may take up 2
|
||||||
|
units of area, but this applies for all LUT4s and LUT5s.
|
||||||
|
|
||||||
|
This is known as the "generalised delay model", because it has been generalised
|
||||||
|
to arbitrary delay units. ABC9 doesn't actually care what units you use here,
|
||||||
|
but the Yosys convention is picoseconds. Note the introduction of boxes as a
|
||||||
|
concept. While the generalised delay model does not require boxes, they
|
||||||
|
naturally fit into it to represent combinational delays. Even synchronous delays
|
||||||
|
like arrival and setup can be emulated with combinational boxes that act as a
|
||||||
|
delay. This is further extended to white boxes, where the mapper is able to see
|
||||||
|
inside a box, and remove orphan boxes with no outputs, such as adders.
|
||||||
|
|
||||||
|
Again, ABC9 finds a mapping of the network to LUTs that has the lowest delay,
|
||||||
|
and then minimises it to find the lowest area, but it has a lot more information
|
||||||
|
to work with about the network.
|
||||||
|
|
||||||
|
The result here is that ABC9 can remove boxes (like adders) to reduce area,
|
||||||
|
optimise better around those boxes, and also permute inputs to give the critical
|
||||||
|
path the fastest inputs.
|
||||||
|
|
||||||
|
.. todo:: more about logic minimization & register balancing et al with ABC
|
||||||
|
|
|
@ -0,0 +1,76 @@
|
||||||
|
Setting up a flow for ABC9
|
||||||
|
--------------------------
|
||||||
|
|
||||||
|
Much of the configuration comes from attributes and ``specify`` blocks in
|
||||||
|
Verilog simulation models.
|
||||||
|
|
||||||
|
``specify`` syntax
|
||||||
|
~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
Since ``specify`` is a relatively obscure part of the Verilog standard, a quick
|
||||||
|
guide to the syntax:
|
||||||
|
|
||||||
|
.. code-block:: verilog
|
||||||
|
|
||||||
|
specify // begins a specify block
|
||||||
|
(A => B) = 123; // simple combinational path from A to B with a delay of 123.
|
||||||
|
(A *> B) = 123; // simple combinational path from A to all bits of B with a delay of 123 for all.
|
||||||
|
if (FOO) (A => B) = 123; // paths may apply under specific conditions.
|
||||||
|
(posedge CLK => (Q : D)) = 123; // combinational path triggered on the positive edge of CLK; used for clock-to-Q arrival paths.
|
||||||
|
$setup(A, posedge CLK, 123); // setup constraint for an input relative to a clock.
|
||||||
|
endspecify // ends a specify block
|
||||||
|
|
||||||
|
By convention, all delays in ``specify`` blocks are in integer picoseconds.
|
||||||
|
Files containing ``specify`` blocks should be read with the ``-specify`` option
|
||||||
|
to ``read_verilog`` so that they aren't skipped.
|
||||||
|
|
||||||
|
LUTs
|
||||||
|
^^^^
|
||||||
|
|
||||||
|
LUTs need to be annotated with an ``(* abc9_lut=N *)`` attribute, where ``N`` is
|
||||||
|
the relative area of that LUT model. For example, if an architecture can combine
|
||||||
|
LUTs to produce larger LUTs, then the combined LUTs would have increasingly
|
||||||
|
larger ``N``. Conversely, if an architecture can split larger LUTs into smaller
|
||||||
|
LUTs, then the smaller LUTs would have smaller ``N``.
|
||||||
|
|
||||||
|
LUTs are generally specified with simple combinational paths from the LUT inputs
|
||||||
|
to the LUT output.
|
||||||
|
|
||||||
|
DFFs
|
||||||
|
^^^^
|
||||||
|
|
||||||
|
DFFs should be annotated with an ``(* abc9_flop *)`` attribute, however ABC9 has
|
||||||
|
some specific requirements for this to be valid: - the DFF must initialise to
|
||||||
|
zero (consider using ``dfflegalize`` to ensure this). - the DFF cannot have any
|
||||||
|
asynchronous resets/sets (see the simplification idiom and the Boxes section for
|
||||||
|
what to do here).
|
||||||
|
|
||||||
|
It is worth noting that in pure ``abc9`` mode, only the setup and arrival times
|
||||||
|
are passed to ABC9 (specifically, they are modelled as buffers with the given
|
||||||
|
delay). In ``abc9 -dff``, the flop itself is passed to ABC9, permitting
|
||||||
|
sequential optimisations.
|
||||||
|
|
||||||
|
Some vendors have universal DFF models which include async sets/resets even when
|
||||||
|
they're unused. Therefore *the simplification idiom* exists to handle this: by
|
||||||
|
using a ``techmap`` file to discover flops which have a constant driver to those
|
||||||
|
asynchronous controls, they can be mapped into an intermediate, simplified flop
|
||||||
|
which qualifies as an ``(* abc9_flop *)``, ran through :cmd:ref:`abc9`, and then
|
||||||
|
mapped back to the original flop. This is used in :cmd:ref:`synth_intel_alm` and
|
||||||
|
:cmd:ref:`synth_quicklogic` for the PolarPro3.
|
||||||
|
|
||||||
|
DFFs are usually specified to have setup constraints against the clock on the
|
||||||
|
input signals, and an arrival time for the Q output.
|
||||||
|
|
||||||
|
Boxes
|
||||||
|
^^^^^
|
||||||
|
|
||||||
|
A "box" is a purely-combinational piece of hard logic. If the logic is exposed
|
||||||
|
to ABC9, it's a "whitebox", otherwise it's a "blackbox". Carry chains would be
|
||||||
|
best implemented as whiteboxes, but a DSP would be best implemented as a
|
||||||
|
blackbox (multipliers are too complex to easily work with). LUT RAMs can be
|
||||||
|
implemented as whiteboxes too.
|
||||||
|
|
||||||
|
Boxes are arguably the biggest advantage that ABC9 has over ABC: by being aware
|
||||||
|
of carry chains and DSPs, it avoids optimising for a path that isn't the actual
|
||||||
|
critical path, while the generally-longer paths result in ABC9 being able to
|
||||||
|
reduce design area by mapping other logic to larger-but-slower cells.
|
|
@ -13,11 +13,11 @@ The guidelines directory contains notes on various aspects of Yosys development.
|
||||||
The files GettingStarted and CodingStyle may be of particular interest, and are
|
The files GettingStarted and CodingStyle may be of particular interest, and are
|
||||||
reproduced here.
|
reproduced here.
|
||||||
|
|
||||||
.. literalinclude:: ../temp/GettingStarted
|
.. literalinclude:: /temp/GettingStarted
|
||||||
:language: none
|
:language: none
|
||||||
:caption: guidelines/GettingStarted
|
:caption: guidelines/GettingStarted
|
||||||
|
|
||||||
.. literalinclude:: ../temp/CodingStyle
|
.. literalinclude:: /temp/CodingStyle
|
||||||
:language: none
|
:language: none
|
||||||
:caption: guidelines/CodingStyle
|
:caption: guidelines/CodingStyle
|
||||||
|
|
||||||
|
@ -87,7 +87,7 @@ Creating modules from scratch
|
||||||
|
|
||||||
Let's create the following module using the RTLIL API:
|
Let's create the following module using the RTLIL API:
|
||||||
|
|
||||||
.. literalinclude:: ../../resources/PRESENTATION_Prog/absval_ref.v
|
.. literalinclude:: ../../../resources/PRESENTATION_Prog/absval_ref.v
|
||||||
:language: Verilog
|
:language: Verilog
|
||||||
:caption: docs/resources/PRESENTATION_Prog/absval_ref.v
|
:caption: docs/resources/PRESENTATION_Prog/absval_ref.v
|
||||||
|
|
|
@ -0,0 +1,11 @@
|
||||||
|
Extending Yosys
|
||||||
|
---------------
|
||||||
|
|
||||||
|
.. todo:: brief overview for the extending Yosys index
|
||||||
|
|
||||||
|
.. toctree::
|
||||||
|
:maxdepth: 3
|
||||||
|
|
||||||
|
extensions
|
||||||
|
abc_flow
|
||||||
|
|
|
@ -32,9 +32,9 @@ chapter, even though the chapter only explains the conceptual idea behind it and
|
||||||
can be used as reference to implement a similar system in any language.
|
can be used as reference to implement a similar system in any language.
|
||||||
|
|
||||||
.. toctree::
|
.. toctree::
|
||||||
:maxdepth: 3
|
:maxdepth: 3
|
||||||
|
|
||||||
flow/index
|
flow/index
|
||||||
formats/index
|
formats/index
|
||||||
techmap
|
extending_yosys/index
|
||||||
extensions
|
techmap
|
||||||
|
|
Loading…
Reference in New Issue