The $div and $mod cells use truncating division semantics (rounding
towards 0), as defined by e.g. Verilog. Another rounding mode, flooring
(rounding towards negative infinity), can be used in e.g. VHDL. The
new $divfloor cell provides this flooring division.
This commit also fixes the handling of $div in opt_expr, which was
previously optimized as if it was $divfloor.
The $div and $mod cells use truncating division semantics (rounding
towards 0), as defined by e.g. Verilog. Another rounding mode, flooring
(rounding towards negative infinity), can be used in e.g. VHDL. The
new $modfloor cell provides this flooring modulo (also known as "remainder"
in several languages, but this name is ambiguous).
This commit also fixes the handling of $mod in opt_expr, which was
previously optimized as if it was $modfloor.
Ensures that "BV" is the logic whenever solving an exists-forall problem with Yices, moves the "(set-logic ...)" directive above any non-info line, sets the `ef-max-iters` parameter to a very high number when using Yices in exists-forall mode so as not to prematurely abandon difficult problems, and does not provide the incompatible "--incremental" Yices argument when in exists-forall mode.
This isn't actually necessary anymore after scheduling was improved,
and `clean -purge` disrupts the mapping between wires in the input
RTLIL netlist and the output CXXRTL code.
log_assert(false) never returns and thus can't fall through, but gcc
doesn't seem to think that far. Making it the last case avoids the
problem entirely.
The current firrtl backend emits blackboxes as standard modules
with an empty body, but this causes the firrtl compiler to
optimize out entire circuits due to the absence of any drivers.
Yosys already tags blackboxes with a (*blackbox*) attribute, so this
commit just propagates this change to firrtl's syntax for blackboxes.
As per suggestion made in https://github.com/YosysHQ/yosys/pull/1987, now:
RTLIL::wire holds an is_signed field.
This is exported in JSON backend
This is exported via dump_rtlil command
This is read in via ilang_parser
Strategically inserting the pending memory write in memory::update to keep the
queue sorted allows us to skip the queue sort in memory::commit.
The Minerva SRAM SoC runs ~7% faster as a result.
This is quite possibly the worst way to implement this, but it does
work for a subset of well-behaved designs, and can be used to measure
how much performance is lost simulating the inactive edge of a clock.
It should be replaced with a clock tree analyzer generating safe
code once it is clear how should such a thing look like.
If the annotations are not used, this commit does not alter semantics
at all, other than removing elision of outputs of black box cells.
(Elision of such outputs is expected to be too rare to have any
noticeable benefit, and the implementation was somewhat of a hack.)
The (* cxxrtl.comb *) annotation alters the semantics of the output
of the black box it is applied to such that, if the black box
converges immediately, no additional delta cycle is necessary to
propagate the computed combinatorial value upwards in hierarchy.
The (* cxxrtl.sync *) annotation alters the semantics of the output
of the black box it is applied to such as to remove any uses of
the black box by the wires connected to this output, and break false
feedback arcs arising from conservative modeling of dependencies of
the black box.
Although currently these attributes are only recognized on black
boxes, if separate compilation is added in the future, it could also
emit and consume them.
The attribute for this is called (* cxxrtl.edge *), and there is
a planned attribute (* cxxrtl.sync *) that would cause blackbox
cell outputs to be added to sync defs rather than comb defs.
Rename the edge detector related stuff to avoid confusion.
Fixes#1823.
This will allow nextpnr to reuse the default value information already
present in yosys cells_sim.v and avoid duplicating (and probably
desyncing) this information.
If it is statically known that eval() will converge in one delta
cycle (that is, the second commit() will always return `false`)
because the design contains no feedback or buffered wires, then
there is no need to run the second delta cycle at all.
After this commit, the case where eval() always converges immediately
is detected and the second delta cycle is omitted. As a result,
Minerva SRAM SoC runs ~25% faster.
People judge a compiler backend by the first impression, and
the metric they judge it for is speed. -O6 does severely impact
debuggability, but it provides equally massive gains in performance,
so use it by default.
Module input wires are never set by the module, so it is unnecessary
to buffer them. Although important for all inputs, this is especially
critical for clocks, since after this commit, hierarchy levels no
longer add delta cycles. As a result, Minerva SRAM SoC runs ~73%
faster when flattened, and ~264% (!!) faster when hierarchical.
This commit changes the way edge detectors are represented in
generated code from a variable that is set in commit() and reset in
eval() to a function that considers .curr and .next of the clock
wire. Behavior remains the same. Besides being simpler to generate
and providing more opportunities for optimization, this commit paves
way for unbuffering module inputs.
Before this commit, any wire that was not driven by an output port of
exactly one comb cell would not be localized, even if there were no
feedback arcs through that wire. This would cause the wire to become
buffered and require (often quite a few) extraneous delta cycles
during evaluation. To alleviate this problem, -O5 was running
`splitnets -driver`.
However, this solution was mistaken. Because `splitnets -driver`
followed by `opt_clean -purge` would produce more nets with multiple
drivers, it would have to be iterated to fixpoint. Moreover, even if
this was done, it would not be sufficient because `opt_clean -purge`
does not currently remove wires with the `\init` attribute (and it
is not desirable to remove such wires, since they correspond to
registers and may be useful for debugging).
The proper solution is to consider the condition in which a wire
may be localized. Specifically, if there are no feedback arcs through
this wire, and no part of the wire is driven by an output of a sync
cell, then the wire holds no state and is localizable.
After this commit, the original condition for not localizing a wire
is replaced by a check for any sync cell driving it. This makes it
unnecessary to run `splitnets -driver` in the majority of cases
to get a design with no buffered wires, and -O5 no longer includes
that pass. As a result, Minerva SRAM SoC no longer has any buffered
wires, and runs ~27% faster.
In addition, this commit prepares the flow graph for introduction
of sync outputs of black boxes.
Co-authored-by: Jean-François Nguyen <jf@lambdaconcept.com>
Any buffered combinatorial wires (including, as a subset, feedback
wires) will prevent the design from always converging in one delta
cycle. Before this commit, only feedback wires were detected. After
this commit, any buffered combinatorial wires, including feedback
wires, are detected.
Co-authored-by: Jean-François Nguyen <jf@lambdaconcept.com>
Both parameters and attributes are necessary because the parameters
have to be the same between every instantiation of the cell, but
attributes may well vary. For example, for an UART PHY, the type
of the PHY (tty, pty, socket) would be a parameter, but configuration
of the implementation specified by the type (socket address) would
be an attribute.
This commit adds support for replacing RTLIL modules with CXXRTL
black boxes. Black box port widths may not depend on the parameters
with which it is instantiated (yet); the parameters may only be used
to change the behavior of the black box.
The $paramod name mangling is not invertible (the \ character, which
separates the module name from the parameters, is valid in the module
name itself), which does not stop people from trying to invert it.
This commit makes it easy to invert the name mangling by storing
the original name explicitly, and fixes the firrtl backend to use
the newly introduced attribute.
There is no practical benefit from using `const memory` for ROMs;
it uses an std::vector internally, which prevents contemporary
compilers from constant-propagating ROM contents. (It is not clear
whether they are permitted to do so.)
However, there is a major benefit from using non-const `memory` for
ROMs, which is the ability to dynamically fill the ROM for each
individual simulation.
This commit makes it possible to use several cxxrtl-generated files
in one application, as well as compiling cxxrtl-generated code as
a separate compilation unit.
Also, fix the semantics of SET/CLR inputs of the $dffsr cell, and
fix the scheduling of async FF cells to consider ARST/SET/CLR->Q
as a forward combinatorial arc.
This commit reduces space and time overhead for writable memories
to O(write port count) in both cases; implements handling for write
port priorities; and simplifies runtime representation of memories.
Hierarchical design simulations are generally much slower, but this
comes with a major increase in flexibility:
1. Since the `flatten` pass currently does not support flattening
of designs with processes, this is the only way to simulate such
designs with cxxrtl.
2. Support for hierarchy paves way for simulation black boxes,
which are necessary for e.g. replacing PHYs with C++ code that
integrates with the host system.
After this commit, if NDEBUG is not defined, out-of-bounds accesses
cause assertion failures for reads and writes. If NDEBUG is defined,
out-of-bounds reads return zeroes, and out-of-bounds writes are
ignored.
This commit also adds support for memories that start with a non-zero
index (`Memory::start_offset` in RTLIL).
This results in further massive gains in performance, modest decrease
in compile time, and, for designs without feedback arcs, makes it
possible to run eval() once per clock edge in certain conditions.
In the past I was calling the ILANG_BACKEND::dump_const() to dump
values to an output stream. When these values were strings, the
function used to add quotes around them. The firrtl compiler, in turn,
escaped these quotes and the result was double-quoted strings which
were hard to read.
However I'm now calling design_entity->get_src_attribute() directly
and there is no additional quote being put around it, so we can
safely remove the unnecessary call to str.erase() here.
Yosys puts quotes around the string that represents the fileinfo whereas
firrtl does not. So when firrtl sees quotes, it escapes them with an extra
backslash which makes it hard to read afterwards.
Modifies smt2 backend to recognize `$anyconst` etc. assigned to a wire with the `maximize` or `minimize` attribute and emit `; yosys-smt2-maximize` or `; yosys-smt2-minimize` directives as appropriate.
Modifies `backends/smt2/smtbmc.py` and `smtio.py` to recognize those directives and emit a `(maximize ...)` or `(minimize ...)` command at the end of `smt_forall_assert()`, as described in the paper "νZ - An Optimizing SMT Solver" by Nikolaj Bjørner et al.
Adds an example `examples/smtbmc/demo9.v` to show how it can be used.
This increases compatibility with certain older parsers in some cases
that worked before commit 15fae357 but do not work with the current
compat-int mode
Before, the rules for encoding parameters in JSON were as follows:
- if the parameter is not a string:
- if it is exactly 32 bits long and there are no z or x bits, emit it
as an int
- otherwise, emit it as a string made of 0/1/x/z characters
- if the parameter is a string:
- if it contains only 0/1/x/z characters, append a space at the end
to distinguish it from a non-string
- otherwise, emit it directly
However, this caused a problem in the json11 parser used in nextpnr:
yosys emits unsigned ints, and nextpnr parses them as signed, using
the value of INT_MIN for values that overflow the signed int range.
This caused destruction of LUT5 initialization values. Since both
nextpnr and yosys parser can also accept 32-bit parameters in the
same encoding as other widths, let's just remove that special case.
The old behavior is still left behind a `-compat-int` flag, in case
someone relies on it.