The main speedup is accomplished by avoiding a heap allocation in the common case where the final string length is less than 128. Inlining stringf & vstringf adds an additional improvement.
The main speedup comes from swithing from using a SHA1 hash to std::hash<std::string>. There is no need to use an expensive cryptographic hash for fingerprinting in this context.
This PR speeds up by roughly 17% across a wide spectrum of designs
tested at Google. Particularly for the mux generation pass.
Co-authored-by: Rasmus Larsen <rmlarsen@google.com>
Signed-off-by: Ethan Mahintorabi <ethanmoon@google.com>
This does not correctly handle an `$overwrite_tag` on a module output,
but since we currently require the user to flatten the design for
cross-module dft, this cannot be observed from within the design, only
by manually inspecting the signals in the design.
Fix mistaking the read-port and write-port indices for each other when
we are adding the partial transparency emulation to be able to merge two
write ports.
Remove superfluous curly braces in call to IdString::in to address
a compilation error (reproduced below) under GCC 9 and earlier.
kernel/cellaigs.cc:395:18: error: call to member function 'in' is ambiguous
if (cell->type.in({ID($gt), ID($ge)}))
~~~~~~~~~~~^~
./kernel/rtlil.h:383:8: note: candidate function
bool in(const std::string &rhs) const { return *this == rhs; }
^
./kernel/rtlil.h:384:8: note: candidate function
bool in(const pool &rhs) const { return rhs.co...
^
Removing some signed checks and logic where we've already guaranteed the
values to be positive. Indeed, in these cases, if a negative value got
through (per my realisation in the signed fuzz harness), it would cause
an infinite loop due to flooring division.
e.g. `$displayh(8'ha)` won't have a padding set, because it just gets
`lzero` set instead by `compute_required_decimal_places`.
It also doesn't have a width. In this case, we can just fill in a dummy
(unused) padding. Either space or zero would work, but space is a bit
more distinct given the width field follows.
Also omit writing the width if it's zero. This makes the emitted ilang
a little cleaner in places; `{8:> h0u}` is the output for this example,
now. The other possible extreme would be `{8:>00h0u}`.
For input like "{", "{1", etc., we would exit the loop due to
`i < fmt.size()` no longer being the case, and then check if
`++i == fmt.size()`. That would increment i to `fmt.size() + 1`,
and so execution continues.
The intention is to move i beyond the ':', so we do it only in that
case instead.
The guard is optimised out on some compilers under certain conditions (eg: LTO on GCC) as constant under C++ lifetime rules.
This is because the guard type's member is invalid to access (UB) after the type has been destroyed, resulting in
`destruct_guard.ok` being unable to be `false` according to the optimiser, based on the lifetime rules.
This patch still invokes UB (all accesses to the destroyed IdString instance are), but at least the optimiser
can't reason that destruct_guard_ok cannot be false and therefore it's safe to optimise out from its guard role.
It's a repeating pattern to print an error message tied to an AST
node. Start using an 'input_error' helper for that. Among other
things this is beneficial in shortening the print lines, which tend
to be long.
Later in the check() code we check the bottom wide_log2 bits on the
address port are zeroed out. If the address port is too narrow, we crash
due to out of bounds access. Explicitly assert the address port is wide
enough, so we don't crash on input such as
read_rtlil <<EOF
module \top
wire input 1 \clk
memory width 8 size 2 \mem
cell $memwr $auto$:1:$8
parameter \PRIORITY 1'0
parameter \CLK_POLARITY 1'1
parameter \CLK_ENABLE 1'1
parameter \MEMID "\\mem"
parameter \ABITS 1'0
parameter \WIDTH 6'010000
connect \DATA 16'0000000000000000
connect \ADDR { }
connect \EN 16'0000000000000000
connect \CLK \clk
end
end
EOF
memory
The return value of the min(...) call is never used.
Looks like some leftover from some previous implementation.
Signed-off-by: Henner Zeller <h.zeller@acm.org>
Avoids errors in trailing comma handling, broken indentation and
improper escaping that is common when building JSON by manually
concatenating strings.
This contains parsing code as well as generic routines to associate the
hierarchical signals paths within a Yosys witness file to a loaded RTLIL
design, including support for memories.
This adds the xprop_decoder attribute to bwmuxes that drive the original
unencoded signals. Setundef is changed to ignore the x inputs of these
bwmuxes, so that they survive the prep script of SBY's formal flow. This
is required to make simulation (via sim) using the prep model show the
decoded x signals instead of 0/1 values made up by the solver.
This makes it possible for yosys commands to return values when invoked
as tcl commands. Right now no commands natively support this, but the
tee command can be used with json output like this:
```tcl
set stat [yosys tee -q -s result.json stat -json -top top]
dict get $stat modules \\top num_cells_by_type \$pmux
```
Or with newline separated lists like this:
```tcl
split [yosys tee -q -s result.string select -list top] "\n"
```
The new bitwise case equality (`$bweqx`) and bitwise mux (`$bwmux`)
cells enable compact encoding and decoding of 3-valued logic signals
using multiple 2-valued signals.
When writing VCDs smtbmc replaces square brackets with angle brackets to
avoid the issues with VCD readers misinterpreting such signal names.
For memory addresses it also uses angle brackets and hexadecimal
addresses, while other tools will use square brackets and decimal
addresses.
Previously the code handled both forms of memory addresses, assuming
that any signal that looks like a memory address is a memory address.
This is not the case when the user uses regular signals whose names
include square brackets _or_ when the verific frontend generates such
names to represent various constructs.
With this change all angular brackets are turned into square brackets
when reading the trace _and_ when performing a signal lookup. This means
no matter which kind of brackets are used in the design or in the VCD
signals will be matched. This will not handle multiple signals that are
the same apart from replacing square/angle brackets, but this will cause
issues during the VCD writing of smtbmc already.
It still uses the distinction between square and angle brackets for
memories to decide whether the address is hex or decimal, but even if
something looks like a memory and is added to the `memory_to_handle`
data, the plain signal added to `name_to_handle` is used as-is, without
rewriting the address.
This last change is needed to successfully match verific generated
signal names that look like memory addresses while keeping memories
working at the same time. It may cause regressions when VCD generation
was done with a design that had memories but simulation is done with a
design where the memories were mapped to registers. This seems like an
unusual setup, but could be worked around with some further changes
should this be required.
* Change simlib's $mux cell to use the ternary operator as $_MUX_
already does
* Stop opt_expr -keepdc from changing S=x to S=0
* Change const eval of $mux and $pmux to match the updated simlib
(fixes sim)
* The sat behavior of $mux already matches the updated simlib
The verilog frontend uses $mux for the ternary operators and this
changes all interpreations of the $mux cell (that I found) to match the
verilog simulation behavior for the ternary operator. For 'if' and
'case' expressions the frontend may also use $mux but uses $eqx if the
verilog simulation behavior is requested with the '-ifx' option.
For $pmux there is a remaining mismatch between the sat behavior and the
simlib behavior. Resolving this requires more discussion, as the $pmux
cell does not directly correspond to a specific verilog construct.
This attribute can be used by formal backends to indicate which clocks
were mapped to the global clock. Update the btor and smt2 backend which
already handle clock inputs to understand this attribute.
These can be used to protect undefined flip-flop initialization values
from optimizations that are not sound for formal verification and can
help mapping all solver-provided values in witness traces for flows that
use different backends simultaneously.
This pull request is to address YosysHQ/yosys#2980.
The documentation, as originally written, does not make it clear that yosys commands, when used within a tcl script, do not return any value to the tcl script.
This pull request notes this and offers a workaround via tee as noted in the issue.
POSIX defines $TMPDIR as containing the pathname of the directory where
programs can create temporary files. On most systems, this variable points to
"/tmp". However, on some systems it can point to a different location.
Without respecting this variable, yosys fails to run on such systems.
Signed-off-by: Mohamed A. Bamakhrama <mohamed@alumni.tum.de>
- Prevent unmatched expected error patterns from self-matching
- Prevent infinite recursion on unmatched expected warnings
- Always print the error message for unmatched error patterns
- Add test coverage for all unmatched message types
- Add test coverage for excess matched logs and warnings
- Attempt to lookup a derived module if it potentially contains a port
connection with elaboration ambiguities
- Mark the cell if module has not yet been derived
- This can be extended to implement automatic hierarchical port
connections in a future change
- FfData now keeps track of the module and underlying cell, if any (so
calling emit on FfData created from a cell will replace the existing cell)
- FfData implementation is split off to its own .cc file for faster
compilation
- the "flip FF data sense by inserting inverters in front and after"
functionality that zinit uses is moved onto FfData class and beefed up
to have dffsr support, to support more use cases
- *_en is split into *_ce (clock enable) and *_aload (async load aka
latch gate enable), so both can be present at once
- has_d is removed
- has_gclk is added (to have a clear marker for $ff)
- d_is_const and val_d leftovers are removed
- async2sync, clk2fflogic, opt_dff are updated to operate correctly on
FFs with async load
This code now takes the AST nodes of type AST_BIND and generates a
representation in the RTLIL for them.
This is a little tricky, because a binding of the form:
bind baz foo_t foo_i (.arg (1 + bar));
means "make an instance of foo_t called foo_i, instantiate it inside
baz and connect the port arg to the result of the expression 1+bar".
Of course, 1+bar needs a cell for the addition. Where should that cell
live?
With this patch, the Binding structure that represents the construct
is itself an AST::AstModule module. This lets us put the adder cell
inside it. We'll pull the contents out and plonk them into 'baz' when
we actually do the binding operation as part of the hierarchy pass.
Of course, we don't want RTLIL::Binding to contain an
AST::AstModule (since kernel code shouldn't depend on a frontend), so
we define RTLIL::Binding as an abstract base class and put the
AST-specific code into an AST::Binding subclass. This is analogous to
the AST::AstModule class.
This also aligns the functionality:
- in all cases, the onehot attribute is used to create appropriate
constraints (previously, opt_dff didn't do it at all, and share
created one-hot constraints based on $pmux presence alone, which
is unsound)
- in all cases, shift and mul/div/pow cells are now skipped when
importing the SAT problem (previously only memory_share did this)
— this avoids creating clauses for hard cells that are unlikely
to help with proving the UNSATness needed for optimization
While this helper is already useful to squash sequential initializations
into one in cxxrtl, its main purpose is to squash overlapping masked memory
initializations (when they land) and avoid having to deal with them in
cxxrtl runtime.
Transparency is meaningless for asynchronous ports, so we assume
transparent == false to simplify the code in this case. Likewise,
enable is meaningless, and we assume it is const-1. However,
turns out that nMigen emits the former, and Verilog frontend emits
the latter, so squash these issues when ingesting a $memrd cell.
Fixes#2811.
This essentially adds wide port support for free in passes that don't
have a usefully better way of handling wide ports than just breaking
them up to narrow ports, avoiding "please run memory_narrow" annoyance.
When converting a sync transparent read port with const address to async
read port, nothing at all needs to be done other than clk_enable change,
and thus we have no FF cell to return. Handle this case correctly in
the helper and in its users.
When extracting read register from a transparent port that has an
enable, reset, or initial value, the usual trick of putting a register
on the address instead of data doesn't work. In this case, create soft
transparency logic instead.
When transparency masks land, this will also be used to handle ports
that are transparent to only a subset of write ports.
Like wide port support, this is still completely unusable, and support
in various passes will be gradually added later. It also has no support
at all in the cell library, so attempting to create a read port with
a reset or initial value will cause an assert failure for now.
Since the packed cell doesn't actually support wide ports yet, we just
auto-narrow them on emit. The future packed cell will add
RD_WIDE_CONTINUATION and WR_WIDE_CONTINUATION parameters so the
transform will be trivially reversible for proper serialization.
Such ports cannot actually be created or used yet, this just adds the
necessary plumbing in the helper. Subsequent commits will gradually
add wide port support to various yosys passes.
This is going to be used to store arbitrary priority masks in the
future. Right now, it is not supported by our cell library, so the
priority_mask is computed from port order on helper construction,
and discarded when emitted. However, this allows us to already convert
helper-using passes to the new model.
There will soon be more (versioned) memory cells, so handle passes that
only care if a cell is memory-related by a simple helper call instead of
a hardcoded list.
cell_inputs and cell_outputs retain cell pointers as their keys across
invocations of setup(), which may however be invalidated in the meantime
(as happens in e.g. passes/opt/share.cc:1432). A later rehash of the
dicts (caused by inserting in ModWalker::add_wire()) will cause them to
be dereferenced.
Among other problems, this also fixes equality comparisons between
SigSpec by enforcing a canonical form.
Also fix another minor issue with possible non-canonical SigSpec.
Fixes#2623.
When an expected logger error pattern is unmatched, the logger raises
another (hidden) error. Because of the previous ordering of actions,
`logv_error_with_prefix()` would inadvertently invoke `yosys_atexit()`
twice, causing a double-free.
This change set contains a number of bug fixes and improvements related to
scoping and resolution in generate and procedural blocks. While many of the
frontend changes are interdependent, it may be possible bring the techmap
changes in under a separate PR.
Declarations within unnamed generate blocks previously encountered issues
because the data declarations were left un-prefixed, breaking proper scoping.
The LRM outlines behavior for generating names for unnamed generate blocks. The
original goal was to add this implicit labelling, but doing so exposed a number
of issues downstream. Additional testing highlighted other closely related scope
resolution issues, which have been fixed. This change also adds support for
block item declarations within unnamed blocks in SystemVerilog mode.
1. Unlabled generate blocks are now implicitly named according to the LRM in
`label_genblks`, which is invoked at the beginning of module elaboration
2. The Verilog parser no longer wraps explicitly named generate blocks in a
synthetic unnamed generate block to avoid creating extra hierarchy levels
where they should not exist
3. The techmap phase now allows special control identifiers to be used outside
of the topmost scope, which is necessary because such wires and cells often
appear in unlabeled generate blocks, which now prefix the declarations within
4. Some techlibs required modifications because they relied on the previous
invalid scope resolution behavior
5. `expand_genblock` has been simplified, now only expanding the outermost
scope, completely deferring the inspection and elaboration of nested scopes;
names are now resolved by looking in the innermost scope and stepping outward
6. Loop variables now always become localparams during unrolling, allowing them
to be resolved and shadowed like any other identifier
7. Identifiers in synthetic function call scopes are now prefixed and resolved
in largely the same manner as other blocks
before: `$func$\func_01$tests/simple/scopes.blk.v:60$5$\blk\x`
after: `\func_01$func$tests/simple/scopes.v:60$5.blk.x`
8. Support identifiers referencing a local generate scope nested more
than 1 level deep, i.e. `B.C.x` while within generate scope `A`, or using a
prefix of a current or parent scope, i.e. `B.C.D.x` while in `A.B`, `A.B.C`,
or `A.B.C.D`
9. Variables can now be declared within unnamed blocks in SystemVerilog mode
Addresses the following issues: 656, 2423, 2493
The only difference between "RTLIL" and "ILANG" is that the latter is
the text representation of the former, as opposed to the in-memory
graph representation. This distinction serves no purpose but confuses
people: it is not obvious that the ILANG backend writes RTLIL graphs.
Passes `write_ilang` and `read_ilang` are provided as aliases to
`write_rtlil` and `read_rtlil` for compatibility.
This parameter will resolve to the name of the cell being mapped. The
first user of this parameter will be synth_intel_alm's Quartus output,
which requires a unique (and preferably descriptive) name passed as
a cell parameter for the memory cells.
The new types include:
- FFs with async reset and enable (`$adffe`, `$_DFFE_[NP][NP][01][NP]_`)
- FFs with sync reset (`$sdff`, `$_SDFF_[NP][NP][01]_`)
- FFs with sync reset and enable, reset priority (`$sdffs`, `$_SDFFE_[NP][NP][01][NP]_`)
- FFs with sync reset and enable, enable priority (`$sdffce`, `$_SDFFCE_[NP][NP][01][NP]_`)
- FFs with async reset, set, and enable (`$dffsre`, `$_DFFSRE_[NP][NP][NP][NP]_`)
- latches with reset or set (`$adlatch`, `$_DLATCH_[NP][NP][01]_`)
The new FF types are not actually used anywhere yet (this is left
for future commits).
Previously this was tagged only with YS_ATTRIBUTE(noreturn), but not
YS_NORETURN, so it got lost in #2173, resulting in warnings in
frontends/ast/simplify.cc:
frontends/ast/simplify.cc:267:1: warning: function declared 'noreturn' should not return [-Winvalid-noreturn]
}
^
frontends/ast/simplify.cc:379:1: warning: function declared 'noreturn' should not return [-Winvalid-noreturn]
}
^
Upgrading to WASI SDK 11.0 caused the WASM build to fail because WASM
does not have signals. (Arguably Yosys was broken even before, it was
just broken silently.)
The $div and $mod cells use truncating division semantics (rounding
towards 0), as defined by e.g. Verilog. Another rounding mode, flooring
(rounding towards negative infinity), can be used in e.g. VHDL. The
new $divfloor cell provides this flooring division.
This commit also fixes the handling of $div in opt_expr, which was
previously optimized as if it was $divfloor.
The $div and $mod cells use truncating division semantics (rounding
towards 0), as defined by e.g. Verilog. Another rounding mode, flooring
(rounding towards negative infinity), can be used in e.g. VHDL. The
new $modfloor cell provides this flooring modulo (also known as "remainder"
in several languages, but this name is ambiguous).
This commit also fixes the handling of $mod in opt_expr, which was
previously optimized as if it was $modfloor.
Before this patch, the code passed around std::string objects by
value. It's probably not a hot-spot, but it can't hurt to avoid the
copying.
Removing the copy and clean-up code means the resulting code is ~6.1kb
smaller when compiled with GCC 9.3 and standard settings.
The existing code does a search to figure out whether id is in the
dict (with the call to count()), and then looks it up again to get the
result (with the call to at()). This version calls find() instead,
avoiding the double lookup.
Code size increases slightly (6kb). I think this is because the
contents of find() are getting inlined, and then inlined into lots of
the callsites for cell() and wire().
Looking at the compiled code before this patch, you just get
a (non-inlined) call to count() followed by a call to at(). After the
patch, the contents of find() have been inlined (so you see do_hash,
then do_lookup). The result for each function is about 30 bytes / 40%
bigger, which presumably also enlarges call-sites that inline it.
There was a handwritten copy constructor, which I'm not sure was
actually legal C++ (it unconditionally read from the 'data' member of
a union, which wouldn't have been written if wire was true). It was
also a bit less efficient than the constructor you get from the
compiler by default (which is allowed to just copy the memory).
This gives a marginal (~0.25%) decrease in code size when compiled
with GCC 9.3.