This commit adds a reader/writer implementation for a file format
optimized for fast, single-pass storage and retrieval of design state
changes, as well as a recorder/replayer that integrate with the eval
and commit simulation steps to create replay logs and reproduce them
later.
This feature makes it possible to run a simulation once, recording
the stimulus as well as changes to the registers, and navigate to
a past time point in the simulation later without rerunning it.
Both the changes in inputs (stimulus) and changes in state are saved
so that navigation does not require calling `eval()` or `commit()`;
only a series of memory copy operations.
On a representative example of a SoC netlist, saving the replay log
while simulating it takes 150% of the time it would take to simulate
the same design without logging, which is a much lower overhead than
writing an equivalent full view (including memories) VCD waveform dump.
The replay log is also several times smaller than the VCD dump, and
more space savings are available as low hanging fruit.
Replaying the log has not been optimized and currently takes about
the same time as running the simulation in first place. However, it
is still useful since it provides fast navigation to an arbitrary time
point, something that rerunning the simulation does not allow for.
The current file format should be considered preliminary. It is not
very space-efficient, and my testing shows that a lot of time is spent
in the write() syscall in the kernel. Most likely, compression and/or
writing in another thread could improve performance by 10-20%. This
may be done at a later time.
This change only matters for processes that weren't processed by
`proc_rmdead` for which follow-up cases after a default case are treated
differently in Verilog and RTLIL semantics.
The commit observer is a structure containing a callback that is invoked
whenever the `commit()` method changes a wire or a memory. This allows
code external to the compiled netlist to react to changes in the design
state in a very efficient way. One example of how this feature can be
used is an efficient implementation of record/replay.
Note that the VCD writer does not benefit from this feature because it
must be able to react to changes in any debug items and not just those
that contain design state.
While the VCD format separates the timescale and the timestep (likely
to allow representing the timestep with a small integer type), time in
CXXRTL is represented using a uniform 96-bit number, which allows for
a ±100 year range at femtosecond resolution.
The implementation uses `value<96>`, which provides fast arithmetic and
comparison operations, as well as conversion to/from a more common
representation of integer seconds plus femtoseconds.
Prior to this fix, the `CxxrtlBackend` used the entire path for the include
directive when a separated interface file is generated (via the `-header`
option). This commit updates the code to use the base name of the interface
file.
Since the C++11 standard is used by default, we cannot take advantage of
the `std::filesystem` to get the basename.
In preparation for substantial expansion of CXXRTL's runtime, this commit
reorganizes the files used by the implementation. Only minimal changes are
required in a consumer.
First, change:
-I$(yosys-config --datdir)/include
to:
-I$(yosys-config --datdir)/include/backends/cxxrtl/runtime
Second, change:
#include <backends/cxxrtl/cxxrtl.h>
to:
#include <cxxrtl/cxxrtl.h>
(and do the same for cxxrtl_vcd.h, etc.)
The Verilog grammar does not allow an empty case. Most synthesis tools
are quite permissive about this, but Quartus is not. This causes
problems for amaranth with Quartus (see amaranth-lang/amaranth#931).
We add a new flow graph node type, PRINT_SYNC, as they don't get handled
with regular CELL_EVALs. We could probably move this grouping out of
the dump method.
Removing some signed checks and logic where we've already guaranteed the
values to be positive. Indeed, in these cases, if a negative value got
through (per my realisation in the signed fuzz harness), it would cause
an infinite loop due to flooring division.
The previous implementation for finding the end of a top-level s-expr
exhibited quadratic behavior as it would re-scan the complete input for
the current expression for every new line. For large designs with
trivial properties this could easily take seconds and dominate the
runtime over the actual solving.
This change remembers the current nesting level between lines, avoiding
the re-scanning.
This include seems to have been copied over from the JSON backend where
AIG models are sometimes inserted into the JSON output, but these other
backends don't do anything with AIG.
Python 3.12 emits a SyntaxWarning when encountering invalid escape
sequences. They still parse as expected. Marking these raw produces
the same result without the warnings.
This makes clk2fflogic add an attr to $ff cells that carry the state of
the emulated async FF. The $ff output doesn't have any async updates
that happened in the current cycle, but the $ff input does, so the $ff
input corresponds to the async FF's output in the original design.
Hence this patch also makes the following changes to passes besides
clk2fflogic (but only for FFs with the clk2fflogic attr set):
* opt_clean treats the input as a register name (instead of the
output)
* rename -witness ensures that the input has a public name
* the formal backends (smt2, btor, aiger) will use the input's
name for the initial state of the FF in witness files
* when sim reads a yw witness that assigns an initial value to the
input signal, the state update is redirected to the output
This ensures that yosys witness files for clk2fflogic designs have
useful and stable public signal names. It also makes it possible to
simulate a clk2fflogic witness on the original design (with some
limitations when the original design is already using $ff cells).
It might seem like setting the output of a clk2fflogic FF to update the
input's initial value might not work in general, but it works fine for
these reasons:
* Witnesses for FFs are only present in the initial cycle, so we do
not care about any later cycles.
* The logic that clk2fflogic generates loops the output of the
genreated FF back to the input, with muxes in between to apply any
edge or level sensitive updates. So when there are no active updates
in the current gclk cycle, there is a combinational path from the
output back to the input.
* The logic clk2fflogic generates makes sure that an edge sensitive
update cannot be active in the first cycle (i.e. the past initial
value is assumed to be whatever it needs to be to avoid an edge).
* When a level sensitive update is active in the first gclk cycle, it
is actively driving the output for the whole gclk cycle, so ignoring
any witness initialization is the correct behavior.
Can now append a user defined number of steps to input traces when joining.
If the number of steps is +ve, inputs are all set to 0.
If -ve then steps are skipped.
If all of steps are skipped (including init step) then the input trace will not be copied.
If more than one input trace is provided, the append option will need to be provided the same number of times as there are input traces.
Intended for use with SCY to combine sequential cover traces.
Arbitrary number of inputs, will load all and attempt to join them. Each trace will be replayed one after the other, in the same order as the files are provided.
Mismatch in init_only fields seems to work fine, with values in subsequent traces being assigned in the initial only if they weren't previously defined.
Uncertain if a mismatch in non init_only fields will cause problems.
Fixes WitnessSig.__eq__().
Adds helper functions to WitnessValues and ReadWitness classes.
While not setting the smtoffset here was clearly a bug, I think using
`chunk.offset` only worked incidentally. The `smtoffset` is an offset
into the `smtname, smtid` pair (here `"", idcounter`) which corresponds
to the smt bitvector `stringf("%s#%d", get_id(module), idcounter)` which
contains all the chunks this loop is iterating over.
Thus using an incrementing `smtoffset` (like the `$ff`/`$dff` case above
already does) should be the correct fix.
Wires weren't being assigned an smtoffset value so when generating a yosys witness trace it would also use an offset of 0.
Not sure if this has any other effects, but it fixes the bug I was having.
@jix could you take a look at this?
This should fix#3648 where when calling `emit_elaborated_extmodules` it
checks to see if a module is a black-box, however there was no
validation that the cell type was actually known, and it just always
assumed that we would get a valid instance, causing a segfault.
The output width for the boolean value should not influence the
operation width. The previous incorrect width extension would still
produce correct results, but could produce invalid smt2 output for
reduction operators when the output width was larger than the width of
the vector to which the reduction was applied.
This fixes#3654
This is not valid when the prefix of a trace already violates
assertions. This can happen when the trace generating solver doesn't
look for a minimal length counterexample.
I have added an optional flag to smtbmc that causes failed temporal induction counterexample traces to be checked for duplicate states and reported to the user, since loops in the counterexample mean that increasing the induction depth won't help prove a design's safety properties.
The witness metadata was missing fine grained FFs completely and for
coarse grained FFs where the output connection has multiple chunks it
lacked the offset of the chunk within the SMT expression. This fixes
both, the later by adding an "smtoffset" field to the metadata.
Uses the regex below to search (using vscode):
^\t\tlog\("(.{10,}(?<!\\n)|.{81,}\\n)"\);
Finds any log messages double indented (which help messages are)
and checks if *either* there are is no newline character at the end,
*or* the number of characters before the newline is more than 80.
This adds a native json based witness trace format. By having a common
format that includes everything we support, and providing a conversion
utility (yosys-witness) we no longer need to implement every format for
every tool that deals with witness traces, avoiding a quadratic
opportunity to introduce subtle bugs.
Included:
* smt2: New yosys-smt2-witness info lines containing full hierarchical
paths without lossy escaping.
* yosys-smtbmc --dump-yw trace.yw: Dump results in the new format.
* yosys-smtbmc --yw trace.yw: Read new format as constraints.
* yosys-witness: New tool to convert witness formats.
Currently this can only display traces in a human-readable-only
format and do a passthrough read/write of the new format.
* ywio.py: Small python lib for reading and writing the new format.
Used by yosys-smtbmc and yosys-witness to avoid duplication.
This attribute can be used by formal backends to indicate which clocks
were mapped to the global clock. Update the btor and smt2 backend which
already handle clock inputs to understand this attribute.
This approach had major issues with ROMs whose initialization was not
fully defined. If required, memory_map -rom-only -keepdc should be
called early in a formal flow instead. (This does require a careful
choice of optimization passes though. Sby's scripts will be updated
accordingly.)
Before this commit, values, wires, and memories with an initializer
were value-initialized in emitted C++ code. After this commit, all
values, wires, and memories are default-initialized, and the default
constructor of generated modules calls the reset() method, which
assigns the members that have an initializer.
Before this commit, zero width sigspecs were dumped as "" (empty
string). Unfortunately, 1364-2005 5.2.3.3 indicates that an empty
string is equivalent to "\0", and is 8 bits wide, so that's wrong.
After this commit, a replication operation with a count of zero is
used instead, which is explicitly permitted per 1364-2005 5.1.14,
and is defined to have size zero. (Its operand has to have a non-zero
size for it to be legal, though.)
PR #1203 has addressed this issue before, but in an incomplete way.
- *_en is split into *_ce (clock enable) and *_aload (async load aka
latch gate enable), so both can be present at once
- has_d is removed
- has_gclk is added (to have a clear marker for $ff)
- d_is_const and val_d leftovers are removed
- async2sync, clk2fflogic, opt_dff are updated to operate correctly on
FFs with async load
backends/protobuf/protobuf.cc depends on the source and header files
generated by protoc, but this dependency wasn't explicitly declared. Add
a rule to the Makefile to fix intermittent build failures when the
protobuf header/source file isn't built before protobuf.cc.
This mode will be used whenever read port cannot be handled in the
"extract address register" way, ie. whenever it has enable, reset,
init functionality or (in the future) mixed transparency mask.