Adds note on `+/`.
Clarifies that we can't entirely skip loading `cells_sim.v`, and then mentions it again later once we need it.
More on final steps (and synthesis outputs).
These are useful for formal verification with SBY where they can be used
to display solver chosen `rand const reg` signals and signals derived
from those.
The previous error message for non-constant initial $display statements
is downgraded to a log message. Constant initial $display statements
will be shown both during elaboration and become part of the RTLIL so
that the `sim` output is complete.
This allows tools like SBY to capture the $display output independent
from anything else sim might log. Additionally it provides source and
hierarchy locations for everything printed.
This makes tests/verilog/dynamic_range_lhs.v pass, after ensuring that
nowrshmsk is actually tested.
Stride is extracted from indexing of two-dimensional packed arrays and
variable slices on the form dst[i*stride +: width] = src, and is used
to optimize the generated CASE block.
Also uses less confusing variable names for indexing of lhs wires.
This commit adds a reader/writer implementation for a file format
optimized for fast, single-pass storage and retrieval of design state
changes, as well as a recorder/replayer that integrate with the eval
and commit simulation steps to create replay logs and reproduce them
later.
This feature makes it possible to run a simulation once, recording
the stimulus as well as changes to the registers, and navigate to
a past time point in the simulation later without rerunning it.
Both the changes in inputs (stimulus) and changes in state are saved
so that navigation does not require calling `eval()` or `commit()`;
only a series of memory copy operations.
On a representative example of a SoC netlist, saving the replay log
while simulating it takes 150% of the time it would take to simulate
the same design without logging, which is a much lower overhead than
writing an equivalent full view (including memories) VCD waveform dump.
The replay log is also several times smaller than the VCD dump, and
more space savings are available as low hanging fruit.
Replaying the log has not been optimized and currently takes about
the same time as running the simulation in first place. However, it
is still useful since it provides fast navigation to an arbitrary time
point, something that rerunning the simulation does not allow for.
The current file format should be considered preliminary. It is not
very space-efficient, and my testing shows that a lot of time is spent
in the write() syscall in the kernel. Most likely, compression and/or
writing in another thread could improve performance by 10-20%. This
may be done at a later time.
Filling out the hardware mapping sections, and actually highlighting the changes in schematics instead of just the memory block.
Also includes Part 4 of the coarse-grain rep, looking at `memory_collect` and putting the `synth_ice40 -top fifo -run :map_ram` command in its own (sub)section.
Includes a `no_rw_check` section label in `memory.rst` for reference (because I can't remember how to reference by heading).
Not sure about the opt output after map_ram section which has an open TODO, and the final steps section is also still open.
This change only matters for processes that weren't processed by
`proc_rmdead` for which follow-up cases after a default case are treated
differently in Verilog and RTLIL semantics.
The commit observer is a structure containing a callback that is invoked
whenever the `commit()` method changes a wire or a memory. This allows
code external to the compiled netlist to react to changes in the design
state in a very efficient way. One example of how this feature can be
used is an efficient implementation of record/replay.
Note that the VCD writer does not benefit from this feature because it
must be able to react to changes in any debug items and not just those
that contain design state.
While the VCD format separates the timescale and the timestep (likely
to allow representing the timestep with a small integer type), time in
CXXRTL is represented using a uniform 96-bit number, which allows for
a ±100 year range at femtosecond resolution.
The implementation uses `value<96>`, which provides fast arithmetic and
comparison operations, as well as conversion to/from a more common
representation of integer seconds plus femtoseconds.