The previous version could easily generate a large amount of padding
when the constant factor was significantly larger than the width of the
shift data input. This could lead to huge amounts of logic being
generated before then being optimized away at a huge performance and
memory cost.
Additionally and more critically, when the input width was not a
multiple of the constant factor, the input data was padded with 'x bits
to such a multiple before interspersing the 'x padding needed to align
the selectable windows to power-of-two offsets.
Such a final padding would not be correct for shifts besides $shiftx,
and the previous version did attempt to remove that final padding at the
end so that the native zero/sign/x-extension behavior of the shift cell
would be used, but since the last selectable window also got
power-of-two padding appended after the padding the code is trying to
remove got added, it did not actually fully remove it in some cases.
I changed the code to only add 'x padding between selectable windows,
leaving the last selectable window unpadded. This omits the need to add
final padding to a multiple of the constant factor in the first place.
In turn, that means the only 'x bits added are actually impossible to
select. As a side effect no padding is added when the constant factor is
equal to or larger than the width of the shift data input, also solving
the reported performance bug.
This fixes#4056
Renamed python script so that it sits next to the testbench file when alphabetically sorted.
Reverted `MAX_WIDTH` to full precision for truncation testing.
Expected values are now stored in full precision rather than truncating to the same value as the input.
i.e. 0x5a5a5a5a will truncate to 0x5a5a for write data but will remain 0x5a5a5a5a for expected read.
Adds TDP submodule to generator.
Adds shorthand expected signal to testbench (mostly to make it easier when I look at the vcd dump to figure out what I did wrong in tests).
Adds `double_sync_ram_sdp` to `common/blockram.v`, providing a test for two disjoint memories.
Refactor python blockram template to take a list of params to support the above.
Also change the smaller single TDP36K tests to also test `port_a_width` value.
Use python script to generate tests for both SDP and TDP across multiple sizes of RAM.
Adds sync_ram_sdp_(wwr|wrr) to common blockram.v for double width write and double width read respectively.
In preparation for substantial expansion of CXXRTL's runtime, this commit
reorganizes the files used by the implementation. Only minimal changes are
required in a consumer.
First, change:
-I$(yosys-config --datdir)/include
to:
-I$(yosys-config --datdir)/include/backends/cxxrtl/runtime
Second, change:
#include <backends/cxxrtl/cxxrtl.h>
to:
#include <cxxrtl/cxxrtl.h>
(and do the same for cxxrtl_vcd.h, etc.)
This expands the part-select tests with one additional module.
It specifically tests the different variants of the `peepopt`
optimizations `shiftadd` and `shiftmul`.
Not all these cases are actually transformed using `shiftadd`,
including them also checks if the correct variants are rejected.
- Add support for assignments within expressions, e.g., `x[y++] = z;` or
`x = (y *= 2) - 1;`. The logic is handled entirely within the parser
by injecting statements into the current procedural block.
- Add support for pre-increment/decrement statements, which are
behaviorally equivalent to post-increment/decrement statements.
- Fix non-standard attribute position used for post-increment/decrement
statements.
We add a new flow graph node type, PRINT_SYNC, as they don't get handled
with regular CELL_EVALs. We could probably move this grouping out of
the dump method.
Removing some signed checks and logic where we've already guaranteed the
values to be positive. Indeed, in these cases, if a negative value got
through (per my realisation in the signed fuzz harness), it would cause
an infinite loop due to flooring division.
We need to invoke "read_verilog" manually, since the default action on
input files is to defer processing. Under such conditions, we never
simplify the AST, and initial $prints never execute.
If the `$ge` cell we are replacing has wide output port, the upper bits
on the port should be driven to zero. That's not what a `$not` cell with
a single-bit input does. Instead opt for a `$logic_not` cell, which does
zero-pad its output.
Fixes#3867.
Extend the aigmap.ys test with SAT-based comparison of the original
cells and their AIG implementations.
This tests both the usual cells and the single-bit Yosys gates.
Distinguish between the A, B input ports of `$_ANDNOT_`, `$_ORNOT_`
gates when considering those for sharing. Unlike the input ports of the
other supported single-bit gates, those are not interchangeable.
Fixes#3848.
genvar declaration cannot take an initial value when declared as a module_or_generate_item_declaration.
Correct this test so that it doesn't fail unexpectedly if Yosys aligns with the standard.
(* nowrshmsk *) on a struct / union variable now affects dynamic
bit slice assignments to members of the struct / union.
(* nowrshmsk *) can in some cases yield significant resource savings; the
combination of pipeline shifting and indexed writes is an example of this.
Constructs similar to the one below can benefit from (* nowrshmsk *), and
in addition it is no longer necessary to split out the shift assignments
on separate lines in order to avoid the error message "ERROR: incompatible
mix of lookahead and non-lookahead IDs in LHS expression."
always_ff @(posedge clk) begin
if (rotate) begin
{ v5, v4, v3, v2, v1, v0 } <= { v4, v3, v2, v1, v0, v5 };
if (res) begin
v0.bytes <= '0;
end else if (w) begin
v0.bytes[addr] <= data;
end
end
end
When the verilog frontend perfomed constant evaluation of unbased
unsized constants in a context-determined expression it did not properly
extend them by repeating the bit value. This only affected constant
evaluation and not constants that made it through unchanged to RTLIL.
The latter case was already covered by tests and working before.
This fixes the const-eval issue by checking the `is_unsized` flag in
bitsAsConst and extending the value accordingly.
The newly added test also tests the already working non-const-eval case
to highlight that both cases should behave the same.
The difference between void functions and tasks is that always_comb's
implicit sensitivity list behaves as if functions were inlined, but
ignores signals read only in tasks. This only matters for event based
simulation, and for synthesis we can treat a void function like a task.
This uses the same constant parsing for enum_values and for attributes
and extends it to handle signed values as those are used for enums that
implicitly use the int type.
Use `$finish(0)` to silently exit even when using recent iverlog
versions. Run `write_verilog -noexpr` before `write_verilog` as the
latter can modify the design.
This also enables checking the tests results, as xprop should be in a
state where the existing tests pass.
This brings the metadata for packed arrays in packed structs
in line with the metadata for unpacked arrays, and correctly
handles the case when both lsb and msb in an address range are
non-zero.