Commit Graph

10495 Commits

Author SHA1 Message Date
Larry Doolittle 84c0b5c690 passes/pmgen/pmgen.py: trivial change to remove C++ compiler warnings
Verified that the result still builds and passes self-tests
2020-12-23 14:38:25 -08:00
Zachary Snow 999eec5617 genrtlil: fix mux2rtlil generated wire signedness 2020-12-22 17:49:16 -07:00
Yosys Bot 832f6aa777 Bump version 2020-12-23 00:10:07 +00:00
Zachary Snow 8206546c45 Fix constants bound to single bit arguments (fixes #2383) 2020-12-22 17:01:03 -07:00
whitequark d15c63effc
Merge pull request #2499 from whitequark/cxxrtl-fixes
cxxrtl: don't crash generating debug information for unused wires
2020-12-22 12:00:38 +00:00
whitequark f14074d2c2 cxxrtl: don't crash generating debug information for unused wires. 2020-12-22 06:51:38 +00:00
whitequark 2b62b5ef34
Merge pull request #2498 from StefanBruens/Fix_opt_lut
Fix use-after-free in LUT opt pass
2020-12-22 06:15:04 +00:00
whitequark 4949ef1247
Merge pull request #2497 from whitequark/cxxrtl-reflow
cxxrtl: completely rewrite netlist layout code
2020-12-22 06:12:39 +00:00
whitequark 7378194169 cxxrtl: split processes into sync and case nodes.
Similar to the treatment of black boxes, splitting processes into two
scheduling nodes adds sufficient freedom so that netlists with
well-behaved processes (e.g. those emitted by nMigen) can immediately
converge.

Because processes are not emitted into edge-triggered regions, this
approach has comparable performance to -O5 (without -noproc), which
is substantially slower than -O6.
2020-12-22 03:48:09 +00:00
whitequark ac988cfac5 kernel: undef Tcl macros interfering with cxxrtl. 2020-12-22 03:48:09 +00:00
whitequark b2221c1077 cxxrtl: completely rewrite netlist layout code.
The exact shape of C++ code emitted by CXXRTL has a critical effect
on performance, both compile-time and runtime. CXXRTL's performance
greatly improved when it started localizing and inlining wires, not
only because this assists the optimizer and register allocator, but
also because inlining code into edge-triggered regions cuts the time
spent in eval() by at least a factor of two.

However, the logic of netlist layout has always been ad-hoc, fragile,
and very hard to understand and modify. After commit ece25a45, which
introduced outlining, the same logic started being applied to two
distinct netlists at once instead of one, which barely worked.

This commit does four major changes:
  * There is now a single unambiguous source of truth (per subgraph)
    for the layout of any emitted wire.
  * Netlist layout is now done entirely during analysis using well
    known graph algorithms; no graph operations happen when emitting.
  * Netlist layout now happens completely separately for eval() and
    debug_eval() subgraphs.
  * Unreachable (within subgraph scope) netlist nodes are now neither
    emitted nor considered for wire inlining decisions.
The netlist layout code should also now closely match the described
semantics.

As a part of this large cleanup, it includes many miscellaneous
improvements:
  * The "bare minimum" debug level introduced in commit dd6a761d was
    split into two levels; -g1 now emits debug information *only* for
    inputs and state wires, and -g2 now emits debug information for
    all public members. The old behavior matches -g2. This is done
    to avoid bloat on low optimization levels.
  * Debug aliases and inlined connections are now handled separately,
    and complex RHS never interferes with inlined connections.
  * Aliases to outlined wires now carry a pointer to the outline.
  * Cell sync outputs can now be emitted in debug_eval().
  * Black box debug information now includes comb/sync driver flags.
  * The comment emitted for inlined cells is now accurate.
  * Debug information statistics now has less noise.
  * Netlist layout code is now much better documented.

Due to more precise inlining decisions, unmodified (i.e. with no
Yosys script being used) netlists now have much more logic inlined
into edge-triggered regions. On Minerva SoC SRAM, this improves
runtime by 20-25% across compilers and optimization levels.

Due to more precise reachability analysis, much less C++ code is now
emitted, especially at the maximum debug level. On Minerva SoC SRAM,
this improves clang compile time by 30-50% depending on options.
gcc is not affected.
2020-12-22 03:48:09 +00:00
StefanBruens 9396678db4
Fix use-after-free in LUT opt pass
RTLIL::Module::remove(Cell* cell) calls `delete cell`.

Any subsequent accesses of `cell` then causes undefined behavior.
2020-12-22 03:23:42 +01:00
whitequark 3e67ab1ebb
Merge pull request #2479 from zachjs/const-arg-hint
Allow constant function calls in constant function arguments
2020-12-22 01:31:25 +00:00
whitequark df905709ca
Merge pull request #2491 from zachjs/port-bind-sign
Sign extend port connections where necessary
2020-12-22 01:30:29 +00:00
Yosys Bot b62a892b2f Bump version 2020-12-22 00:10:05 +00:00
whitequark e825cf9d73 cxxrtl: simplify logic choosing wire type. NFCI. 2020-12-21 07:24:52 +00:00
whitequark 6f42b26cea cxxrtl: clarify node use-def construction. NFCI. 2020-12-21 07:24:52 +00:00
whitequark 406f866659 cxxrtl: fix typo. 2020-12-21 07:24:52 +00:00
Marcelina Kościelnicka f2932628fc xilinx: Add some missing blackbox cells. 2020-12-21 05:34:26 +01:00
Marcelina Kościelnicka 5ffb676fa9 xilinx: Regenerate cells_xtra.v using Vivado 2020.2 2020-12-21 05:34:26 +01:00
whitequark a679a761b5
Merge pull request #2496 from whitequark/cxxrtl-fixes
cxxrtl: various improvements
2020-12-21 04:32:18 +00:00
whitequark b9721bedf0 cxxrtl: speed up bit repeats (sign extends, etc).
On Minerva SoC SRAM, depending on the compiler, this change improves
overall time by 4-7%.
2020-12-21 02:20:34 +00:00
whitequark 40ca9d038b cxxrtl: speed up commits on clang.
On Minerva SoC SRAM compiled with clang-11, this change cuts commit
time in half (!) and overall time by 20%. When compiled with gcc-10,
there is no difference.
2020-12-21 02:20:30 +00:00
whitequark 3d3ea5099d cxxrtl: use `static inline` instead of `inline` in the C API.
In C, non-static inline functions require an implementation elsewhere
(even though the body is right there in the header). It is basically
never desirable to use those as opposed to static inline ones.
2020-12-20 14:48:16 +00:00
Yosys Bot b90d51e35d Bump version 2020-12-20 00:10:10 +00:00
whitequark ab9e2f4fda
Merge pull request #2487 from whitequark/cxxrtl-outlining
CXXRTL: implement zero-cost full coverage debug information through the magic of outlining🪄🎀🧹
2020-12-19 04:14:31 +00:00
Zachary Snow 0d8e5d965f Sign extend port connections where necessary
- Signed cell outputs are sign extended when bound to larger wires
- Signed connections are sign extended when bound to larger cell inputs
- Sign extension is performed in hierarchy and flatten phases
- genrtlil indirects signed constants through signed wires
- Other phases producing RTLIL may need to be updated to preserve
  signedness information
- Resolves #1418
- Resolves #2265
2020-12-18 20:33:14 -07:00
Yosys Bot eaf6b551b6 Bump version 2020-12-18 00:10:05 +00:00
Marcelina Kościelnicka 871fc34ad4 xilinx: Add FDDRCPE and FDDRRSE blackbox cells.
These are necessary primitives for proper DDR support on Virtex 2 and
Spartan 3.
2020-12-17 03:25:07 +01:00
whitequark d889a3df35 cxxrtl: print names of cells inlined in connections. 2020-12-15 11:02:38 +00:00
whitequark f75bc6c7aa cxxrtl: disable optimization of debug_items().
Implementing outlining has greatly increased the amount of debug
information in a typical build, and consequently exposed performance
issues in C++ compilers, which are similar for both GCC and Clang;
the compile time of Minerva SoC SRAM increased almost twofold.

Although one would expect the slowdown to be caused by the increased
use of templates in `debug_eval()`, it is actually almost entirely
attributable to optimizations and codegen for `debug_items()`.

Fortunately, it is neither possible nor desirable to optimize
`debug_items()`: in most cases it is called exactly once, and its
body is a linear sequence of calls with unique arguments.

This commit turns off optimizations for `debug_items()` on GCC and
Clang, improving -Os compile time of Minerva SoC SRAM by ~40% (!)
2020-12-15 11:02:38 +00:00
whitequark 4d40595d64 cxxrtl: make alias analysis outlining-aware.
Before this commit, if a sequence of wires assigned in a chain would
terminate on a cell, none of the wires would get marked as aliases,
and typically all of the public wires would get outlined. The reason
for this behavior is that alias analysis predates outlining and in
fact runs before it.

After this commit, alias analysis runs after outlining and considers
outlined wires valid aliasees. More importantly, if the chained wires
contain any valid aliasees, then all of the wires are aliased to
the one that is topologically deepest.

Aliased wires incur virtually no overhead for the VCD writer, unlike
outlined wires that would otherwise take their place. On Minerva SoC
SRAM, size of the full VCD dump is reduced by ~65%, and throughput
is increased by ~55%.
2020-12-15 11:02:38 +00:00
Yosys Bot 40e35993af Bump version 2020-12-15 00:10:06 +00:00
Marcelina Kościelnicka de99197738 timinginfo: Error instead of segfault on const signals.
Reported by @Ravenslofty
2020-12-15 00:51:16 +01:00
whitequark dd6a761db0 cxxrtl: add a "bare minimum" debug information level.
Useful to reduce overhead when no debug capabilities are necessary
except for access to design state.
2020-12-14 01:27:56 +00:00
whitequark ece25a45d4 cxxrtl: implement debug information outlining.
Aggressive wire localization and inlining is necessary for CXXRTL to
achieve high performance. However, that comes with a cost: reduced
debug information coverage. Previously, as a workaround, the `-Og`
option could have been used to guarantee complete coverage, at a cost
of a significant performance penalty.

This commit introduces debug information outlining. The main eval()
function is compiled with the user-specified optimization settings.
In tandem, an auxiliary debug_eval() function, compiled from the same
netlist, can be used to reconstruct the values of localized/inlined
signals on demand. To the extent that it is possible, debug_eval()
reuses the results of computations performed by eval(), only filling
in the missing values.

Benchmarking a representative design (Minerva SoC SRAM) shows that:
  * Switching from `-O4`/`-Og` to `-O6` reduces runtime by ~40%.
  * Switching from `-g1` to `-g2`, both used with `-O6`, increases
    compile time by ~25%.
  * Although `-g2` increases the resident size of generated modules,
    this has no effect on runtime.

Because the impact of `-g2` is minimal and the benefits of having
unconditional 100% debug information coverage (and the performance
improvement as well) are major, this commit removes `-Og` and changes
the defaults to `-O6 -g2`.

We'll have our cake and eat it too!
2020-12-14 01:27:27 +00:00
whitequark 3b5a1314cd cxxrtl: rename "elision" to "inlining". NFC.
"Elision" in this context is an unusual and not very descriptive term
whereas "inlining" is common and straightforward. Also, introducing
"inlining" makes it easier to introduce its dual under the obvious
name "outlining".
2020-12-13 15:34:00 +00:00
whitequark 57759c3d1f cxxrtl: fix outdated comment. NFC. 2020-12-13 15:33:58 +00:00
whitequark ac1a78923a cxxrtl: use IdString::isPublic(). NFC. 2020-12-13 15:33:55 +00:00
Yosys Bot 5a881497e1 Bump version 2020-12-13 00:10:07 +00:00
whitequark 080f311040 kernel: make IdString::isPublic() const. 2020-12-12 20:50:44 +00:00
whitequark d1b7007e59
Merge pull request #2485 from whitequark/cxxrtl-cell-input-buffering
cxxrtl: don't overwrite buffered inputs
2020-12-12 19:55:57 +00:00
whitequark e4aa8bc96b cxxrtl: don't overwrite buffered inputs.
Before this commit, a cell's input was always assigned like:

    p_cell.p_input = (value...);

If `p_input` is buffered (e.g. if the design is built at -O0), this
is not correct. (In practice, this breaks clocking.) Unfortunately,
the incorrect design was compiled without diagnostics because wire<>
was move-assignable and also implicitly constructible from value<>.

After this commit, cell inputs are no longer incorrectly assumed to
always be unbuffered, and wires are not assignable from values.
2020-12-11 23:32:06 +00:00
Yosys Bot 442d19f647 Bump version 2020-12-10 00:10:10 +00:00
Miodrag Milanović 08510d7248
Merge pull request #2483 from YosysHQ/pmgen_nice_error
Return nice error in pmgen generated code, fixes #2482
2020-12-09 11:19:30 +01:00
Miodrag Milanovic 82dcf78cd9 Return nice error in pmgen generated code, fixes #2482 2020-12-09 11:06:22 +01:00
Yosys Bot c46452221e Bump version 2020-12-09 00:10:04 +00:00
whitequark ec410c9b19
Merge pull request #2478 from whitequark/improve-bugpoint
bugpoint: various improvements
2020-12-08 07:32:11 +00:00
Zachary Snow 186d6df4c3 Allow constant function calls in constant function arguments 2020-12-07 13:53:27 -07:00
David Shah 17812a1560 nexus: Add LRAM inference
Signed-off-by: David Shah <dave@ds0.me>
2020-12-07 13:27:17 +00:00