Hierarchical design simulations are generally much slower, but this
comes with a major increase in flexibility:
1. Since the `flatten` pass currently does not support flattening
of designs with processes, this is the only way to simulate such
designs with cxxrtl.
2. Support for hierarchy paves way for simulation black boxes,
which are necessary for e.g. replacing PHYs with C++ code that
integrates with the host system.
After this commit, if NDEBUG is not defined, out-of-bounds accesses
cause assertion failures for reads and writes. If NDEBUG is defined,
out-of-bounds reads return zeroes, and out-of-bounds writes are
ignored.
This commit also adds support for memories that start with a non-zero
index (`Memory::start_offset` in RTLIL).
This results in further massive gains in performance, modest decrease
in compile time, and, for designs without feedback arcs, makes it
possible to run eval() once per clock edge in certain conditions.