AXI stream is transferred exactly on the rising edge of the clock. Use
the current value of the signals for this, instead of past values.
Simulate a slower slave to ensure this is tested.
Fixes: a549fca ("Add UART receive module")
Signed-off-by: Sean Anderson <seanga2@gmail.com>
The AXI stream master doesn't cope with slaves that aren't ready all the
time. There are two separate issues: first, the data was only valid for one
cycle. Second, the handshake logic was incorrect. Rectify these, and modify
the testbench to test for this condition.
Fixes: 7514231 ("Add AXIS-Wishbone bridge")
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This was defined but left unused. Use it for the width of various
registers.
Fixes: 7514231 ("Add AXIS-Wishbone bridge")
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This adds the core of the UART-Wishbone bridge. The protocol has
a variable-length address phase to help reduce overhead. Multiple
in-flight commands are not supported, although this could be resolved
with some FIFOs.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Add the recieve half of the UART. It's more or less the inverse of the
transmit half, except we manage the state explicitly. I originally did
this in hopes that yosys would recode the FSM, but it doesn't like the
subtraction in the D* states. I left in the async reset anyway since it
reduces the LUT count.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
I join everyone and their mother in creating my own UART. 8n1 only, and 2
baud rates. Accepts AXI-stream.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This adds an example of how to integrate the hub into a design. For the
moment, wishbone is disabled, but I plan to add a uart bridge in the
future.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Export some status signals which can be used for LEDs. Hopefully this
will deliver an authentic blinkenlights experience.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This adds a basic hub wrapper module which incorperates the core
introduced in b68e131 ("Add a basic hub"). For each port, it
instantiates a phy (itself using a phy_internal wrapper) and an elastic
buffer. A WISHBONE parameter is used to control whether to instantiate a
wishbone interface. When disabled, we just respond to any request with
err. I've ommitted a separate testbench for phy_internal, since it is
much easier to create a smoke test using the hub interface.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
The flip-flops internal to the SB_IO can't have initial values and
can't be reset. So before the first clock the data out will be X. This
results in a simulation-synthesis mismatch, as sd_delay will be wrong
for one clock cycle. Fix this by removing the SB_IO cell, as the timing
of this signal isn't critical.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This adds a simple wishbone mux. The idea is that each slave gets its
own address bit. This lends itself to extemely simple address decoding,
but uses up address space quickly. In theory, we could also give larger
addres space to some slaves, but currently lower bits have priority. The
testbench is also very simple. Since everything is combinatorial, we can
determine the outputs from the inputs exactly.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
There are several places where memories are used for parametrization
purposes, but I intend them to be synthesized to registers. Silence
warnings about them by explicitly annotating these variables.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This module will make it easier to observe internal signals which would
otherwise be too short to see, or would trigger too fast to distinguish.
Continuous triggered will cause blinking, so signals which are expected
to be high for a while (e.g. level-based and not edge-based) should not
use this module.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
In order to move data between MIIs without implementing a MAC, we need
some kind of elastic buffer to bring the data into the transmit "clock
(enable) domain." Implement one. It's based on a classic shift-register
FIFO, with the main difference being the MII interfaces and the
elasticity (achieved by delaying asserting RX_DV until we reach the
WATERMARK). We use a register-based buffer because we only need to deal
with an under-/over-flow of 5 or so clocks for a 2000-byte packet. The
per-stage resource increase works out to 6 FFs and 1 LUT, which is
pretty much optimal.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This adds a basic clause 27 repeater (hub), mostly for test purposes.
It's effectively just the state machine in figure 27-4 and nothing else
(e.g. no partitioning or jabber detection). This is surprisingly simple.
Unfortunately, yosys doesn't allow memories in port declarations, even
for systemverilog. This complicates the implementation and testbench,
since we have to do the slicing ourselves. This is particularly awful
for the testbench, since
module.signal[0].value != module.signal.value[0]
and module.signal can't be indexed by slices, and module.signal.value is
big endian (ugh ugh ugh). There is no clean solution here.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This adds support for half-duplex. This is mostly done by predicating
col and crs on half_duplex. In one place we need to go to IPG_LATE
directly (although we could go to IPG_LATE like FCS with no loss of
standard compliance).
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Including registers which are not reset in an asynchronous reset process
causes active-low clock-enable flip-flops to be synthesized. This is an
unusual configuration, incurs overhead, and isn't what we wanted to do
anyway. Use a separate process.
While we're at it, sort the bottom half of the if to match the top.
Fixes: 19f2f65 ("axis_mii_tx: Add reset")
Signed-off-by: Sean Anderson <seanga2@gmail.com>
The 2 ns delay when reading from a BRAM makes it hard to close timing,
since buf_err affects the state machine. Address this by not acting on
errors for a clock cycle. We will output bad data for a cycle, but we
are going to corrupt the FCS anyway so it doesn't matter. We also have
to check for errors in the PAD/FCS states, to ensure they don't slip
past.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
We only care about backoff when state=BACKOFF. We can simplify the
calculation by defaulting to loading lfsr into backoff, and special
casing things for state=BACKOFF.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This adds the transmit half of a MAC, supporting 100M and half-duplex.
It's roughly analogous to the axis_(x)gmii_tx modules in Alex
Forencich's ethernet repo. I've taken the approach of moving all state
into the state variable. All decisions are made once and have a
different state for each path. For example, instead of checking against
a "bytes_sent" variable to determine what to do on collision, we have a
different state for each set of actions.
This whole module is heinously complex, especially because of the many
corner cases caused by the spec. I have probably not tested it nearly
enough, but the basics of sending packets have mostly had the bugs wrung
out.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Although the least-significant bit of sd_delay is driven by an SB_IO (if
we are synthesizing), the other bits need to be initialized.
Fixes: d8ce165 ("pmd: Delay signal_status/detect until data is valid")
Signed-off-by: Sean Anderson <seanga2@gmail.com>
The condition for determining s_axis_ready only looks at whether we are
currently full, not whether we will be full on the next cycle (which is
what matters). Make it take into account whether we are going to
increment s_ptr during the current cycle. Also increase the ratio to
ensure we trigger this case, as a ration of 2 doesn't make the slave
slow enough to catch this.
Fixes: 52325f2 ("Add AXI stream replay buffer")
Signed-off-by: Sean Anderson <seanga2@gmail.com>
If the lower bits of m_ptr is the same as s_ptr when we get s_axis_last
(that is, we are full), then we will immediately end (since m_ptr will
equal last_ptr). Fix this by including all of s_ptr in last_ptr.
Fixes: 52325f2 ("Add AXI stream replay buffer")
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This adds the integrated PMD module to be used with the DP83223. It
contains NRZI en/decoding as well as the I/O interfaces. The rx I/O was
added a while back, and the tx is just the I/O cell.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Most other modules in the recieve path are reset when signal_status goes
low. This one didn't, and it was difficult to test because of this. In
particular, we need to ensure that this module behaves correctly when
switching between different bitstreams (such as for loopback).
While implementing this, I found some bugs in the way that nrzi_valid
was handled: it wasn't saved from when the transfer actually happened,
and the data wasn't qualified properly.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This implements an AXI stream buffer which allows replaying of the first
portion of each packet. The intent is to simplify the implementation of
CSMA/CD. This requires keeping 56 bytes of data to "replay" (slot time
minus the preamble). After these bytes are transmitted, we can only get
late collisions.
We always read from the buffer, as this simplifies the implementation
compared to some kind of hybrid fifo/skid buffer approach. The primary
design problem faced is in determining when it's OK to overwrite the
first byte in the packet. A naïve approach might be to allow overwriting
whenever the slave reads the last byte. However, in the case of a
54-byte packet, we will still need to allow replaying at this point (in
case there is a collision on the last byte). We can't just wait for
m_axis_ready to go high, because that would violate the AXI stream
protocol. To solve this, the slave must assert the done signal when it
is finished with the packet.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This module integrates the PCS with the descrambler, implements the PMA
(which is just the link monitor), and implements loopback and coltest
functions. This is more of the PCS/PMA, but the descrambler is
technically part of the PMD, so it's the "core" instead.
We deviate from the standard in one important way: the link doesn't come
up until the descambler is locked. I think this makes sense, since if
the descrambler isn't locked, then the incoming data will be gibberish.
I suspect this isn't part of the standard because the descrambler
doesn't have a locked output in X3.263, so IEEE would have had to
specify it.
Loopback is actually implemented in the PMD, but it modifies the
behavior in several places. It disables collisions (unless
coltest is enabled). Additionally, we need to force the link up (to
avoid the lengthy stabilization timer), but ensure it is down for at
least once cycle (to ensure the descrambler desynchronizes).
On the test side, we just go through the "happy path," as many of the
edge conditions are tested for in the submodule tests. Several of those
tests are modified so that their helper functions can be reused in this
test. In particular, the rx path is now async so that we can feed it
rx_data.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This adds an explicit false carrier signal. Trying to determine this
condition the MDIO signals is tricky because BAD_SSD can last over
several cycles of CE. To make things easier, add a signal which is high
only once per event.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Even when signal_status is low, we should still generate data. This
keeps the PCS processes moving along.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This adds support for counters for interesting conditions (disconnects,
PMD phase wraparounds, errors, etc). All the counters are 15 bits
instead of 16, because 16-bit counters have an fmax of 110MHz or so. All
the counters live behind a condition because they ~double the number of
resources used.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
When increasing the delay for the recieved data, I forgot to increase
the delay for the signal status as well. Fix this.
Fixes: c02d3f3 ("pmd_io: Calculate wraparound based on state and not state_next")
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This better reflects that this is an interface intended to be used with
the DP83223. While we're at it, refactor the module to just handle the
the recieve portion.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
For easier integration, split the PCS into its rx and tx components.
This was already done on the module level, but now they live in separate
files.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Basing the wraparound calculation on state_next is causing problems for
timing. Do it with the state calculations instead. We need to add
another cycle of latency to make this work, since we can't output the
data correctly without knowing the wraparound.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
These inputs were incorrectly marked as LVDS, which was causing problems
for placement. Make them single-ended.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
When writing this initially, I tried to remove some duplicate
conditionals by working with idle_counter_next. However, yosys isn't
smart enough to rewrite the calculation in terms of idle_counter, so do
it ourselves. This breaks up the critical path.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
The critical path often includes the unlock timer. Switch to an lfsr
implementation. This saves around 20 LUTs and reduces the critical path
from the carry chain (and the or reduction) to just the and reduction.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
While manually dumping signals with a macro works OK for standalone
modules, it doesn't work when multiple modules are included. Instead,
create a second top-level module to dump signals. Inspired (once again)
by [1].
[1] https://github.com/steveicarus/iverilog/issues/376#issuecomment-709907692
Signed-off-by: Sean Anderson <seanga2@gmail.com>