This adds the transmit half of a MAC, supporting 100M and half-duplex.
It's roughly analogous to the axis_(x)gmii_tx modules in Alex
Forencich's ethernet repo. I've taken the approach of moving all state
into the state variable. All decisions are made once and have a
different state for each path. For example, instead of checking against
a "bytes_sent" variable to determine what to do on collision, we have a
different state for each set of actions.
This whole module is heinously complex, especially because of the many
corner cases caused by the spec. I have probably not tested it nearly
enough, but the basics of sending packets have mostly had the bugs wrung
out.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Although the least-significant bit of sd_delay is driven by an SB_IO (if
we are synthesizing), the other bits need to be initialized.
Fixes: d8ce165 ("pmd: Delay signal_status/detect until data is valid")
Signed-off-by: Sean Anderson <seanga2@gmail.com>
The condition for determining s_axis_ready only looks at whether we are
currently full, not whether we will be full on the next cycle (which is
what matters). Make it take into account whether we are going to
increment s_ptr during the current cycle. Also increase the ratio to
ensure we trigger this case, as a ration of 2 doesn't make the slave
slow enough to catch this.
Fixes: 52325f2 ("Add AXI stream replay buffer")
Signed-off-by: Sean Anderson <seanga2@gmail.com>
If the lower bits of m_ptr is the same as s_ptr when we get s_axis_last
(that is, we are full), then we will immediately end (since m_ptr will
equal last_ptr). Fix this by including all of s_ptr in last_ptr.
Fixes: 52325f2 ("Add AXI stream replay buffer")
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This adds the integrated PMD module to be used with the DP83223. It
contains NRZI en/decoding as well as the I/O interfaces. The rx I/O was
added a while back, and the tx is just the I/O cell.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Most other modules in the recieve path are reset when signal_status goes
low. This one didn't, and it was difficult to test because of this. In
particular, we need to ensure that this module behaves correctly when
switching between different bitstreams (such as for loopback).
While implementing this, I found some bugs in the way that nrzi_valid
was handled: it wasn't saved from when the transfer actually happened,
and the data wasn't qualified properly.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This implements an AXI stream buffer which allows replaying of the first
portion of each packet. The intent is to simplify the implementation of
CSMA/CD. This requires keeping 56 bytes of data to "replay" (slot time
minus the preamble). After these bytes are transmitted, we can only get
late collisions.
We always read from the buffer, as this simplifies the implementation
compared to some kind of hybrid fifo/skid buffer approach. The primary
design problem faced is in determining when it's OK to overwrite the
first byte in the packet. A naïve approach might be to allow overwriting
whenever the slave reads the last byte. However, in the case of a
54-byte packet, we will still need to allow replaying at this point (in
case there is a collision on the last byte). We can't just wait for
m_axis_ready to go high, because that would violate the AXI stream
protocol. To solve this, the slave must assert the done signal when it
is finished with the packet.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This module integrates the PCS with the descrambler, implements the PMA
(which is just the link monitor), and implements loopback and coltest
functions. This is more of the PCS/PMA, but the descrambler is
technically part of the PMD, so it's the "core" instead.
We deviate from the standard in one important way: the link doesn't come
up until the descambler is locked. I think this makes sense, since if
the descrambler isn't locked, then the incoming data will be gibberish.
I suspect this isn't part of the standard because the descrambler
doesn't have a locked output in X3.263, so IEEE would have had to
specify it.
Loopback is actually implemented in the PMD, but it modifies the
behavior in several places. It disables collisions (unless
coltest is enabled). Additionally, we need to force the link up (to
avoid the lengthy stabilization timer), but ensure it is down for at
least once cycle (to ensure the descrambler desynchronizes).
On the test side, we just go through the "happy path," as many of the
edge conditions are tested for in the submodule tests. Several of those
tests are modified so that their helper functions can be reused in this
test. In particular, the rx path is now async so that we can feed it
rx_data.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This adds an explicit false carrier signal. Trying to determine this
condition the MDIO signals is tricky because BAD_SSD can last over
several cycles of CE. To make things easier, add a signal which is high
only once per event.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Even when signal_status is low, we should still generate data. This
keeps the PCS processes moving along.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This adds support for counters for interesting conditions (disconnects,
PMD phase wraparounds, errors, etc). All the counters are 15 bits
instead of 16, because 16-bit counters have an fmax of 110MHz or so. All
the counters live behind a condition because they ~double the number of
resources used.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
When increasing the delay for the recieved data, I forgot to increase
the delay for the signal status as well. Fix this.
Fixes: c02d3f3 ("pmd_io: Calculate wraparound based on state and not state_next")
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This better reflects that this is an interface intended to be used with
the DP83223. While we're at it, refactor the module to just handle the
the recieve portion.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
For easier integration, split the PCS into its rx and tx components.
This was already done on the module level, but now they live in separate
files.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Basing the wraparound calculation on state_next is causing problems for
timing. Do it with the state calculations instead. We need to add
another cycle of latency to make this work, since we can't output the
data correctly without knowing the wraparound.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
These inputs were incorrectly marked as LVDS, which was causing problems
for placement. Make them single-ended.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
When writing this initially, I tried to remove some duplicate
conditionals by working with idle_counter_next. However, yosys isn't
smart enough to rewrite the calculation in terms of idle_counter, so do
it ourselves. This breaks up the critical path.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
The critical path often includes the unlock timer. Switch to an lfsr
implementation. This saves around 20 LUTs and reduces the critical path
from the carry chain (and the or reduction) to just the and reduction.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
While manually dumping signals with a macro works OK for standalone
modules, it doesn't work when multiple modules are included. Instead,
create a second top-level module to dump signals. Inspired (once again)
by [1].
[1] https://github.com/steveicarus/iverilog/issues/376#issuecomment-709907692
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This aligns the signal naming with what is used by other modules (IEEE
names for external signals, and something else for internal).
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Yosys is very verbose, so I usually run it quietly. However, it may be
usefult to review synthesis logs when debugging.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This isn't really useful for most modules (since the placement info is
if they were the only thing instantiated), but it should be a good base.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
As it turns out,
reg foo = 0;
is not the same as
reg foo; initial foo = 0;
but instead is equivalent to
reg foo; always @(*) foo = 0;
This is rather silly. Convert all existing (lucky) examples to the
second form.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
The singal-ended to differential conversion will be done by the
transceiver (by the ECL interface circuit).
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This (un)locking logic was on the critical path. Break it up into
multiple parts to allow achieving our desired clock frequency.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This adds a module implementing the the MII management functions (the
MDIO regs). For the moment, we just implement the standard registers.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
The 802.3.22.2.4.3 requires that the phy not respond to reads of and
ignore writes to unimplemented extended registers. When writing the mdio
module, I expected that such read/writes would not be acked by the
registers. However, that behavior is not especially nice for wishbone
masters which don't expect it. Instead, allow the slave to return an
error instead. We need an extra saved_err variable, since we might not
be able to set bad immediately (when ce is low).
Signed-off-by: Sean Anderson <seanga2@gmail.com>
The specification requires that the MII be isolated before the STA
clears the BMCR.ISOLATE bit. Add support for this to the MII I/O
modules.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This better reflects the function of the module (interfacing the
transciever via the I/O pins), and fits better with the naming scheme
used for other I/O modules.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
The actitecture is overall fairly similar to the receive interface,
except that the directions are mostly different. The timing is a bit
easier, since we control the ce signal. Data is sampled one clock before
tx_clk goes high, which is the earliest that it is guarantee'd to be
valid. We could get an extra half-clock by having tx_clk go high at the
negedge of clk, but it's unnecessary at the moment.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This generates the appropriate output for MII receive signals. Because
we don't have a clock synchronous to the recieved data, we may
occasionally have some cycles which are 32 ns or 48 ns long (instead of
the nominal 40 ns). This distorts the duty cycle to 38% or 58%,
respectively, which is within the specified 35% to 65%. This does change
the frequency to either 31 MHz or 21 MHz, respectively, which *is* a
violation of the spec. This could be avoided by introducing a FIFO to
smooth out any variations in jitter, like what RMII does.
The generation of rx_clk is a bit tricky. We can use a combinatorial
signal for the posedge, since that is what the rest of the logic is
referenced to, However, we need to register the negedge to prevent an
early (or late) ce from modifying the duty cycle.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This module implements the I/O portion of the MII management interface.
The output is delayed by 2 clocks in order to ensure that the external
level shifter has switched directions before we drive it. The latency
increase (around 16 ns) is not consequential, since we have around 300
ns from the rising edge of MDC before MDIO has to be valid.
On the other end, the timing requirements for MDIO driven by the STA are
very lenient (for them); MDIO only has to be valid for 10 ns on either
side of the rising edge of MDC. This effectively means we must sample
MDIO synchronously to MDC (not easy with nextpnr), or oversample by 50x.
Fortunately, we have a 125 MHz clock which the rest of the phty runs off
of. However, this basically makes 10x oversampling with the MII clock
impossible.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This module implements the MII management interface ("MDIO"), and
translates frames into classic wishbone reads/writes. We use a
"state_counter" to keep track of how many additional bits we expect to
recieve before continuing on to the next field in the frame. We require
a preamble because it prevents ambiguity, and omitting it doesn't seem
to be very popular (seeing as it was removed for c45). Generally, even
if we find an error in the frame, we still procede through the states as
usual. This prevents any spurious reads/writes caused by misinterpreting
an unaligned data stream.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
This adds support for (de)scrambling as described in X3.263. The
scrambler is fairly straightforward. Because we only have to recognize
idles, and because the timing constraints are more relaxed (than e.g.
the PCS), we can make several simplifications not found in other
designs (e.g. X3.263 Annex G or DP83222).
First, we can reuse the same register for the lfsr as for the input
ciphertext. This is because we only need to record the scrambled data
when we are unlocked, and we can easily recover the unscrambled data
just by an inversion (as opposed to needing to align with /H/ etc).
Second, it is not critical what the exact thresholds are for locking an
unlocking, as long as certain minimums are met. This allows us to ignore
edge cases, such as if we have data=10 and valid=2. Without these
relaxed constraints, we would need to special-case this input to ensure
we didn't miss the last necessary consecutive idle. But instead we just
set the threshold such that one missed bit does not matter.
To support easier testing, a test input may be used to cause the
descramble to become unlocked after only 5us, instead of the mandated
361. This makes simulation go much faster.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
24.6.1 requires that CRS goes high fewer than 4 cycles after TX_EN goes
high. This means we need to assert tx when we enter then START_J state,
not when we actually transmit a /J/. This also has the upside of
simplifying the logic a bit.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
We will need this in every verilog file, so consolidate things a bit. In
terms of timescale, we need to modify the post-synthesis verilog
generation a bit in order to avoid the module's timecale being
inadverdently overwritten.
Signed-off-by: Sean Anderson <seanga2@gmail.com>
The PCS state machine is evaluated every cycle, but its outputs are only
registers when the rx_bits module indicates. However, the flush signal
is not registered and is instead combinatorial. Although it's OK to
evaluate the other outputs every cycle, we should only indicate if we
are actually going to change state.
Fixes: d351291 ("Initial commit")
Signed-off-by: Sean Anderson <seanga2@gmail.com>