Ethernet: Media Independent Interface (MII)
With Fast Ethernet, Ethernet introduced something new. Previously, the Physical Signaling layer was integrated into the MAC and connected to the actual media through a Media Attachment Unit. While the user could switch between twisted pair (10Base-T), thinnet (10Base2), thicknet (10Base5), or even fiber (10Base-F) simply by changing MAUs, switching to a different encoding (e.g. Fast Ethernet) would require a completely new interface.
Instead of requiring new networking equipment to manage the PCS of each individual protocol, the MAC communicates with the PHY through a Media Independent Interface (MII). Now, free of protocol-specific encodings, different line rates and protocols could be selected by simply switching between different PHYs, leaving the MAC unaffected. While interchangeable PHYs are now the domain of high-end networking (e.g. SFP modules), MII and its derivatives are the primary mechanism for connecting integrated MACs (and FPGAs) to commodity Ethernet transceivers.
The expression Media Independent Interface can be a little vague. It can either describe the specific interface defined in Clause 22, used for 10 megabit and 100 megabit Ethernet, or serve as a category of all such interfaces. In the latter case, 802.3 will abbreviate it as xMII. For this chapter, MII will refer to the specific interface from Clause 22.
When originally proposed, MII was intended to facilitate the construction of PHYs as a Line Replaceable Unit (LRU). As such, a standardized connector is included in 802.3, covered by Clauses 22.4 (Electrical), 22.5 (Power Supply), and 22.6 (Connector). It is unlikely anyone will make use of this connector. Instead, most modern uses of MII will be chip-to-chip over a single PCB.
Note: The most recent version of the 802 standards is available from the IEEE Get program at no cost. It is highly advised that anyone working with Ethernet download a copy of 802.3 (Wired Ethernet).
Signaling (Clause 22.2)
MII is defined by sixteen individual signals.
Each direction (transmit and receive) has seven signals: a PHY-provided clock (RX_CLK/TX_CLK), a valid/enable signal (RX_DV/TX_EN), an error signal (RX_ER/TX_ER), and a four-bit data bus (RXD/TXD).
The last two signals, Carrier Sense (CRS) and Collision Detected (COL), are used for half-duplex operation.
The transmit and receive paths are largely symmetric.
RX_DV/TX_EN indicate when the medium is transmitting a packet and RX_ER/TX_ER indicate the presence of a special condition.
Together, the combination of the valid/enable and error signals controls the interpretation of the data bus:
DV/EN | ER | Meaning | RXD/TXD |
---|---|---|---|
0 | 0 | Bus Idle | (Do Not Care) |
0 | 1 | Control Word | Control Word |
1 | 0 | Transmit Data | Data Nibble |
1 | 1 | Transmit Error | (Do Not Care) |
Under normal operation, RX_ER/TX_ER is held low.
RX_DV/TX_EN is raised to indicate when a packet is being sent (including during the preamble and SFD).
As discussed previously, Ethernet is primarily a little endian protocol.
Each data byte is sent least significant nibble first with the most significant bit stored in position three and the least significant bit stored in position zero.
For example, the packet from the introduction (ending with FCS 69 70 39 BB) would be sent:
SIGNAL | | 1 | 2 | … | 13 | 14 | 15 | 16 | … | 141 | 142 | 143 | 144 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
TX_EN | 0 | 1 | 1 | … | 1 | 1 | 1 | 1 | … | 1 | 1 | 1 | 1 | 0 |
TXD[3:0] | X | 5 | 5 | … | 5 | 5 | 5 | D | … | 9 | 3 | B | B | X |
TXD[3] | X | 0 | 0 | … | 0 | 0 | 0 | 1 | … | 1 | 0 | 1 | 1 | X |
TXD[2] | X | 1 | 1 | … | 1 | 1 | 1 | 1 | … | 0 | 0 | 0 | 0 | X |
TXD[1] | X | 0 | 0 | … | 0 | 0 | 0 | 0 | … | 0 | 1 | 1 | 1 | X |
TXD[0] | X | 1 | 1 | … | 1 | 1 | 1 | 1 | … | 1 | 1 | 1 | 1 | X |
Sent in this order, it is possible to compute the CRC nibble-at-a-time without any buffering.
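As a rough illustration of the nibble ordering and the buffer-free CRC, the following Python sketch splits bytes into nibbles (least significant first) and updates a CRC-32 four bits at a time using a 16-entry table. The function names are made up for this example, and the table-driven form is only one of several equivalent ways a MAC might implement it; the result is checked against `zlib.crc32`.

```python
import zlib

POLY = 0xEDB88320  # CRC-32 polynomial in reflected (LSB-first) form


def frame_to_nibbles(frame: bytes) -> list[int]:
    """Split each byte into two 4-bit values, least significant nibble first."""
    nibbles = []
    for byte in frame:
        nibbles.append(byte & 0x0F)         # low nibble is driven first
        nibbles.append((byte >> 4) & 0x0F)  # high nibble follows one clock later
    return nibbles


def _build_table() -> list[int]:
    """Precompute the effect of shifting four bits through the CRC register."""
    table = []
    for value in range(16):
        crc = value
        for _ in range(4):
            crc = (crc >> 1) ^ POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


NIBBLE_TABLE = _build_table()


def crc32_nibble_step(crc: int, nibble: int) -> int:
    """Advance the CRC register by one MII clock (one nibble)."""
    return (crc >> 4) ^ NIBBLE_TABLE[(crc ^ nibble) & 0xF]


def fcs_from_nibbles(nibbles) -> bytes:
    """Run the CRC over a nibble stream and return the four FCS bytes in wire order."""
    crc = 0xFFFFFFFF
    for nibble in nibbles:
        crc = crc32_nibble_step(crc, nibble)
    return (crc ^ 0xFFFFFFFF).to_bytes(4, "little")


# Hypothetical 60-byte frame body (preamble and SFD are not part of the CRC).
payload = bytes(range(60))
assert fcs_from_nibbles(frame_to_nibbles(payload)) == zlib.crc32(payload).to_bytes(4, "little")
```

Because each step consumes exactly one nibble, the same update can run once per MII clock while the frame streams past.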
Even though the standard preamble is 7 bytes of 0x55 followed by the SFD, there is no guarantee you will receive this on MII.
The preamble may be truncated, missing entirely, or possibly a non-integer number of bytes.
It is advised that the MAC accept any number of 0x5 (0101) prior to the appearance of the SFD, 0xD (1101), which establishes the actual byte alignment.
As the interface is nibble-based, it is possible to encode frames with a non-integer number of bytes. Transmitting a trailing half-byte is implementation-defined and the PHY is not required to handle it in any specific manner (Clause 22.2.3.5). On reception, a trailing half-byte is to be truncated and the resulting frame reported as an alignment error if it fails the FCS (Clause 4.2.4.2.1).
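A minimal Python sketch of the receive-side alignment described above: skip any number of 0x5 nibbles, align on the 0xD nibble, then pair the remaining nibbles into bytes and drop a trailing half-byte. The function name and the choice to reject any other nibble seen before the SFD are assumptions made for this example, not requirements taken from the standard.

```python
def receive_frame(nibbles):
    """Return (frame_bytes, had_dangling_nibble), or None when no SFD is found.

    `nibbles` holds the RXD values captured while RX_DV was high.
    """
    stream = iter(nibbles)
    for nibble in stream:
        if nibble == 0xD:
            break            # SFD nibble: byte alignment starts after this
        if nibble != 0x5:
            return None      # assumption: anything else before the SFD is rejected
    else:
        return None          # ran out of nibbles without ever seeing the SFD

    body = list(stream)
    dangling = len(body) % 2 == 1
    if dangling:
        body.pop()           # truncate the trailing half-byte

    # The first nibble of each pair is the low half of the byte.
    frame = bytes(lo | (hi << 4) for lo, hi in zip(body[0::2], body[1::2]))
    return frame, dangling
```

A caller would then check the FCS of the returned bytes and, if the check fails and the dangling flag is set, report the frame as an alignment error.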
In half-duplex operation, whether transmit data is looped back onto the receive bus is implementation-defined unless loopback is explicitly enabled through the management interface. In full-duplex operation, looping transmit data back is expressly forbidden unless loopback mode is enabled.
The remaining signals, CRS and COL, are used in half-duplex operation.
They are undefined in full-duplex operation and must be ignored by the reconciliation layer.
Many PHYs will signal them identically in both modes.
Clocking (Clause 22.3)
Both clocks are sourced from the PHY and run at 25% of the target bitrate (e.g. 2.5 MHz / 400 ns for 10Base-T, 25 MHz / 40 ns for 100Base-T) with a duty cycle between 35% and 65%.
The transmit clock, TX_CLK, is always sourced from the local reference oscillator and should have an accuracy of 100 ppm.
The MAC is expected to drive the transmit signals on the rising edge of TX_CLK.
Clause 22.3.1 specifies a transition window of 0 ns (min) to 25 ns (max) from the rising edge.
As this clocking scheme is system synchronous (from the perspective of the MAC), it is important that the routing of the clock be as short as possible while matching the data lines to each other. An example SDC would be:
# Requirements from 802.3
set txclk_period 40
set txd_hold 0
set txd_valid 25
# Routing latencies of TX_CLK and TXD/TX_EN/TX_ER
# TODO: Update these for your PCB
set txclk_route_max 0
set txclk_route_min 0
set txd_route_max 0
set txd_route_min 0
# Generate constraints
create_clock -name TX_CLK -period $txclk_period [get_ports TX_CLK]
create_clock -name TX_CLK_virt -period $txclk_period
set_output_delay -clock TX_CLK_virt \
-min [expr {$txd_route_min + $txclk_route_min - $txd_hold}] \
[get_ports {TX_EN TX_ER TXD[*]}]
set_output_delay -clock TX_CLK_virt \
-max [expr {$txd_route_max + $txclk_route_max - $txd_valid + $txclk_period}] \
[get_ports {TX_EN TX_ER TXD[*]}]
By contrast, the receive clock, RX_CLK, may be sourced from the local oscillator or the recovered clock from the peer and may be suppressed under Low Power Idle (LPI).
Due to this, it may shift in frequency, phase, or disappear entirely based upon the link state.
The MAC is expected to capture the signals on the rising edge of RX_CLK.
Clause 22.3.2 specifies setup and hold times of 10 ns from the rising edge.
As this clocking scheme is source synchronous, the data lines should be length matched to the clock. An example SDC would be:
# Requirements from 802.3
set rxclk_period 40
set rxd_hold 10
set rxd_setup 10
# Generate constraints
create_clock -name RX_CLK -period $rxclk_period [get_ports RX_CLK]
create_clock -name RX_CLK_virt -period $rxclk_period
set_input_delay -clock RX_CLK_virt -min $rxd_hold \
[get_ports {RX_DV RX_ER RXD[*]}]
set_input_delay -clock RX_CLK_virt \
-max [expr {$rxclk_period - $rxd_setup}] \
[get_ports {RX_DV RX_ER RXD[*]}]
The remaining two signals, COL and CRS, are asynchronous to both clocks, so users will need to explicitly synchronize them (generally to the transmit clock).
Data Errors (Clause 22.2.2.5, 22.2.2.10)
Encoding errors can be indicated by the use of RX_ER or TX_ER.
When asserted during a packet (RX_DV/TX_EN are high), this indicates an issue with that specific byte position.
For example, on reception, the PHY detected a coding error and indicates the byte is invalid.
Alternatively, on transmission, the MAC suffered a buffer underflow and needs to spoil the packet to ensure its peer will not mistakenly interpret it as a valid packet.
The specific value of the data bus is undefined during a data error.
Note: These signals are often absent on the MACs of low-end microcontrollers.
If not used, RX_ER is left floating and TX_ER is tied to ground.
Special Conditions (Clause 22.2.2.4, 22.2.2.8)
Outside of a packet (RX_DV/TX_EN is low), the error signal (RX_ER/TX_ER) is used to indicate the presence of special conditions on the medium.
Most of these will be discussed in the following sections.
Value | Transmit | Receive |
---|---|---|
0000 | Reserved | Normal Interframe |
0001 | Assert LPI | Assert LPI |
0010 | PLCA BEACON | PLCA BEACON |
0011 | PLCA COMMIT | PLCA COMMIT |
1110 | Reserved | False Carrier |
False Carrier (1110) typically indicates a coding error on the part of the remote peer (Clause 24.2.4.4.2).
This can generally be ignored except for logging purposes but may indicate hardware malfunction.
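To tie the signal combinations together, here is a small Python sketch that classifies one sampled receive cycle from RX_DV, RX_ER, and RXD using the tables in this chapter. The function name and the returned strings are illustrative only; a real MAC would act on these conditions rather than label them.

```python
# Control words seen on RXD while RX_DV is low and RX_ER is high.
CONTROL_WORDS = {
    0b0000: "normal interframe",
    0b0001: "assert LPI",
    0b0010: "PLCA BEACON",
    0b0011: "PLCA COMMIT",
    0b1110: "false carrier",
}


def decode_rx(rx_dv: bool, rx_er: bool, rxd: int) -> str:
    """Classify a single RX_CLK sample of the receive bus."""
    if rx_dv:
        # Inside a packet: RX_ER marks the current position as invalid.
        return "data error" if rx_er else "data"
    if not rx_er:
        return "idle"
    # Outside a packet with RX_ER high: RXD carries a control word.
    return CONTROL_WORDS.get(rxd & 0xF, "reserved")
```

For example, `decode_rx(False, True, 0b0001)` reports the Assert LPI condition, while any unlisted code falls through to "reserved".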
Link Configuration
The fundamental properties when configuring the interface, be it manually or through autonegotiation, are the following:
- Link Speed. As all clocks are provided by the PHY, it is rarely necessary for the MAC to know which speed has been negotiated.
- Link Duplex. Half-duplex requires the implementation of CSMA/CD on the part of the MAC.
- Energy Efficient Ethernet. EEE will result in the potential generation of LPI sequences by the peer, which may result in RX_CLK being halted. It also allows the local MAC to generate LPI sequences of its own to reduce power consumption.
These properties are not available in-band and need to be accessed through the management interface.
Crossover
Crossover in this context refers to connecting two devices of the same class (PHY or MAC) directly. For example, connecting the TX of one MAC directly to the RX of a second MAC without an intervening PHY, or using a pair of PHYs as a media converter.
Unlike later MII variants, all clocks are provided by the PHY. This means that PHY-to-PHY will have conflicting clocks while MAC-to-MAC will have no clock. While external clocks can be provided in the latter case, the setup and hold requirements would likely not be met in a naïve configuration.
Unless one of the peers in this arrangement can be configured to reverse its interface direction, there are limited options.
PHY-to-PHY will require active logic to perform clock domain crossing, including an elastic buffer to address potential clock skew.
MAC-to-MAC can be connected directly, so long as an external clock is provided and opposite polarities of it are connected to the TX_CLK and RX_CLK inputs to mitigate differences in timing.
Energy Efficient Ethernet (Clause 22.7, 78)
Under 100Base-TX and later, the transmitter runs continuously in full duplex operation, even when no packet is being transmitted. Energy Efficient Ethernet (EEE) is a mechanism by which the transmitter can be disabled during periods of extended inactivity, reducing power consumption. To maintain the link, the PHY will periodically enable its transmitter to send a refresh signal.
First, EEE support needs to be negotiated with the peer. On some PHYs, this is enabled by default. On others, some form of configuration is required. This is normally handled through the Clause 45 register sets MMD3 and MMD7.
When the peer signals it is entering Low Power Idle (LPI), the local PHY will report this to the MAC by signalling Assert LPI, holding RX_DV low, RX_ER high, and 0001 on RXD.
Once the PHY has indicated this condition for at least nine clock cycles, it may halt RX_CLK until the peer leaves LPI.
When the local MAC is idle, it may request the PHY enter LPI in a similar manner: it pulls TX_EN low, TX_ER high, and loads 0001 onto TXD.
However, when it wishes to resume transmission, it cannot do so immediately upon releasing Assert LPI.
The peer needs time to synchronize its clock recovery and descrambler.
The minimum required time is specified in Clause 78 as Tw_sys_tx, which is provided for each PHY in Table 78-4.
For the PHYs covered by Clause 22 MII, these times are:
- For 10Base-T1L, this is 270 μs (675 clock cycles).
- For 100Base-TX, this is 30 μs (750 clock cycles).
Failure to meet these timing requirements may result in data loss as the peer may not have been able to complete synchronization.
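As a sketch of how a MAC might enforce this delay, the Python model below converts the wake times quoted above into MII clock counts and holds off transmission for that many clocks after Assert LPI is released. The class and function names are invented for this example, and the per-clock `tick` interface is just a software stand-in for a hardware counter.

```python
# Wake times and MII clock periods for the PHYs listed above, in nanoseconds.
TW_SYS_TX_NS = {"10Base-T1L": 270_000, "100Base-TX": 30_000}
MII_CLOCK_PERIOD_NS = {"10Base-T1L": 400, "100Base-TX": 40}


def wake_clocks(phy: str) -> int:
    """Number of TX_CLK cycles to wait after releasing Assert LPI."""
    wait_ns = TW_SYS_TX_NS[phy]
    period_ns = MII_CLOCK_PERIOD_NS[phy]
    return -(-wait_ns // period_ns)  # ceiling division


class LpiWakeGate:
    """Blocks transmission until the wake time has elapsed."""

    def __init__(self, phy: str) -> None:
        self.reload = wake_clocks(phy)
        self.remaining = 0

    def tick(self, asserting_lpi: bool) -> bool:
        """Advance one TX_CLK; return True when TX_EN may be raised."""
        if asserting_lpi:
            self.remaining = self.reload  # re-arm while LPI is requested
            return False
        if self.remaining:
            self.remaining -= 1           # still inside the wake window
            return False
        return True
```

With the values above, `wake_clocks("10Base-T1L")` evaluates to 675 and `wake_clocks("100Base-TX")` to 750, matching the cycle counts in the list.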
Many embedded MACs do not provide TX_ER, let alone implement LPI.
As a result, many PHYs will allow software to manually enter LPI through a management action.
10Base-T does not transmit continuously so does not need to implement LPI. Instead, EEE introduced 10Base-Te, which reduces the transmit amplitude but is otherwise identical. As it can freely interop with traditional 10Base-T, it does not need to be negotiated and many PHYs will enable it by default.
Note: I do not have personal experience with EEE. This section is simply a summarization of the standard.
Half-Duplex (Clause 4.2.3.2, 22.2.2)
Half-duplex is largely extinct for desktop usage, with most protocols dedicated to shared media having been retired. It may still appear when autonegotiation is disabled or incorrectly configured. Failure to properly implement half-duplex under these conditions may lead to significant difficulty transmitting data across the link when one peer continuously falls back into collision recovery.
However, half-duplex is an essential component of 10Base-T1S (Clause 147). This is a special-purpose Ethernet interface intended to replace CAN, using a shared media and an optional scheduling scheme to provide a similar system of prioritization. However, even when PLCA is in use, failure of node zero will result in the network reverting to classic half-duplex operation.
The two asynchronous signals, CRS and COL, are for the express purpose of managing half-duplex communication:
- CRS, Carrier Sense, will be asserted when the medium is non-idle (i.e. someone is transmitting, possibly ourselves). This is a distinct signal from RX_DV owing to latency within the PHY.
- COL, Collision Detected, will be asserted when the PHY detects a collision. This need not necessarily be a collision caused by our own device.
Medium access is described by Clause 4.2.3.2 and has three core phases:
- Deference (Clause 4.2.3.2.1)
- Collision Detection (Clause 4.2.3.2.4)
- Backoff and Retransmission (Clause 4.2.3.2.5)
This algorithm is known as Carrier Sense Multiple Access with Collision Detection (CSMA/CD).
Deference means that transmission will not begin so long as the medium is active.
Instead, the MAC will monitor the CRS signal.
While this signal is asserted, the MAC will defer its own transmission until CRS is deasserted plus an additional number of TX_CLK periods equal to the InterPacket Gap (IPG) of 96 bits (24 clock cycles).
Once this period has elapsed, and a packet is ready for transmission, the MAC will begin transmitting the packet even if the carrier sense has been reasserted.
This latter requirement is to ensure fair access to the media.
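The deference rule can be modelled per TX_CLK as in the following Python sketch. It is a simplification under stated assumptions: CRS is taken as already synchronized to TX_CLK, the gap counter simply restarts whenever carrier returns before the gap completes, and the class name and `tick` interface are made up for this example.

```python
IPG_CLOCKS = 24  # 96 bit times at four bits per TX_CLK


class Deference:
    """Tracks the interpacket gap and decides when a pending frame may start."""

    def __init__(self) -> None:
        self.gap = 0  # conservatively assume the medium was busy at reset

    def tick(self, crs: bool, frame_pending: bool) -> bool:
        """Advance one TX_CLK; return True when transmission should begin."""
        if self.gap >= IPG_CLOCKS and frame_pending:
            # The gap has already elapsed, so start even if CRS just reasserted.
            self.gap = 0
            return True
        if crs:
            self.gap = 0          # medium busy: restart the gap
        elif self.gap < IPG_CLOCKS:
            self.gap += 1         # medium idle: count toward the gap
        return False
```

If no frame is pending when the gap completes and carrier later returns, the counter resets and the MAC defers again, which matches the description above.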
Once the transmission has commenced, the MAC will monitor the COL signal for the duration of the transmission.
Should this signal be asserted, the PHY has detected a collision and the MAC must terminate the packet after transmitting an additional 32 bits (8 clock cycles) to ensure the collision is detected by all parties.
The content of this jam pattern is not important, so long as it does not (intentionally) hold the FCS for a valid packet.
A good recommendation is to emit an intentionally spoiled FCS.
Upon a collision, the MAC is permitted to make up to sixteen attempts to transmit the packet (including the first). However, instead of transmitting the packet immediately, it will enter into a random delay. The bounds of this delay expand exponentially with each retransmission attempt according to the formula:
$$0 \le r < 2^{\min(n, 10)}$$
Where r is the number of slot times and n is the retransmission attempt (presumably one-based but Clause 4.2.3.2.5 isn’t clear). As the upper bound is a power of two, this makes it straightforward to generate in hardware by simply masking off the appropriate number of bits from a random generator, such as a Linear Feedback Shift Register (LFSR) formed from the FCS generation logic. The slotTime for 10 and 100 megabit is 512 bits (128 clocks).
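The bit-masking approach can be sketched in Python as follows. This assumes a one-based retransmission count (which, as noted, the clause does not make explicit) and caps the exponent at ten as the standard's backoff limit does; `random_bits` stands in for whatever free-running generator the hardware provides, and the function name is hypothetical.

```python
SLOT_TIME_CLOCKS = 128   # 512 bit times at four bits per clock
MAX_ATTEMPTS = 16        # total transmission attempts, including the first
BACKOFF_LIMIT = 10       # exponent stops growing after ten retransmissions


def backoff_clocks(attempt: int, random_bits: int) -> int:
    """Return the backoff delay in MII clocks for a given retransmission.

    attempt:     retransmission number, assumed to start at 1
    random_bits: any non-negative integer sampled from a random source
    """
    if not 1 <= attempt < MAX_ATTEMPTS:
        raise ValueError("retransmission attempt out of range")
    k = min(attempt, BACKOFF_LIMIT)
    r = random_bits & ((1 << k) - 1)  # keep the low k bits: 0 <= r < 2**k
    return r * SLOT_TIME_CLOCKS
```

Masking keeps the logic to a single AND per attempt, which is why a power-of-two bound is convenient in hardware.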
Per Clause 4.2.3.2.2, there is a distinction between collisions that occur within the first slot time and those that come after. Late collisions indicate a network configuration error (e.g. someone’s not engaging in CSMA/CD) and are reported in a separate category.
Note: I do not have personal experience with half-duplex operation. This section is simply a summarization of the standard.
Physical Layer Collision Avoidance (Clause 148)
As mentioned in the previous section, 10Base-T1S (Clause 147) has an optional mode to permit a CAN-like prioritization scheme on the shared media. This scheme is in addition to CSMA/CD. Non-participating hosts can share the mixing domain with PLCA hosts and PLCA will reduce to CSMA/CD in the case of failure. In effect, PLCA is a modified scheme for transmit deference.
The basic concept:
- When a node comes out of reset, there is an initial wait to synchronize with the mixing domain.
- The highest priority node will send out periodic BEACONs (MII command 0010).
- Following the beacon are a number of equally sized time slots, each corresponding to a transmit opportunity of decreasing priority.
- A host participating in PLCA can either use its transmit opportunity by sending out a packet or COMMIT (MII command 0011), or yield the slot by doing nothing.
- If the beacon is missing, hosts revert to traditional CSMA/CD behavior.
The actual state machines in Clause 148 are fairly complex because they serve to describe an implementation that lives entirely within the MII reconciliation layer. Permitting some portion of it to live within the MAC would greatly simplify the implementation and improve reliability.
When constructing the mixing domain, a few things need to be established. These are variables used within the controlling framework and aren’t directly exposed on the network.
- The maximum identifier for the mixing domain (aPLCANodeCount). This is used by node zero to determine the period between BEACONs. The default value (Clause 30.16.1.1.3) is eight.
- The unique identifier for each node (aPLCALocalNodeID), between 0 and 255 (inclusive). Higher numbered identifiers are of lower priority with node zero being the one responsible for sending BEACONs. The default value (Clause 30.16.1.1.4) is 255.
- The length of the window for each transmit opportunity (aPLCATransmitOpportunityTimer). This is an integer between 1 and 255 bit times (inclusive) and needs to be able to absorb the propagation times of the medium and latency of participating PHYs. The default value (Clause 30.16.1.1.5) is 32 bit times (3.2 μs, 8 MII clock cycles). It is unclear how to manage a value that is not a multiple of four.
- The maximum number of additional packets a node can send in a burst (aPLCAMaxBurstCount). This is an integer between 0 and 255 (inclusive). The default value (Clause 30.16.1.1.6) is zero.
- The maximum time between packets in a burst (aPLCABurstTimer). This is an integer between 0 and 255 bit times (inclusive). The default value (Clause 30.16.1.1.7) is 128 bit times. It is unclear how to manage a value that is not a multiple of four.
The control node, node zero, has the following scheduling behavior:
1. On reset, it will wait one transmit opportunity and, assuming the medium is idle, send out a BEACON.
   - Note: There is a discrepancy between the text and the state diagram. The text in Clause 148.4.4.1 states to wait one transmit opportunity while following the state diagram in Clause 148.4.4.6 would have node zero wait for aPLCANodeCount periods. This is because the entry point, DISABLE, initializes curID to zero and RECOVER does not change this value before launching it into the core of the state machine.
2. The BEACON is sent for 20 bit periods (5 MII clock cycles). This synchronizes the mixing domain’s schedule and marks the beginning of the transmit opportunity for node zero (itself).
3. Wait for the BEACON condition to clear.
4. Wait for one of three actions:
   - If the medium goes active without COMMIT or a packet, we return to step (1) as this indicates a possible loss of synchronization.
   - If the medium goes active with COMMIT or a packet, we increment the node index once the medium is released.
   - After a period of aPLCATransmitOpportunityTimer elapses, the current node index is incremented. If the new node index is equal to or greater than the node count (aPLCANodeCount), return to step (2) and send another BEACON.
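For a feel of the timescales involved, the Python sketch below estimates the length of one fully idle PLCA cycle from the figures quoted in this section and models the node index wrap-around that triggers the next BEACON. It is a rough estimate only: it ignores the time spent waiting for the BEACON condition to clear and any PHY latency, and the function names are hypothetical.

```python
BIT_TIME_NS = 100   # one bit time at 10 Mb/s
BEACON_BITS = 20    # BEACON duration quoted above


def idle_cycle_bits(node_count: int = 8, to_timer_bits: int = 32) -> int:
    """Bit times for one cycle in which every node yields its opportunity."""
    return BEACON_BITS + node_count * to_timer_bits


def advance(cur_id: int, node_count: int) -> tuple[int, bool]:
    """Increment the node index; return (new_index, send_beacon)."""
    nxt = cur_id + 1
    if nxt >= node_count:
        return 0, True   # all opportunities used: node zero sends a BEACON
    return nxt, False


cycle_ns = idle_cycle_bits() * BIT_TIME_NS  # default parameters
```

With the default parameters this gives 276 bit times, or 27.6 μs, per idle cycle.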
The remaining nodes have the following scheduling behavior:
1. On reset, the node will assume PLCA is not active until it sees a BEACON, defaulting to CSMA/CD.
2. Upon receiving a BEACON, the node will enter PLCA mode and (re)start a 4 μs timer, invalid_beacon_timer. This establishes the transmit opportunity for node zero.
3. We wait for one of five actions:
   - If we receive a BEACON, we return to step (2), resetting the timer and node index.
   - If the medium goes active without receiving a COMMIT, BEACON, or packet, we return to step (1) as this indicates a possible loss of synchronization.
   - If the medium goes active with COMMIT or a packet, we increment the node index once the medium is released.
   - After a period of aPLCATransmitOpportunityTimer elapses, the current node index is incremented.
   - The timer we set in step (2), invalid_beacon_timer, elapses, placing PLCA into an inactive state.
4. Once PLCA has been inactive for at least 13 ms (technically, 130,090 bit times), it is considered to have failed and the mixing segment reverts to legacy CSMA/CD operation. This constitutes the purpose of the state machine described in Clause 148.4.6.
When using one’s transmit opportunity, the following logic is used:
- COMMIT replaces idle to hold the transmit opportunity.
- Additional packets can be sent, up to a limit of aPLCAMaxBurstCount, with a maximum of aPLCABurstTimer between them.
- If the medium is active during our transmit opportunity, this is considered a collision and voids this opportunity. Transmission is delayed until our next transmit opportunity.
The state diagram in Clause 148.4.5.7 includes an elastic buffer to permit use of a MAC not aware of PLCA. It makes extensive use of induced collision indications to force retransmissions when buffer capacity is exhausted or the transmit opportunity is violated.
Note: I do not have personal experience with 10Base-T1S. This section is simply a summarization of the standard.