Notes on USB 1.1 low-level protocol for FPGA implementation

This post was written by eli on August 19, 2017
Posted Under: FPGA,USB

Introduction

These are the consideration and design decisions I took when designing a transparent hub for low- and full-speed USB (that is, all covered by USB 1.1, and not high-speed as required by USB 2.0).

A transparent 1:1 hub is a device with one male USB plug, and one female plug. It basically substitutes an extension cable, and should not be visible or have any practical influence on the connected parties. The idea is to repeat the signals from one end to another by virtue of two ULPI PHY frontends, one connected to each side, and both connected to an FPGA in the middle.

This is not the recommended practice if all you want is to sniff the data: In that case, just connecting the D+/D- wires directly to a couple of 3.3V-compatible inputs of the FPGA in parallel to the cable is the way to go. Literally tapping the wire (you might need to suppress the chirps that upgrade the link to high-speed, if that’s an issue). A transparent hub is only required if intervention in the traffic is desired, on top of sniffing capabilities.

Requirements

Full operation is required, including connect / disconnect, suspend and resume. It may appear redundant to support suspend on a system that is never turned off or hibernated, but I’ve seen Linux suspending USB hubs when nothing is connected to its downstream ports. Once something is connected to one of the hub’s ports, it issues a resume (long K, see below) signal towards the host, and the host replies with a long K resume signal before SOF packets are sent. Besides, Windows supports a Selective Suspend feature which may suspend certain idle devices to save energy. These are just examples.
Both low and full speed must be supported. The logic will not be able to deduce which speed is in effect based upon pullup resistors from the device.
Reason I: In the specific project, the board has a pullup resistor only on the D+ wire going to the host. The solution was to swap the D+/D- wires on both sides when a low-speed device is used. As a result, the D+ wire seen by the FPGA will always be pulled up on the device side in either case.
Reason II: If a (regular) hub is connected as a device to the transparent hub, it always presents itself as a full-speed device with the pullup resistor. Low speed data is sent just by reducing the rate in this case, in both directions (see “SE0 detection when a hub is connected as a device” below)
There will be no additional intermediate elements (i.e. hubs) between the connected machines, but there may be hubs inside those.
The PHY device on both ends is TI’s TUSB1106 USB transceiver.

Comparison with a regular hub

A transparent hub, like a regular one, is required to repeat a variety of signals going from one end to another. The main challenges are to keep track of which side drives the wires, and prevent accidental signal level transitions (noise and differential signaling imbalance) from generating errors.

The USB spec defines maximal delays and jitter for a hub, which are then used to calculate the overall link jitter and turnaround delay. Since the transparent hub is designed to be the only element between the host and device (more or less, as the machines may be internal hubs), the total jitter and delay may be generated by this single element (again, more or less).

As for jitter, the USB 1.1 spec section 7.1.15 says: “Data receivers are required to decode differential data transitions that occur in a window plus and minus a nominal quarter bit cell from the nominal (centered) data edge position”. This is indeed reflected in the total jitter allowed, as presented in tables 7-2 and 7-3 of the same document (20 ns for full speed).

The turnaround delay for low and full speed is defined in USB 1.1 spec, 7.1.19, defining the timeout between 16-18 bit times (applies to both speeds). This calculation takes 5 hubs into consideration, as well as the device’s delay — in our case, the single element may add a few bits times of delay (I never calculated the exact figure, as my implementation went way below this figure).

The following functionalities are required from a regular hub, but can be omitted from a transparent one:

Suspend: A hub is required to put itself into a low-power state (Suspend) in the event of no activity on the bus (like any USB device). This isn’t necessary, as the transparent hub doesn’t consume current from the USB wire.
Babble / Loss of Activity (LOA) detection: A regular hub must detect if a device transmits junk on the wires, and by doing so, preventing the other devices (those connected to the same hub) from communicating with the host. The hub should kick in if the device doesn’t release the bus with an EOP before the end of the USB frame (once in 1 ms). A transparent hub doesn’t need to take action, since there are no neighboring devices competing for access. If a device is babbling or otherwise holding the wires, the host will soon detect that and take action.
Frame tracking: A regular hub must lock on the USB frames (by detecting SOF packets and maintaining a timer). However the purpose of this lock is to detect babble. Hence this isn’t required either.
Reset: A regular hub generates the SE0 signal required to reset a device when it’s hot-plugged to it. The transparent hub merely enables the pullup resistor on the upstream port in response to a hotplug on its downstream port, letting the host issue the reset.

Possible bus events

This is a list of events that need proper relaying to the other side.

The repeater should expect the following events from the host:

Packet (low or full speed): transition to K followed by SYNC and packet data
Long SE0 for reset (2.5 us, even though it’s expected to be several ms)
Keep-alive for low-speed device, 2 low-speed bits of SE0
Resume signaling (long K), followed by a low-speed EOP
PRE packets. See below.

The repeater should expect the following events from the device:

Packet (low or full speed): transition to K followed by SYNC and packet data
Long SE0 (2 us) = disconnection
Resume signaling (long K), followed by release of the bus (no EOP nor driven J).

Note that since the transparent hub doesn’t keep track of frame timing, it doesn’t detect 3 ms of no traffic, and hence doesn’t know when Suspend is required. Therefore, Suspend and Resume signaling is treated like any other signals, and no timeout mechanism is applied on long K periods.

One thing that might be confusing is that when the upstream port is disconnected, it might catch noise, in particular from the electrical network. Both single ended inputs may change states at a rate of 50 or 100 Hz (assuming a 50 Hz network), since no termination is attached to either ports. This is a no-issue, as there’s nothing to disrupt at that point, but may be mistaken for a problem.

Voltage transitions

Like any USB element, a transparent hub must properly detect transitions between J, K, and SE0 correctly, avoiding false detections.

The TUSB1106 transceiver presents the D+/D- wire pairs’ state by virtue of three logic signals: RCV, VP and VM. RCV represents the differential voltage state (J or K, given the speed mode), while VP and VM represent the single-ended voltages. The purpose of VP and VM is detecting an SE0 state, in which case both are low. The other possibilities are either illegal (i.e. both high, as single-ended ’1′ isn’t allowed per USB spec) or redundant (if they are different, RCV should be used, as it measures the differential voltage, rather than two single-ended voltages, and is therefore more immune to noise).

Any USB receiver is in particular sensitive to transitions between J, K and SE0. For example, a transition from J to K after idling means SOP (Start of Packet) and immediately causes the hub repeater to drive the other side. Likewise, a transition into SE0 while a packet is transmitted signals the end of packet (EOP).

The timing of the transitions is crucial, as a DPLL is maintained on the receiving side to lock on the arriving bits. In short, detecting voltage transitions timely and correctly is crucial.

The main difficulty is that switching from J to K essentially involves one of the D+/D- wires going from low to high, and the other one from high to low. Somewhere in the middle, they might both be high or both low. And if both are low, an SE0 condition occurs.

The USB 1.1 spec section 7.1.4 says: “Both D+ and D- may temporarily be less than Vih(min) during differential signal transitions. This period can be up to 14ns (TFST) for full-speed transitions and up to 210ns (TLST) for low-speed transitions. Logic in the receiver must ensure that that this is not interpreted as an SE0.”

So a temporary “detection” of SE0 may be false, and this is discussed next. But before that, can a transition into J or K also be false? Let’s divided it into two cases:

A transition from J to K or vice versa: This is detected by a change in the RCV input. The TUSB1106′s datasheet promises no transition in RCV when going into SE0 in its Table 5. So if RCV toggles when the previous state was J or K, it’s not an SE0. We’ll have to trust TI’s transceiver on this. Besides, going from J or K necessarily involves both wires’ voltage changes, but going to SE0 requires only one wire to move. So a properly designed receiver will not toggle RCV before both wires have changed polarity within legal high/low ranges (did I say TI?). This doesn’t contradict that VP and VM may show an SE0 momentarily, as already mentioned.
A transition from SE0 to J. This is detected by one of VM or VP going high. This requires one of the wires to reach the high voltage level. This can, in principle, happen momentarily due to single-ended noise (SE0 can be 20 ms long), so requiring that the VM or VP signal remains high for say, 41 ns before declaring the exit of SE0 may be in place. As discussed below, the uses of SE0 are such that it’s never required to time the exit of this state accurately.
A transition from SE0 to K. Even though this kind of transition is mentioned in the USB 1.1 spec’s requirement for the hub’s repeater (e.g. see section 11.6.2.2), there is no situation in the USB protocol for such transition to occur. I’ve also set a trigger for such event, and tried several pieces of electronics, none of which generated such transition. Consequently, there’s no reason to handle this case.

Avoiding false SE0 detection

The SE0 state may appear in the following cases:

EOP in either direction, following a packet. The length of the SE0 is approximately 2 bit times (depending on the speed mode), but should be detected after 1 bit time. See “SE0 detection when a hub is connected as a device” below for a possible problem with this.
End of a resume signal from host (K state for > 20 ms): A low-speed EOP (two low-speed bit times of SE0).
A long SE0 from host (should be detected at 2.5 us, generated as 10 ms): Reset the device
A long SE0 from device (should be detected at 2.0 us): Disconnect
Keep-alive signaling from host (low-speed only): 2 low-speed bits times of SE0 followed by J.

As mentioned above, a false SE0 may be present for 14 ns on full-speed mode (17% of a bit time) and 210ns in low-speed mode (32% of a bit time). The difference is because the rise and fall times of the D+/D- depend on the speed mode.

Except for detecting EOP in full-speed mode, it’s fine to ignore any SE0 shorter than 210 ns, as all other possibilities are significantly longer.

For detecting EOP in full-speed mode, the 14ns lower limit is enforced instead. If the speed mode can’t be derived from the pullup resistors (as was my case), it’s known from the packet itself. For example, from timing the second transition in the packet’s SYNC pattern.

The truth is, that looking at the output of the TUSB1106 USB transceiver when connected to real devices, it seems like the transceiver was designed to allow its user to be unaware of these issues. Clearly, an effort has been made to avoid these false SE0s in its design.

The VP and VM ports tend to toggle together, and in some cases (depending what USB device is tested) VP and VM were both high for a few nanoseconds on J/K transitions. This was even more evident in low-speed mode. Most likely, seeing both low at any time is enough to detect an SE0, despite the precautions required in the spec.

Also, the RCV signal toggled somewhere in the middle between VP and VM moving, or along with one of these, but never when SE0 was about to appear. So the unaware user of this chip could simply consider any movement in RCV as a true data toggle, and if VM and VP are low, detect SE0 without fussing too much about it.

Or, one can add these mechanisms like I did, just in case.

Summary of J/K/SE0 transition detection

The precautious way:

From J or K: If RCV toggles, register the transition to the opposite J/K symbol immediately. Ignore further transitions for 41 ns (half a full-speed bit) to avoid false transitions due to noise (I bet this is unnecessary — I’ve never seen any glitches on the RCV signal).
If VP and VM are both zero for 210 ns (or 14 ns, if in the midst of a full-speed packet), register a transition to SE0.
From SE0: If either VM or VP are high for 41 ns, register a transition to J.

Lazy man’s implementation (will most likely work fine):

From J or K: If RCV toggles, register the transition to the opposite J/K symbol. If VM and VP are both zero, register an SE0 (immediately).
From SE0: If either VM or VP are high for 41 ns, register a transition to J (immediately).

PRE packets

Section 8.6.5 states that a PRE packet should be issued by the host when it needs to make a low-speed transaction, when there’s a hub in the middle. The special thing with the PRE packet is that it doesn’t end with an SE0. Rather, there are 4 full-speed bit times of “nothing happening”, which is followed by a low-speed SYNC and packet.

This requires some special handling of this case, which actually isn’t described further down this post (it’s a bit complicated, quite naturally).

One tricky thing is that the state of the wires is always K after the PRE packet. In fact, it’s always K after then PID, because the SYNC itself is like transmitting 0x80, and the PID 8 bits itself always contains an even number of zeros, as half of it is the NOT of the other. So it’s always an uneven number of J/K toggles (7 zeros from SYNC, an even number from the PID part), leaving the wires at K.

The USB standard is a bit unclear on what happens after the PRE PID has been transmitted, but it’s quite safe to conclude that it has to switch to J immediately after the PRE packet has completed. The hub is expected to start relaying to the low-speed downstream ports from that moment on (up to 4 full-speed bit times later), and if the upstream port stands at K when that happens, a false beginning of SYNC will be relayed.

And indeed, this is what I’ve observed with real hardware: Immediately after the PRE packet’s last bit, the wires switch to a J state.

Implementation

These are the implemented states of transparent hub. The rationale is explained after the short outline of the states.

Disconnected — this is the initial state, and is also possibly invoked by virtue of SE0 timers explained below. The D+/D- pullup resistor is disconnected at the upstream port (to the host), and none of the ports is driven. When a pullup resistor has been sensed for 2.5us on the downstream port, a matching pullup resistor is enabled at the upstream port. Then wait for the upstream port to change from SE0 to J, or it will be interpreted as an SE0 signal. Then switch to the Idle state. The pullup resistor remains enabled on all other states.
Idle — None of the ports is driven. If a ‘K’ is sensed on the upstream port, switch to the “Down” state. If ‘K’ is sensed on the downstream port, switch to the “Up” state. In both cases, this is a J-to-K transition, so it takes place immediately when RCV toggles. An SE0 condition on either port switches the state to Host_SE0 or Device_SE0, whichever applies.
Down — J/K symbols are repeated from the upstream to the downstream port. This state doesn’t handle SE0. If a ‘J’ symbol is sensed continuously for 8 bit times, go to “Idle”, which isn’t a normal packet termination, but a result of a missed SE0. “Bit times” means low-speed bit times, unless a full-speed sync transition was detected at the first K-to-J transition during this state. The EOP is handled by the Host_SE0 state.
Up — J/K symbols are repeated from the downstream to the upstream port. Exactly like “Down”, only in the opposite direction. Same 8 bit timeout mechanism for returning to “Idle”
Host_SE0 — This state is invoked from Idle or Down states when an SE0 is sensed on the upstream port (after waiting TLST or TFST to prevent a false SE0). The downstream port is driven with SE0. When the SE0 condition on the upstream port is exited, switch to Host_preidle.
Host_preidle — This state drives the downstream port with a J for a single bit’s time (depends on the speed mode) and then switches the state machine to Idle. This completes EOPs and other similar conditions.
Device_SE0 — This state is invoked from Idle or Down states when an SE0 is sensed on the downstream port. The upstream is driven with SE0. When the SE0 condition is exited, switch to Device_preidle.
Device_preidle — This state will drives the upstream port with a J for a single bit’s time (depends on the speed mode) and then switches the state machine to Device_preidle.

The states above are outlined for illustration. In a real-life implementation, it’s convenient to collapse Down, Host_SE0 and Host_preidle into a single state (an enriched “Down” state). These three states all drive the downstream port. Host_SE0 is basically Down, as it repeats the data from the upstream to the downstream port. The only special thing is that a preidle phase follows it. This SE0 to preidle sequence is easier to implement by adding state flags, rather than adding states.

Likewise, Up, Device_SE0 and Device_preidle can be collapsed into a single state.

SE0 counters

Two counters are maintained to keep track of how long an SE0 condition has been continuously active, one for the upstream port, and one for the downstream port. These counters are zeroed in two situations:

When one of VP or VM is high, indicating not being in an SE0 condition.
When the port is question is driven by the other port. For example, the counter for the downstream port is held at zero in the Host_SE0 state (or a reset signal from the host could have been mistaken for a disconnection). A other possible criterion is the PHY’s Output Enable signal.

These counters serve two purposes:

Detect non-false SE0: When the counter goes above the relevant threshold value, the SE0 condition is valid. The state switches to Host_SE0 or Device_SE0 (whichever applied) from Idle or Down/Up. This handles EOP.
Detect disconnection on the downstream port: When the counter reaches the value corresponding to 2.0 us, the state switches to Disconnected

Note that the transition to the *_SE0 states, and the way they are terminated by virtue of a single bit’s J, covers all of the protocol’s uses of the SE0 condition, including reset from host and keep-alives.

Determining the bit length

As the speed isn’t determined by pullup resistors, a detection mechanism is applied as follows:

A flag, “lowspeed” is maintained, to indicate the time of a USB bit. When ’1′, low-speed is assumed (one bit is ~666 ns), and when ’0′, full-speed (~ 83 ns).
The flag is set to ’1′ in the Idle state.
When one of the Up or Down states is invoked by a transition from J to K, a counter measures the time until the first transition back to J. If this occurs within 6.5 full-speed bit times (~542 ns), the flag is cleared to ’0′.
Following transitions (until the next Idle state) are ignored for this purpose.

Rationale: The only situation where full-speed timing is required is full-speed packets. Such begin with a SYNC word, which starts with a few J/K togglings on each bit.

The time threshold could have been set slightly above 83 ns, since the first toggle is expected on the bit following the J-to-K toggle that took the state machine out of Idle. However 6.5 full-speed bit’s time is better, as it gracefully handles the unlikely event that Idle would be invoked in the middle of a transmitted packet (due to a false SE0, for example). Since 6 bits is the maximal length of equal bits (before bit stuffing is employed), the bus must toggle after 6 bits. If it doesn’t after 6.5, it’s not full-speed.

The “lowspeed” flag correctly handles other bus events, such as keep-alive signaling, which consists of a two SE0 low-speed bits followed by a bit of J. Since the flag is set on Idle, it remains such when Host_SE0 is invoked, which will correctly terminate with a low-speed bit’s worth of J.

SE0 detection when a hub is connected as a device

There’s a slight twist when a (real) USB hub is connected as the device to the transparent hub, and a low-speed device is connected to one of the (real) hub’s downstream ports. Any (real) hub’s upstream port runs at full-speed (or higher), so the hub repeats the low-speed device’s signals on the upstream port, using full-speed’s rise/fall times, polarity and other attributes.

A practical implication is that the EOP may consist of a full-speed SE0, if it’s generated by the hub (see citations below). It’s also the hub’s responsibility to ensure that false SE0 time periods are limited to the full-speed spec (i.e. TFST rather than TLST). Hence for the specific case of detecting an EOP that was generated by the (real) hub for truncating a packet that went beyond the end of a frame, the method of detecting SE0 outlined above won’t work, because it will ignore a full-speed SE0 in the absence of a full-speed sync pattern on that packet.

This is however a rare corner case, which stems from a misbehaving device connected to the (real) hub.

Citing 11.8.4 of the USB spec:

“The upstream connection of a hub must always be a full-speed connection. [ ... ] When low-speed data is sent or received through a hub’s upstream connection, the signaling is full-speed even though the bit times are low-speed.”

and also:

“Hubs will propagate upstream-directed packets of any speed using full-speed signaling polarity and edge rates.”

and

“Although a low-speed device will send a low-speed EOP to properly terminate a packet, a hub may truncate a low-speed packet at the EOF1 point with a full-speed EOP. Thus, hubs must always be able to tear down connectivity in response to a full-speed EOP regardless of the data rate of the packet.

and finally,

“Because of the slow transitions on low-speed ports, when the D+ and D- signal lines are switching between the ‘J’ and ‘K’, they may both be below 2.0V for a period of time that is longer than a full-speed bit time. A hub must ensure that these slow transitions do not result in termination of connectivity and must not result in an SE0 being sent upstream.”

Misalignment of transitions to SE0

Preventing false SE0 comes with a price: The J/K transitions are repeated immediately to the opposite port, but transitions to SE0 must be delayed until they’re confirmed to be such. The spec relates to this issue in section 7.1.14 (Hub Signal Timings, also see Table 7-8): “The EOP must be propagated through a hub in the same way as the differential signaling. The propagation delay for sensing an SE0 must be no less than the greater of the J-to-K, or K-to-J differential data delay (to avoid truncating the last data bit in a packet), but not more than 15ns greater than the larger of these differential delays at full-speed and 200ns at low-speed (to prevent creating a bit stuff error at the end of the packet).”

It’s not clear how the 200 ns requirement can be met given the 210 ns it takes to detect SE0 properly on low-speed. But 10 ns in low-speed mode is probably not much to fuss about.

In theory, it would be possible to delay repeating of the J/K transitions to the opposite port with the relevant expected SE0 delay. This is however not feasible when the speed is deduced from the first SYNC transition (see “Determining the bit length” above).

The chosen solution was to have a digital delay line for all transitions, with the delay of 100 ns. This is equivalent to 1.25 bits delay at full-speed, and slightly longer than a full-speed hub’s delay according to table 7-8, so it’s acceptable. J/K transitions are always delayed by the full delay amount, but transitions to SE0 manipulate the segment in the delay line, so that the output SE0 starts in a time that corresponds to when the SE0 was first seen, not when it was confirmed, in full-speed mode. In low-speed mode, only 100 ns are compensated for, but that’s still within spec.

High-speed suppression

Since the transparent hub supports only USB 1.1, the transition into high speed (per USB 2.0) must not occur. This is guaranteed by the suggested implementation, since the mechanism for upgrading into high speed takes place during the long SE0 period that resets the device. Namely, the USB 2.0 spec section 7.1.7.5 states that the a high speed capable device should drive current into the D- wire during the initial reset, while the host drives an SE0 on the wires. This creates a chirp-K state on the wires, which is essentially a slightly negative differential voltage instead of more or less zero. This is the first stage of the transition into high speed.

But since the host has already initiated an SE0 state when the chirp-K arrives from the device, the transparent hub is in the Host_SE0 state, so the voltages at the device’s side are ignored at that time period. The chirp will have no significance and has no way to get passed on to the host. Hence the host will never sense anything special, and the device will give up the speed upgrade attempt.

Add a Comment

Next Post: Gtkwave notes

Previose Post: NXP / Freescale SDMA and the art of accessing peripheral registers

my tech blog

Popular Posts

Latest Posts

Archives