Displayport’s M and its Mvid fields: Timestamp or divider?
Introduction
The Displayport standard requires the transmission of fields named Mvid in different places of the main link stream. However, the relevant parts of the standard are somewhat unclear. This is an attempt to understand the rationale behind the standard’s requirements, in the hope of clarifying them.
This post is intended for someone who has spent some time reading the standard. The requirements aren’t listed here, but rather discussed.
Background
Among other use cases, Displayport’s main link is designed to carry a DVI stream transparently. That is, a DVI-to-Displayport transmitter, which has no particular knowledge of the DVI source, creates the Displayport stream, and a Displayport-to-DVI receiver reconstructs the DVI stream at the other end. This may seem a far-fetched scenario, but if a graphics card vendor wants to support Displayport on a card that only has DVI, a quick solution may be to add such a converter, possibly as a single chip. A monitor vendor may pick a similar solution to support Displayport as well.
Regardless, Displayport is specified to behave as if there were a pixel stream of data arriving with a pixel clock that is slower than the maximal throughput the Displayport link allows. The transmitter is required to pack the data in Transfer Units (TU) of 32 to 64 symbols in length, and fill them partially with data, stuffing the rest of the Transfer Unit with zeros (between FS and FE symbols). The message is clear: Even if a Displayport-aware source has the capability to fetch pixels fast enough to send an entire line continuously on the Displayport link, it’s not the way to go. Instead, imagine that the pixels arrive at a slower pace, and fill each Transfer Unit with a constant number of pixels (plus or minus one), so the sink isn’t required to handle pixels faster than necessary. Average the payload, rather than bombard the receiver.
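To make the averaging idea concrete, here is a minimal Python sketch of how a transmitter might spread the payload evenly over Transfer Units. The function name, its parameters and the example figures (a 148.5 MHz stream clock, 3 bytes per pixel, four lanes at a 270 MHz symbol clock, 64-symbol TUs) are assumptions made for illustration only. It counts data symbols per lane rather than pixels, ignores the FS/FE overhead, and isn't taken from the standard.

```python
from fractions import Fraction
from math import floor

def tu_fill_schedule(strm_clk_hz, ls_clk_hz, bytes_per_pixel, lanes, tu_size, n_tus):
    """Return (average data symbols per lane per TU, list of per-TU counts).

    Hypothetical helper: an accumulator spreads the payload so that every
    TU carries either floor(avg) or ceil(avg) data symbols.
    """
    # Payload share of the link bandwidth, scaled to one TU on one lane.
    avg = Fraction(strm_clk_hz * bytes_per_pixel * tu_size, ls_clk_hz * lanes)
    if avg > tu_size:
        raise ValueError("stream needs more bandwidth than the link offers")
    schedule, sent = [], 0
    for i in range(1, n_tus + 1):
        target = floor(avg * i)          # symbols that should be out after i TUs
        schedule.append(target - sent)   # what this TU has to carry
        sent = target
    return avg, schedule

avg, schedule = tu_fill_schedule(148_500_000, 270_000_000, 3, 4, 64, 5)
print(float(avg), schedule)   # 26.4 [26, 26, 27, 26, 27]
```

The remaining symbols of each TU would be stuffing. The point is only that the per-TU payload never deviates from the average by more than one symbol.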
Stream clock recovery
A trickier issue is the stream clock recovery (Stream Clock is the term used in the standard for the pixel clock used by the imaginary or actual DVI pixel stream). As the Displayport-to-DVI converter must present this stream clock on its output as the pixel clock, it needs to maintain some kind of clock PLL. The technically simpler case is when the symbol clock and stream clock are derived from the same reference clock, so the relation between their frequencies is a known constant rational number, which can be conveyed to the receiver. This is referred to as Synchronous Clock Mode in section 2.2.3 of the standard, and defines the PLL dividers M and N for achieving this as
f_stream_clock = f_link_symbol_clock * M / N
But Displayport needs to work when the stream clock is generated by an external source as well. In this Asynchronous Clock Mode the transmitter is required (per section 2.2.3) to measure the frequency approximately by counting the number of stream clock cycles during a period of 32768 symbol clocks. It should then announce N=32768, and M as the number of clocks counted, in the MSA (Main Stream Attribute) packet, which is transmitted once in each vertical blanking period. That makes sense: If the receiver locks a PLL to reconstruct the stream clock based upon M and N and the equation above, it will obtain the stream clock’s frequency within an error of one count in M, that is 1/32768 ~ 30.5 ppm of the symbol clock, more or less. This is of course unacceptable in the long run, but it’s still a rather small error: At the highest symbol clock of 540 MHz (for 5.4 Gb/s lanes), the 30.5 ppm inaccuracy of the measured M leads to a 16.5 kHz offset. So if the image’s line frequency is 16 kHz (lower than any VESA mode), it’s one pixel clock offset per line.
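The arithmetic behind these figures can be checked with a few lines of Python. This is only a sketch: the function names are made up, the 148.5 MHz stream clock is an arbitrary example, and the measurement is modeled as an ideal count of whole stream clock cycles.

```python
N = 32768  # symbol clocks per measurement window, as in section 2.2.3

def measure_m(strm_clk_hz, ls_clk_hz, n=N):
    """Whole stream clock cycles counted during n symbol clocks (idealized)."""
    return (strm_clk_hz * n) // ls_clk_hz

def recovered_clock_hz(m, ls_clk_hz, n=N):
    """Stream clock as the receiver would derive it from M and N."""
    return ls_clk_hz * m / n

ls_clk = 540_000_000      # symbol clock for 5.4 Gb/s lanes
strm_clk = 148_500_000    # example stream clock (an assumption)

m = measure_m(strm_clk, ls_clk)
error_hz = strm_clk - recovered_clock_hz(m, ls_clk)
worst_case_hz = ls_clk / N    # one count of M, about 16.5 kHz

print(m, round(error_hz), round(worst_case_hz))
```

With these numbers M comes out as 9011 and the recovered clock is about 3.3 kHz too slow, within the bound of roughly 16.5 kHz that one count of M stands for.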
The Displayport standard allows the required fine-tuning of the receiver’s stream clock by requiring that the 8 LSBs of a stream clock timestamp (Mvid[7:0]) are transmitted at the end of each image line. In other words, the transmitter is required to maintain a free-running counter on its stream clock input, and send the lower bits of its value at the same moment a BS (Blanking Start) control symbol is transmitted on the link (BS marks the end of active pixels in a row, or serves as a keep-alive in the absence of video data).
The receiver may apply a counter on its own stream clock, and compare the 8 LSBs. As shown above, the difference should be one stream clock at most, so the receiver can fine-tune its PLL to obtain an accurate replica of the transmitter’s stream clock. As real-life clocks aren’t all that stable, and there’s also a chance that spread-spectrum modulation has been applied on the source stream clock, the difference can get bigger than one stream clock. So 8 bits of the timestamp seem to be a good choice.
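A possible way for the receiver to compare the 8 LSBs is to take the difference modulo 256 and interpret it as a signed number, so that a wraparound of either counter doesn’t show up as a huge error. The sketch below assumes the true difference is always within -128 to +127 stream clocks, and the function name is hypothetical.

```python
def timestamp_error(tx_lsb, rx_counter):
    """Signed difference (tx - rx) in stream clocks, from the 8 LSBs only.

    tx_lsb is the received Mvid[7:0]; rx_counter is the receiver's own
    free-running stream clock counter, of any width.
    """
    diff = (tx_lsb - rx_counter) & 0xFF      # difference modulo 256
    return diff - 256 if diff >= 128 else diff

print(timestamp_error(0x01, 0xFF))    # 2: transmitter is two clocks ahead
print(timestamp_error(0xFE, 0x101))   # -3: receiver is three clocks ahead
```

The result would typically feed the loop filter of the receiver’s PLL, which is outside the scope of this sketch.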
So far so good. Now to the confusion of notations in the standard.
What’s M?
The main problem is that M is sometimes referred to as a timestamp, and sometimes as a PLL divider. There is a certain similarity, as PLL dividers are counters, and so are timestamps. The difference is that a PLL divider is zeroed when it reaches a certain value, and timestamps are not.
So section 2.2.3 begins with saying
The following equations conceptually explain how the Stream clock (Strm_Clk) must be derived from the Link Symbol clock (LS_Clk) using the Time Stamps, M and N
and uses M and N as dividers immediately afterwards, in the equation shown above. It also says a few lines down:
When in Asynchronous Clock mode, the DisplayPort uPacket TX must measure M using a counter running at the LS_Clk frequency as shown in Figure 2-17. The full counter value after every [N x LS_Clk cycles] must be transported in the DisplayPort Main Stream attributes. The least significant eight bits of M (Mvid[7:0]) must be transported once per main video stream horizontal period following BS and VB-ID.
which is a complete mix-up. The counter runs on the stream clock, not on LS_Clk. Also, the count obtained over [N x LS_Clk cycles] is what is announced in the MSA, in the fields denoted Mvid23:16, Mvid15:8 and Mvid7:0, and these are surely not timestamps, but the result of the count. On the other hand, the Mvid[7:0] sent after each BS is a timestamp, and can’t be reset every N symbol clock cycles, as it would be useless for small N’s. In fact, even if N=32768, it’s useless for a 540 MHz symbol clock: For a 33 kHz line frequency, a timestamp reset would occur every second line. So there are two different counters, one is the M divider, and the second the M timestamp, both referred to as Mvid in the standard.
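The distinction between the two counters can be sketched as a small Python model of the transmitter side. Everything here (the class, its method names, calling one method per clock edge) is a hypothetical illustration of this post’s interpretation, not something the standard spells out.

```python
class MvidCounters:
    """Two counters clocked by the stream clock, with different roles."""

    def __init__(self, n=32768):
        self.n = n
        self.symbol_count = 0   # symbol clocks elapsed in the current window
        self.m_count = 0        # stream clocks in the current window
        self.m_measured = None  # last completed count: the M sent in the MSA
        self.timestamp = 0      # free-running, never reset (wraps at 24 bits)

    def stream_clock_tick(self):
        self.m_count += 1
        self.timestamp = (self.timestamp + 1) & 0xFFFFFF

    def symbol_clock_tick(self):
        self.symbol_count += 1
        if self.symbol_count == self.n:
            self.m_measured = self.m_count   # latch the measurement
            self.m_count = 0                 # restart the window
            self.symbol_count = 0

    def timestamp_lsb(self):
        """Value to send as Mvid[7:0] after a BS."""
        return self.timestamp & 0xFF

# Toy run: N=8 and a stream clock at half the symbol clock frequency.
c = MvidCounters(n=8)
for i in range(16):
    if i % 2 == 0:
        c.stream_clock_tick()
    c.symbol_clock_tick()
print(c.m_measured, c.timestamp)   # 4 8
```

After two windows the measured M is still 4, while the timestamp has moved on to 8. Resetting the timestamp along with the window counter would throw away exactly the information the receiver needs.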
This isn’t all that ambiguous in the Asynchronous Mode case, because the standard says what to do with Mvid in the MSA, and it’s quite obvious that the Mvid[7:0] transmitted along with a BS should be a timestamp that is never reset.
The problem is in Synchronous Mode. The standard doesn’t say what should be transmitted in the MSA. Section 2.2.4, which details the fields, says “M and N for main video stream clock recovery (24 bits each)” showing how the word is split into 3 symbols in drawings. And that’s it. Common sense says that they meant the M and N as PLL dividers. There’s no sense in sending N (denoted Nvid) otherwise. This gives these fields the same meaning as in Asynchronous Mode, and it seems this is the widely accepted interpretation.
Nevertheless, someone out there might as well say the Mvid is Mvid, and it’s the full time stamp counter transmitted on the MSA. The receiver has no other way to know the full word otherwise. One may wonder why it would need it, but that’s a different story. But what is this part in section 2.2.3 good for then, if the full Mvid[23:0] word is never transmitted?
When Mvid7:0 crosses the 8-bit boundary, the entire Mvid23:0 will change. For example, when Mvid23:0 is 000FFFh at one point in time for a given main video stream, the value may turn to 0010000h at another point. The Sink device is responsible for determining the entire Mvid23:0 value based on the updated Mvid7:0.
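If the quoted passage is taken literally, a sink could rebuild the full word by picking the 24-bit value closest to the previous one whose low byte matches the newly received Mvid7:0. The following sketch does that under the assumption that the value has moved by fewer than 128 counts since the last update. The function name is made up, and this is just one possible reading of the requirement.

```python
def update_mvid(prev_full, new_lsb):
    """Extend a received Mvid7:0 to a full 24-bit Mvid23:0 value.

    Assumes the true value is within -128..+127 of prev_full.
    """
    delta = (new_lsb - prev_full) & 0xFF   # change of the low byte, modulo 256
    if delta >= 128:
        delta -= 256                       # interpret as a small step backwards
    return (prev_full + delta) & 0xFFFFFF

print(hex(update_mvid(0x000FFF, 0x00)))    # 0x1000: carry into the upper bits
print(hex(update_mvid(0x001000, 0xFF)))    # 0xfff: borrow from the upper bits
```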
Maybe the safe choice is to announce Asynchronous Mode regardless of whether the clock ratios are known, hoping that the monitor won’t be confused by the Mvid[7:0] timestamps.
Having said all this, one can speculate that these Mvid and Nvid fields are ignored anyhow by any monitor that has a good reason to support Displayport. Recall that the goal of all this was to reconstruct the stream clock, which doesn’t make sense when Displayport is used for resolutions that DVI can’t support.
Is Mvid[7:0] really required?
The question here isn’t what the standard requires, but why it requires it.
Section 2.2.2.1, which details the control symbols for framing, says that BS should be, among other things,
Inserted at the same symbol time during vertical blanking period as during vertical display
That’s a somewhat odd requirement, as one can’t guarantee a repeated symbol time: In the general case, the stream clock and symbol clock frequencies aren’t integer multiples of one another, if they are synchronous at all, and hence the line period in terms of symbol clocks can’t be constant.
Not being so picky, it’s clear that the standard requires that the BS is timed closely to some constant position in the originating image’s line. If it can’t hit exactly the same symbol position, move it by one. And since they mention a “same symbol time”, it means that all BS symbols are transmitted like this.
Which in turn means that the number of stream clocks between one BS and another is the total number of pixels per row in the originating image (active pixels + blanking). That number is known through the MSA. So why bother sending Mvid[7:0]? The difference is always the same.
Or maybe the meaning was just that BS has to be bit-aligned in a word the same way as the other symbols? After all, BS is coded as K28.5, which is commonly used as a “comma” symbol that marks the alignment of bits into 10-bit symbols on the wire. But with this interpretation, the requirement is trivial.
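The claim that the Mvid[7:0] difference is always the same when BS sits at a fixed position in the line can be illustrated with a short sketch. It assumes the timestamp is sampled exactly once per line at the same pixel position, and uses a total line length of 2200 pixels as an arbitrary example. The function name is hypothetical.

```python
def bs_timestamps(h_total, n_lines, start=0):
    """Mvid[7:0] values sampled at BS, one per line, for an ideal source."""
    return [(start + k * h_total) & 0xFF for k in range(n_lines)]

stamps = bs_timestamps(h_total=2200, n_lines=6)
diffs = [(b - a) & 0xFF for a, b in zip(stamps, stamps[1:])]
print(stamps)   # [0, 152, 48, 200, 96, 248]
print(diffs)    # [152, 152, 152, 152, 152]
```

Each difference equals the total line length modulo 256, a number the sink can already work out from the MSA.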
Reader Comments
Hi Eli,
Great blog, great article. What I find peculiar in this pixel packing approach, is that in the blanking regions we’re basically transmitting dummy symbols (not even packed in TUs if I am not mistaken) and the whole M_vid counting is rather theoretical / arbitrary there. For example, should we still “mimic” the stuffing or should we increment M on every link clock cycle?
Moreover, if I understand correctly, things need to be set such that at each scan-line the number of stream pixels will end before the link symbols so that some margin will be left for the asynchronous corrections. What’s the size of this margin? Where would it affect the receiver? (a large margin would mean that M_vid will not change over these intervals of every line – is that an issue?)
Perhaps I am confused about how this control-flow mechanism works in the async mode.
Thanks.
I’m sorry, but it was two years ago, and I’ve been lucky enough not to deal with Displayport since. So I don’t remember much about it at the moment.