Setting up Si5324/Si5328 on Xilinx development boards

This post was written by eli on August 30, 2020
Posted Under: FPGA,GTX

General

These are my notes as I set up the jitter attenuator devices (Silicon Labs’ Si5324 and Si5328) on Xilinx development boards as clean clock sources for use with MGTs. As such, this post isn’t all that organized.

I suppose that the reason Xilinx put a jitter attenuator to drive the SFP+ related reference clock, rather than just a plain clock generator, is first and foremost because it can be used as a high-quality clock source anyhow. On top of that, it may sync on a reference clock that might be derived from some other source. In particular, this allows syncing the SFP+ module’s data rate with the data rate of an external data stream.

But it’s a bit of a headache to set up. But if you’re into FPGAs, you should be used to this already.

Jots on random topics

  • To get the register settings, run DSPLLsim, a tool issued by Silicon Labs (part of Precision Clock EVB Software). There is no need for any evaluation board to run this tool for the sake of obtaining register values, and neither is there any need to install the drivers it suggests to install during the software’s installation.
  • Note that there’s an outline for the settings to make with DSPLLsim in a separate section below, which somewhat overlaps these jots.
  • Enable Free Run Mode = the crystal, which is connected to XA/XB is routed to CKIN2. So there’s no need to feed the device with a clock, but instead set the reference clock to CKIN2. That works as a 114.285 MHz (± 20 ppm) reference clock on all Xilinx boards I’ve seen so far, including KC705, ZC706, KCU105, ZCU106, KCU116, VC707, VC709, VCU108 and VCU118). This is also the frequency recommended in the Silicon Lab’s docs for a crystal reference.
  • The clock outputs should be set to LVDS. This is how they’re treated on Xilinx’ boards.
  • f3 is the frequency at the input of the phase detector (i.e. the reference clock after division). The higher f3, the lower jitter.
  • The chip’s CMODE is tied to GND, so the interface is I2C.
  • RATE0 and RATE1 are set to MM by not connecting them, so the internal pull-up-and-down sets them to this mode. This setting requires a 3rd overtone crystal with a 114.285 MHz frequency, which is exactly what’s in the board (see Si53xx Reference Manual, Appendix A, Table 50).
  • The device’s I2C address depends on the A2-A0 pins, which are all tied to GND on the boards I’ve bothered to check. The 7-bit I2C address is { 4b’1101, A[2:0] }. Since A=0, it’s 7-bit address 0x68, or 0xD0 / 0xD1 in 8-bit representation.
  • Except for SCL, SDA, INT_C1B and RST_B, there are no control pins connected to the FPGA. Hence all other outputs can be configure to be off. RST_B is an input to reset the device. INT_C1B indicates problems with CKIN1 (only) which is useless in Free Run Mode, or may indicate an interrupt condition, which is more useful (but not really helpful due to the catch explained a few bullets down).
  • Hence be INT_PIN and CK1_BAD_PIN (bits 0 and 2 in register 20) should be ’1′, so that INT_C1B functions as an interrupt pin. The polarity of this pin is set by bit 0 of register 22 (INT_POL). Also set LOL_MSK to ’1′ (bit 0 of register 24), so that INT_C1B is asserted until the PLL is locked. This might take several seconds if the loop bandwidth is narrow (which is good for jitter), so the FPGA should monitor this pin and hold the CPLL in reset until the reference clock is stable.
  • There is a catch, though: The INT_C1B pin reflects the value of the internal register LOL_FLG (given the outlined register setting), which latches a Loss Of Lock, and resets it only when zero is written to bit 0 of register 132. Or so I understand from the datasheet’s description for register 132, saying “Held version of LOL_INT. [ ... ] Flag cleared by writing 0 to this bit.” Also from Si53xx Reference Manual, section 6.11.9: “Once an interrupt flag bit is set, it will remain high until the register location is written with a ’0′ to clear the flag”. Hence the FPGA must continuously write to this register, or INT_C1B will be asserted forever. This could have been avoided by sensing the LOL output, however it’s not connected to the FPGA.
  • Alternatively, one can poll the LOL_INT flag directly, as bit 0 in register 130. If the I2C bus is going to be continuously busy until lock is achieved, one might as well read the non-latched version directly.
  • To get an idea of lock times, take a look on Silicon Labs’ AN803, where lock times of minutes are discussed as a normal thing. Also the possibility that lock indicator (LOL) might go on and off while acquiring lock, in particular with old devices.
  • Even if the LOL signal wobbles, the output clock’s frequency is already correct, so it’s a matter of slight phase adjustments, made with a narrow bandwidth loop. So the clock is good enough to work with as soon as LOL has gone low for the first time, in particular in a case like mine, where the CPLL should be able to tolerate a SSC running between 0 and -5000 ppm at a 33 kHz rate.
  • It gets better: Before accessing any I2C device on the board, the I2C switch (PCA9548A) must be programmed to let the signal through to the jitter attenuator: It wakes up with all paths off. The I2C switch’ own 7-bit address is { 4b’1110, A[2:0] }, where A[2:0] are its address pins. Check the respective board’s user guide for the actual address that has been set up.
  • Checking the time to lock on an Si5324 with a the crystal as reference, it went all from 600 ms to 2000 ms. It seems like it depends on the temperature of something, the warmer the longer lock time: Turning off the board for a minute brings back the lower end of lock times. But I’ve also seen quite random lock times, so don’t rely on any figure nor that it’s shorter after powerup. Just check the LOL and be ready to wait.
  • As for Si5328, lock times are around 20 seconds (!). Changing the loop bandwidth (with BWSEL) didn’t make any dramatic difference.
  • I recommend looking at the last page of Silicon Lab’s guides for the “We make things simple” slogan.

Clock squelch during self calibration

According to the reference manual section 6.2.1, the user must set ICAL = 1 to initiate a self-calibration after writing new PLL settings with the I2C registers. This initiates a self calibration.

To avoid an output clock that is significantly off the desired frequency, SQ_ICAL (bit 4 in register 4) should be set to ’1′, and CKOUT_ALWAYS_ON (bit 5 in register 0) should be ’0′. “SQ_ICAL” stands for squelch the clock until ICAL is completed.

The documentation is vague on whether the clock is squelched until the internal calibration is done, or until the lock is complete (i.e. LOL goes low). The reference manual goes “if SQ_ICAL = 1, the output clocks are disabled during self-calibration and will appear after the self-calibration routine is completed”, so this sounds like it doesn’t wait for LOL to go low. However the truth table for CKOUT_ALWAYS_ON and SQ_ICAL (Table 28 in the reference manual, and it also appears in the data sheets) goes “CKOUT OFF until after the first successful ICAL (i.e., when LOL is low)”. Nice, huh?

And if it there is no clock when LOL is high, what if it wobbles a bit after the initial lock acquisition?

So I checked this up with a scope on an Si5324 locking on XA/XB, with SQ_ICAL=1 and CKOUT_ALWAYS_ON=0, and the output clock was squelched for about 400 μs, and then became active throughout 1.7 seconds of locking, during which LOL was low. So it seems like there’s no risk of the clock going on and off if LOL wobbles.

LOCKT

This parameter controls the lock detector, and wins the prize for the most confusing parameter.

It goes like this: The lock detector monitors the phase difference with the reference clock. If the phase difference looks fine for a certain period of time (determined by the LOCKT parameter), LOL goes low to indicate a successful lock. As a side note, since LOCKT is the time it takes to determine that nothing bad has happened with the phase, it’s the minimum amount of time that LOL will be high, as stated in the reference manual.

But then there’s Table 1 in AN803, showing how LOCKT influences the locking time. Like, really dramatically. From several seconds to a split second. How could a lock detector make such a difference? Note that LOCKT doesn’t have anything to do with the allowed phase difference, but just how long it needs to be fine until the lock detector is happy.

The answer lies in the fast lock mechanism (enabled by FAST_LOCK). It’s a wide bandwidth loop used to make the initial acquisition. When the lock detector indicates a lock, the loop is narrowed for the sake of achieving low jitter. So if the lock detector is too fast, switching to the low bandwidth loop occurs to early, and it will take a longer time to reach a proper lock. If it’s too slow (and hence fussy), it wastes time on the wide bandwidth phase.

In the end, the datasheet recommends a value of 0x1 (53 ms). This seems to be the correct fit-all balance.

Another thing that AN803 mentions briefly, is that LOL is forced high during the acquisition phase. Hence the lock detection, which switches the loop from acquisition (wide loop bandwidth) to low-jitter lock (narrow bandwidth) is invisible to the user, and for a good reason: The lock detector is likely to wobble at this initial stage, and there’s really no lock. The change made in January 2013, which is mentioned in this application note, was partly for producing a LOL signal that would make sense to the user.

DSPLLsim settings

Step by step for getting the register assignments to use with the device.

  • After starting DPLLsim, select “Create a new frequency plan with free running mode enabled”
  • In the next step, select the desired device (5328 or 5324).
  • In the following screen, set CKIN1 to 114.285 MHz and CKOUT1 to the desired output frequency (125 MHz in my case). Also set the XA-XB input to 114.285 MHz. Never mind that we’re not going to work with CKOUT1 — this is just for the sake of calculating the dividers’ values.
  • Click “Calculate ratio” and hope for some pretty figures.
  • The number of outputs can remain 2.
  • Click “Next”.
  • Now there’s the minimal f3 to set. 16 kHz is a bit too optimistic. Reducing it to 14 kHz was enough for the 125 MHz case. Click “Next” again.
  • Now a list of frequency plans appears. If there are several options, odds are that you want the one with the highest f3. If there’s just one, fine. Go for it: Select that line and click “Next” twice.
  • This brings us to the “Other NCn_LS Values” screen, which allows setting the output divisor of the second clock output. For the same clock on both outputs, click “Next”.
  • A summary of plan is given. Review and click “OK” to accept it. It will appear on the next window as well.

And this brings us to the complicated part: The actual register settings. This tool is intended to run against an evaluation module, and update its parameters from the computer. So some features won’t work, but that doesn’t matter: It’s the register’s values we’re after.

There are a few tabs:

  • “General” tab: The defaults are OK: In particular, enable FASTLOCK, pick the narrowest bandwidth in range with BWSEL. The RATE attribute doesn’t make any difference (it’s set to MM by virtue of pins on the relevant devices). The Digital Hold feature is irrelevant, and so are its parameters (HIST_DEL and HIST_AVG). It’s only useful when the reference clock is shaky.
  • “Input Clocks” tab: Here we need changes. Set AUTOSEL_REG to 0 (“Manual”) and CKSEL_REG to 1 for selecting CKIN2 (which is where the crustal is routed to). CLKSEL_PIN set to 0, so that the selection is set by registers and not from any pins. We want to stick to the crystal, no matter what happens on the other clock input. Other than that, leave the defauls: CK_ACTV_POL doesn’t matter, because the pin is unused, and same goes for CS_CA. BYPASS_REG is correctly ’0′ (or the input clock goes directly to output). LOCKT should remain 1 and and VALT isn’t so important in a crystal reference scenario (they set the lock detector’s behavior). The FOS feature isn’t used, so its parameters are irrelevant as well. Automatic clock priority is irrelevant, as manual mode is applied. If you want to be extra pedantic, set CLKINnRATE to 0x3 (“95 to 215 MHz”), which is the expected frequency, at least on CKIN2. And set FOSREFSEL to 0 for XA/XB. It won’t make any difference, as the FOS detector’s output is ignored.
  • “Output Clocks” tab: Set the SFOUTn_REG to LVDS (0x7). Keep the defaults on the other settings. In particular, CKOUT_ALWAYS_ON unset and SQ_ICAL set ensures that a clock output is generated only after calibration. No hold logic is desired either.
  • “Status” tab: Start with unchecking everything, and set LOSn_EN to 0 (disable). Then check only INT_PIN, CK1_BAD_PIN and LOL_MSK. This is discussed in the jots above. In particular, it’s important that none of the other *_MSK are checked, or the interrupt pin will be contaminated by other alarms. Or it’s unimportant, if you’re going to ignore this pin altogether, and poll LOL through the I2C interface.

The other two tabs (“Register Map” and “Pin Out”) are just for information. No settings there.

So now, the finale: Write the settings to a file. On the main menu go to Options > Save Register Map File… and, well. For extra confusion, the addresses are written in decimal, but the values in hex.

The somewhat surprising result is that the register setting for Si5324 and Si5328 are identical, even though the loop bandwidth is dramatically different: Even though both have BWSEL_REG set to 0x3, which is the narrowest bandwidth possible for either device, for Si5324 it means 8 Hz, and for Si5328 it’s 0.086 Hz. So completely different jitter performance as well as lock time (much slower on Si5328) are expected despite, once again, exactly the same register setting.

There is a slight mistake in the register map: There is no need to write to the register at address 131, as it contains statuses. And if anything, write zeroes to bits [2:0], so that these flags are maybe cleared, and surely not ones. The value in the register map is just the default value.

Same goes for register 132, but it’s actually useful to write to it for the sake of clearing LOL_FLG. Only the value in the register map is 0x02, which doesn’t clear this flag (and once again, it’s simply the default).

It’s worth to note that the last assignment in the register map is writing 0x40 to address 136, which initiates self calibration.

I2C interface

Silicon Lab’s documentation on the I2C interface is rather minimal (see Si53xx Reference Manual, section 6.13), and refers to the standard for the specifics (with a dead link). However quite obviously, the devices work with the common setting for register access with a single byte address: First byte is the I2C address, second is the register address, and the third is the data to be written. All acknowledged.

For a read access, it’s also the common method of a two-byte write sequence to set the register address, a restart (no stop condition) and one byte for the I2C address, and the second reads data.

If there’s more than one byte of data, the address is auto-incremented, pretty much as usual.

According to the data sheet, the RST pin must be asserted (held low) at least 1 μs, after which the I2C interface can be used no sooner than 10 ms (tREADY in the datasheets). It’s also possible to use the I2C interface to induce a reset with the RST_REG bit of register 136. The device performs a power-up reset after powering up (section 5.10 in the manual) however resetting is a good idea to clear previous register settings.

As for the I2C mux, there are two devices used in Xilinx’ boards: TCA9548A and PCA9548A, which are functionally equivalent but with different voltage ranges, according to TI: “Same functionality and pinout but is not an equivalent to the compared device”.

The I2C interface of the I2C mux consists of a single register. Hence there is no address byte. It’s just the address of the device followed by the setting of the register. Two bytes, old school style. Bit n corresponds to channel n. Setting a bit to ’1′ enables the channel, and setting it to ’0′ disconnects it. Plain and simple.

The I2C mux has a reset pin which must be high during operation. On KCU105, it’s connected to the FPGA with a pull-up resistor, so it’s not clear why UG917 says it “must be driven High”, but it clearly mustn’t be driven low. Also on KCU105, the I2C bus is shared between the FPGA and a “System Controller”, which is a small Zynq device with some secret sauce software on it. So there are two masters on a single I2C bus by plain wiring.

According to UG917′s Appendix C, the system controller does some initialization on power-up, hence generating traffic on the I2C bus. Even though it’s not said explicitly, it’s plausible to assume that it’s engineered to finish its business before the FPGA has had a chance to load its bitstream and try its own stunts. Unless the user requests some operations via the UART interface with the controller, but that’s a different story.

Reader Comments

Hi Eli,

I’m just starting to look into the KCU116 and this blog is very useful since we will be needing the jitter atten provided by the 5328. Thanks so much for publishing it!

One question I had was: Once one uses the DSPLLsim tool to generate the reg settings, does Xilinx provide a way to read that file and configure the silabs part on the KCU116 board? Some sort of UI backdoor mode? Or are there existing software drivers for this part?

Thanks for any pointers you can provide that could help me get up and running quickly.

Thanks in advance,
Bob

#1 
Written By Bob Servilio on January 4th, 2021 @ 18:13

Hello,

That’s actually a question to Xilinx. As for myself, I wrote that logic from scratch.

#2 
Written By eli on January 4th, 2021 @ 18:36

Hi Eli,
I’ve just started looking into VCU118 and as said above this blog is very useful because we will need Si5328 to eliminate jitter. Thank you so much for Posting! A problem I have encountered is that I do not know much about IIC communication, and my knowledge of controlling the Si5328 chip is also limited. Now I am planning to write the register map table generated by DSPLLsim software into FPGA to realize the communication between FPGA and the anti-shake chip, that is, the configuration process of si5328. However, there are difficulties in the program writing stage. Could you please give me some Pointers on program writing?

Thank you for any tips that can help me get up and running quickly.

Thanks in advance,

#3 
Written By Epiphany on September 20th, 2023 @ 04:32

Add a Comment

required, use real name
required, will not be published
optional, your blog address