Capture data at 250 MBytes/sec with Linux on Microblaze
The problem
The Xilinx Microblaze soft processor, which is implemented on the FPGA’s logic fabric, is a stable and fully capable processor, but its rather low clock frequency (70-100 MHz on a Spartan-6) makes it a problematic candidate for data capture and frame grabbing.
When running Linux on Microblaze, the current kernel allows for a data rate of approximately 1 MByte/sec due to internal overhead. This appears to stem from a lack of optimization in the parts of the kernel that copy data.
So while Linux on Microblaze is a great solution for making the FPGA talk with storage and network in a high-level manner, it suffers from very slow I/O, which renders it useless for data capture to a network shared disk, for example.
How it’s tackled
Technically speaking, the solution is to capture data directly into the processor’s DDR memory using DMA. Since the 32-bit bus runs at the same frequency as the processor, even the lower end of 70 MHz allows for a theoretical throughput of 280 MBytes/sec (4 bytes per clock cycle). In practice, the Xillybus IP core has the proven capability of capturing data arriving at a continuous rate of 250 MBytes/sec, in bursts of 8 MBytes each.
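The throughput figure above follows from simple arithmetic, sketched here in Python. The sketch assumes the bus moves one full 32-bit word on every clock cycle with no stalls, which is an idealization; the function name is made up for illustration.

```python
# Theoretical peak throughput of a bus that moves one word per clock cycle.
# Assumption: no wait states, arbitration or refresh overhead.

def peak_throughput_mbytes(bus_width_bits: int, clock_mhz: float) -> float:
    """Return the peak rate in MBytes/sec (1 MByte = 10**6 bytes)."""
    bytes_per_cycle = bus_width_bits // 8
    return bytes_per_cycle * clock_mhz


if __name__ == "__main__":
    # The two ends of the clock range mentioned for Spartan-6.
    for mhz in (70, 100):
        rate = peak_throughput_mbytes(32, mhz)
        print(f"{mhz} MHz: {rate:.0f} MBytes/sec")
```

At 70 MHz this gives 280 MBytes/sec, so the measured 250 MBytes/sec sits fairly close to the ceiling.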
Keep it simple
Another bonus of using Xillybus is that the data is fed into a standard asynchronous FIFO on the FPGA. There is no need to interface with Microblaze’s buses; just connect data and read enable to a FIFO. The IP core supplies additional signals for synchronizing events with the processor, but their use is optional.
On the Linux side, it all boils down to opening a device file, reading data normally, and closing the file. The FPGA can signal EOF (end-of-file), so a high-speed data capture can be run from the shell prompt with the “cat” or “dd” commands. There’s no need to write complicated software or a driver. A single standard UNIX command is enough, and the data is stored in a regular disk file.
One thing to take into account is that even though an 8 MByte chunk of data is captured into the processor’s RAM in a split second, the I/O operation of copying it to some other medium will typically take around 8 seconds. The memory access is fast, but the processor is not.
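For cases where a script is more convenient than the shell, the same open, read and close sequence can be sketched in Python. This is only an illustration: the device file name below is an assumption (the real name depends on how the IP core was configured), and the function name and chunk size are arbitrary choices.

```python
# Copy everything a device file delivers into a regular file, stopping at EOF.
# DEVICE_PATH is a hypothetical name; substitute the one your setup exposes.
DEVICE_PATH = "/dev/xillybus_read_32"


def capture(source_path: str, dest_path: str, chunk_size: int = 64 * 1024) -> int:
    """Read source_path until EOF, write to dest_path, return bytes copied."""
    total = 0
    # buffering=0 gives plain read() calls on the file descriptor.
    with open(source_path, "rb", buffering=0) as src, open(dest_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # an empty read means the FPGA signaled EOF
                break
            dst.write(chunk)
            total += len(chunk)
    return total


if __name__ == "__main__":
    copied = capture(DEVICE_PATH, "capture.bin")
    print(f"captured {copied} bytes")
```

Functionally this does the same job as redirecting the output of “cat” on the device file into a disk file.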
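The gap between the two stages is easy to quantify. The short Python sketch below uses only the figures quoted in this article (8 MByte bursts, 250 MBytes/sec capture, roughly 1 MByte/sec processor-driven I/O); the names are made up for illustration.

```python
# Time spent capturing a burst versus copying it out, using the article's figures.
BURST_MBYTES = 8       # size of one captured burst
CAPTURE_RATE = 250     # MBytes/sec, DMA capture into DDR memory
COPY_RATE = 1          # MBytes/sec, approximate processor-driven I/O


def transfer_seconds(size_mbytes: float, rate_mbytes_per_sec: float) -> float:
    """Return the time needed to move size_mbytes at the given rate."""
    return size_mbytes / rate_mbytes_per_sec


if __name__ == "__main__":
    capture_time = transfer_seconds(BURST_MBYTES, CAPTURE_RATE)
    copy_time = transfer_seconds(BURST_MBYTES, COPY_RATE)
    print(f"capture: {capture_time * 1000:.0f} ms")
    print(f"copy out: {copy_time:.0f} s")
```

The capture takes about 32 ms and the copy about 8 seconds, so the burst has to sit in RAM for a time that is a couple of hundred times longer than it took to arrive.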
A few technicalities
A working Linux distribution for Microblaze is available for download at Xillybus’ site. While this distribution has the Xillybus IP core and kernel driver included, that version captures data at the processor’s slow rates. For an evaluation kit supporting fast data capture, please contact Xillybus directly.
Another thing to mention is the reason for the 8 MByte limit. DDR memories come in larger sizes, but DMA memory inherently resides in Linux kernel space. Allocating large physically contiguous segments of RAM is difficult, and pushing such allocations too far can make the entire system unstable.
There is a well-known workaround for this, though: It’s possible to give the kernel a boot parameter limiting the RAM it’s allowed to access. With this simple trick, the untouched chunk can be used as a huge buffer. This requires a simple modification to the Xillybus driver. So it’s not difficult to allow a capture segment of any size, as long as there’s enough RAM for both the buffer and the kernel itself.
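The bookkeeping behind this trick can be sketched in Python. The sketch assumes the kernel in use honors a "mem="-style boot parameter, as many Linux ports do; whether that holds for a particular Microblaze kernel should be checked. The RAM base address and sizes in the example are hypothetical, and the function names are made up for illustration.

```python
# Split physical RAM into a part the kernel may use and an untouched buffer on top.
MIB = 1 << 20


def reserve_top_of_ram(ram_base: int, ram_size: int, buffer_size: int):
    """Return (kernel_size, buffer_base) for a buffer placed at the top of RAM."""
    if not 0 < buffer_size < ram_size:
        raise ValueError("buffer must be smaller than the RAM and non-empty")
    kernel_size = ram_size - buffer_size
    buffer_base = ram_base + kernel_size
    return kernel_size, buffer_base


def mem_boot_parameter(kernel_size: int) -> str:
    """Format the RAM limit as a mem= boot parameter in whole megabytes."""
    mib, remainder = divmod(kernel_size, MIB)
    if remainder:
        raise ValueError("kernel RAM size must be a whole number of MiB")
    return f"mem={mib}M"


if __name__ == "__main__":
    # Hypothetical board: 128 MiB of DDR at 0x80000000, 64 MiB kept for capture.
    kernel_size, buffer_base = reserve_top_of_ram(0x80000000, 128 * MIB, 64 * MIB)
    print(mem_boot_parameter(kernel_size))
    print(f"buffer starts at physical address {buffer_base:#010x}")
```

The modified driver would then have to be told the buffer's physical base address and size, since the kernel no longer knows about that part of the RAM.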